Grammar Error Correction with Transformer (GECToR)

Get introduced to GECToR, Grammarly's solution to both grammar and spellcheck. Understand how GECToR functions in the context of NMT.

What is GECToR?

GECToR (Grammar Error Correction with Transformer) is an open-source, transformer-based deep learning model created and maintained by Grammarly. The system is pre-trained on synthetic errorful data (a corpus containing sentences with spelling or grammar errors, along with corresponding tagged corrections) and then fine-tuned in two stages: first on errorful data, and second on a combination of errorful and error-free parallel corpora.

This two-stage fine-tuning process adapts GECToR's knowledge to the specific nuances of the target domain, improving grammar-correction performance in that context. It proceeds much like regular supervised training: the model optimizes an objective (loss) function, in this case a per-token classification loss over the edit-tag vocabulary.
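To make the per-token classification loss concrete, here is a minimal sketch (an illustration, not code from GECToR itself) of averaging cross-entropy over token positions, where each position has a predicted distribution over a toy tag vocabulary:

```python
import math

def cross_entropy(probs, gold_indices):
    """Average negative log-likelihood of the gold tag at each token position."""
    return -sum(math.log(p[g]) for p, g in zip(probs, gold_indices)) / len(gold_indices)

# Toy tag vocabulary (assumed for illustration): 0 = $KEEP, 1 = $DELETE, 2 = $APPEND_the
predicted = [
    [0.9, 0.05, 0.05],   # model is confident this token should be kept
    [0.2, 0.7, 0.1],     # model favors deleting this token
]
gold = [0, 1]            # gold tags: $KEEP, $DELETE

print(round(cross_entropy(predicted, gold), 4))
```

Lower loss means the model assigns higher probability to the correct edit tag at each position; training nudges the distributions toward the gold tags.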

The synthetic data is developed using a methodology called g-transformations. These are task-specific operations such as merging the current token and the next token into a single one, conversion of singular nouns to plurals and vice versa, or even changing the form of regular/irregular verbs to express a different number or tense.
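The g-transformations above can be thought of as edit tags applied to tokens. The sketch below is a simplified illustration: the tag names follow the conventions described in the GECToR paper (e.g. $KEEP, $MERGE_SPACE, $TRANSFORM_VERB_VB_VBZ), but the transformation rules here (such as naively appending "s") are toy stand-ins for the model's real morphological operations:

```python
def apply_tags(tokens, tags):
    """Apply one edit tag per token and return the corrected token list."""
    out = []
    i = 0
    while i < len(tokens):
        tok, tag = tokens[i], tags[i]
        if tag == "$KEEP":
            out.append(tok)
        elif tag == "$DELETE":
            pass  # drop the token
        elif tag == "$MERGE_SPACE":
            out.append(tok + tokens[i + 1])  # merge current token with the next one
            i += 1                           # skip the merged neighbor
        elif tag == "$TRANSFORM_VERB_VB_VBZ":
            out.append(tok + "s")            # naive base form -> third-person singular
        elif tag.startswith("$APPEND_"):
            out.append(tok)
            out.append(tag[len("$APPEND_"):])  # insert a new token after this one
        i += 1
    return out

tokens = ["She", "want", "to", "go", "to", "the", "air", "port"]
tags   = ["$KEEP", "$TRANSFORM_VERB_VB_VBZ", "$KEEP", "$KEEP",
          "$KEEP", "$KEEP", "$MERGE_SPACE", "$KEEP"]
print(apply_tags(tokens, tags))
```

Running the same machinery in reverse (corrupting clean sentences with inverse transformations) is how synthetic errorful training pairs can be generated at scale.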

How is GECToR different from standard transformers?

GECToR is a "tag, not rewrite" model: rather than generating a corrected sentence from scratch, as a standard sequence-to-sequence NMT system would, it labels each word or sub-word token in the input sequence with a tag describing its grammatical correctness and the edit needed to fix it. This allows the model to go beyond simply identifying errors and provide detailed information about the nature of each error.
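The "tag, not rewrite" idea can be sketched as an iterative loop: a tagger predicts one edit tag per token, the edits are applied, and the loop repeats until every token is tagged $KEEP. In this hedged illustration, toy_tagger is a hypothetical rule-based stand-in for the real transformer encoder with its per-token classification head:

```python
def toy_tagger(tokens):
    """Hypothetical stand-in for the per-token tag classifier."""
    rules = {"has": "$REPLACE_have", "a": "$DELETE"}
    return [rules.get(t, "$KEEP") for t in tokens]

def apply(tokens, tags):
    """Apply one edit tag to each token."""
    out = []
    for tok, tag in zip(tokens, tags):
        if tag == "$KEEP":
            out.append(tok)
        elif tag == "$DELETE":
            continue
        elif tag.startswith("$REPLACE_"):
            out.append(tag[len("$REPLACE_"):])
    return out

def correct(tokens, max_iters=3):
    """Tag and edit repeatedly until no more changes are predicted."""
    for _ in range(max_iters):
        tags = toy_tagger(tokens)
        if all(t == "$KEEP" for t in tags):
            break
        tokens = apply(tokens, tags)
    return tokens

print(correct(["They", "has", "a", "arrived"]))
```

Because all tags in one pass are predicted in parallel rather than token-by-token, this tagging formulation is typically much faster at inference than autoregressive rewriting.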
