Transformer-Based Spell Check

Learn how transformers can be applied to create a modern spell checker.

Spell correction as machine translation

We know that a transformer can be trained to translate a sentence from a source language to a target language. So, what if we apply a similar approach but instead use misspelled words as the "source language" and correctly spelled words in our language of choice as the "target language"?

xfspell explanation

xfspell uses the GitHub Typo Corpus (https://github.com/mhagiwara/github-typo-corpus), a large-scale multilingual dataset of misspellings containing millions of data points. Only the English spelling errors are selected to serve as the source language, and American English serves as the target language.
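The selection step can be sketched roughly as follows. This is a minimal illustration, not xfspell's actual preprocessing: the record layout (an `edits` list whose items have `src` and `tgt` entries with `text` and `lang` fields, plus an `is_typo` flag) is an assumption about how the corpus is structured, and the sample records are made up for the example.

```python
def english_typo_pairs(records):
    """Collect (misspelled, corrected) pairs where both sides are English.

    The field names used here ("edits", "src", "tgt", "lang", "is_typo")
    are assumed for illustration and may differ from the real corpus.
    """
    pairs = []
    for record in records:
        for edit in record.get("edits", []):
            src, tgt = edit["src"], edit["tgt"]
            is_english = src.get("lang") == "eng" and tgt.get("lang") == "eng"
            if is_english and edit.get("is_typo"):
                pairs.append((src["text"], tgt["text"]))
    return pairs


# Made-up sample records, used only to exercise the function.
sample_records = [
    {"edits": [
        {"src": {"text": "recieve the data", "lang": "eng"},
         "tgt": {"text": "receive the data", "lang": "eng"},
         "is_typo": True},
        {"src": {"text": "une erreure", "lang": "fra"},
         "tgt": {"text": "une erreur", "lang": "fra"},
         "is_typo": True},
    ]},
]

pairs = english_typo_pairs(sample_records)
```

Each resulting pair plays the role of one parallel sentence: the first element is the "source language" text and the second is the "target language" text.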

Another notable difference from the previous lessons is that xfspell is trained on individual characters rather than on words. This lets it handle problems such as missing spaces and hyphens, as well as misspellings caused by incorrect punctuation.
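To see why character-level input helps, consider a minimal sketch of character tokenization. The function names and the `<sp>` space marker are hypothetical choices for this example, not necessarily what xfspell uses.

```python
SPACE = "<sp>"  # hypothetical marker so that a space becomes a visible token


def char_tokenize(text):
    """Split text into single-character tokens, marking spaces explicitly."""
    return [SPACE if ch == " " else ch for ch in text]


def detokenize(tokens):
    """Invert char_tokenize by joining tokens and restoring spaces."""
    return "".join(" " if tok == SPACE else tok for tok in tokens)


# A forgotten space and a forgotten hyphen on the source side.
source_tokens = char_tokenize("alot of well known words")
target_tokens = char_tokenize("a lot of well-known words")
```

Because spaces and hyphens are ordinary tokens in both sequences, inserting a missing space or replacing a space with a hyphen is just another token-level edit for the model to learn. A word-level model would instead see "alot" as a single, possibly unknown, token.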
