SymSpell: Symmetric Delete Spelling Correction
Explore the SymSpell spelling correction algorithm that enhances speed and accuracy by focusing on deletion operations and prefix trie structures. Understand how SymSpell builds its dictionary, creates a trie for fast lookups, and applies a symmetric delete approach to generate and prioritize correction suggestions effectively.
We'll cover the following...
What is SymSpell?
SymSpell is a correction algorithm similar to the Norvig model, except it is optimized for speed. It can perform spelling correction at up to 1 million times faster than the Norvig model. This allows us to utilize a larger dictionary and search a larger list of corrections (e.g., consider words with an edit distance of 3) without suffering from performance issues.
This is accomplished by simplifying our transformations. In our standard model, we utilize deletes, transposes, replaces, and inserts. SymSpell optimizes this by only using deletion operations and by creating a trie (prefix tree) as a precomputation measure.
Reasoning
The SymSpell algorithm exploits the fact that the edit distance between two terms is symmetrical, meaning we could either:
Generate all terms with an edit distance <2 from the ...