Phonetic Algorithms
Explore phonetic algorithms used in preprocessing to enhance entity resolution of names and locations. Understand how Metaphone, Double Metaphone, and Beider-Morse encode sounds to handle pronunciation differences across languages and reduce mismatches in data. Gain insight into selecting appropriate phonetic techniques for your entity resolution tasks.
The German “Schwarz,” the English “Shvarts,” and the Russian “Шварц” are spelled differently, but they sound alike. A phonetic algorithm encodes how a word sounds rather than how it is spelled, so that variants like these map to the same or to similar strings.
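To make the idea concrete, here is a minimal sketch of a toy encoder for the three spellings above. It is not Metaphone, Double Metaphone, or Beider-Morse: the function `toy_phonetic_key`, the transliteration table, and the handful of sound rules are all hypothetical and chosen only so that this one example works.

```python
# Toy phonetic encoder: illustrative rules only, not a published algorithm.

# Hypothetical transliteration table; it covers only the Cyrillic
# letters that occur in the example name.
CYRILLIC_TO_LATIN = {"ш": "sh", "в": "v", "а": "a", "р": "r", "ц": "ts"}

# Hypothetical sound rules, applied in order (longer patterns first).
# Codes are uppercase so later lowercase patterns cannot re-match them.
SOUND_RULES = [
    ("sch", "X"),  # German "sch" sounds like English "sh"
    ("sh", "X"),   # English "sh"
    ("tz", "TS"),  # "tz" sounds like "ts"
    ("z", "TS"),   # German "z" sounds like "ts"
    ("w", "V"),    # German "w" sounds like "v"
]


def toy_phonetic_key(word: str) -> str:
    """Return a rough sound-based key for a word (toy rules only)."""
    text = word.lower()
    # Map known Cyrillic letters to Latin; leave everything else unchanged.
    text = "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in text)
    for pattern, code in SOUND_RULES:
        text = text.replace(pattern, code)
    return text.upper()


for name in ["Schwarz", "Shvarts", "Шварц"]:
    print(name, "->", toy_phonetic_key(name))  # each prints XVARTS
```

All three spellings collapse to the same key, so an exact comparison of keys matches records that an exact comparison of the raw strings would miss. Real phonetic algorithms follow the same principle with far larger and more carefully designed rule sets.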
The list of phonetic algorithms is long. Some work well for a single language, while others try to serve several. Some consist of just a handful of rules, and others of several hundred. Some return a single encoding, while others return several to account for ambiguous pronunciations. The following three algorithms cover this spectrum well for English-centric text:
Metaphone: It has a simple rule set limited to English pronunciation. ...