Calculating Minimum Edit Distance
Understand how to calculate the minimum edit distance between two strings using the Wagner-Fischer algorithm with dynamic programming. Discover the step-by-step process of building and filling the distance matrix, and apply this method to measure string similarity for NLP applications such as spellchecking and grammatical error correction.
We'll cover the following...
Introduction
The algorithm for calculating the minimum edit distance can be built using either dynamic programming or recursion. Here, we will consider the dynamic programming version of the algorithm, known as the Wagner-Fischer algorithm. If you are unfamiliar with dynamic programming as a whole, a good overview can be found here, as the basics are outside the scope of this course.
Algorithm walkthrough and pseudocode
We start with a matrix representing distances between the characters of the two strings. The matrix has rows and columns, where and are the lengths of the two strings. The first row and column are initialized from to the length of the respective string, as the distance between an empty string and a nonempty string is always equal to the length of the nonempty string.
Here are the steps of the algorithm:
- Initialize a matrix of size with zeros.
- Set the