The Levenshtein distance (a.k.a edit distance) is a measure of similarity between two strings. It is defined as the minimum number of changes required to convert string a
into string b
(this is done by inserting, deleting or replacing a character in string a
). The smaller the Levenshtein distance, the more similar the strings are. This is a very common problem in the application of Dynamic Programming.
A pre-requisite for applying Dynamic Programming, to solve a problem, is to demonstrate that the solution to the original problem can be found by using the solutions to the sub-problems.
The approach here is somewhat simple and intuitive. Consider the strings a
and b
up to their last character:
a
.Since DP utilises memory for storing the solutions to sub-problems, we will use a 2D array (D
) for this purpose.
The value in D[i][j]
represents the edit distance if we consider the strings a
and b
, till their i
th and j
th character, respectively. Therefore, the answer to the original problem will be found in D[length of 'a'][length of 'b']
.
#include <iostream> #include <cstring> #include <algorithm> using namespace std; int findMin(int x, int y, int z){ if(x <= y && x <= z) return x; else if(y <=x && y <= z) return y; else return z; } void freeMemory(int** arr, int a, int b){ for(int i = 0; i < a; i++){ delete[] arr[i]; } delete[] arr; } int findDistance(char a[], char b[]){ // Declaring a 2D array on the heap dynamically: int len_a = strlen(a); int len_b = strlen(b); int** d = new int*[len_a + 1]; for(int i = 0; i < len_a + 1; i++) d[i] = new int[len_b + 1]; // Initialising first column: for(int i = 0; i < len_a + 1; i++) d[i][0] = i; // Initialising first row: for(int j = 0; j < len_b + 1; j++) d[0][j] = j; // Applying the algorithm: int insertion, deletion, replacement; for(int i = 1; i < len_a + 1; i++){ for(int j = 1; j < len_b + 1; j++){ if(a[i - 1] == b[j - 1]){ d[i][j] = d[i - 1][j - 1]; } else{ // Choosing the best option: insertion = d[i][j - 1]; deletion = d[i - 1][j]; replacement = d[i - 1][j - 1]; d[i][j] = 1 + findMin(insertion, deletion, replacement); } } } int answer = d[len_a][len_b]; freeMemory(d, len_a + 1, len_b + 1); return answer; } int main() { char a[] = "Carry"; char b[] = "Bark"; cout << "Levenshtein Distance: " << findDistance(a, b); return 0; }
RELATED TAGS
View all Courses