Longest palindromic subsequence algorithm
The longest palindromic subsequence is the longest sequence of characters in a string that spells the same forwards and backward.
A subsequence differs from a substring since characters in a subsequence are not required to be at consecutive positions in the original string.
Example
Take string "ABBCDABBC", for example. Then the longest palindromic subsequence in this string is "BBABB".
A naive approach would be to find all possible palindromic subsequences in "ABBCDABBC", and filter out the longest one.
Note: However, this approach has exponential time complexity.
A much better solution is to use Dynamic Programming. It is an optimization technique where complex problems are broken down into subproblems, with each subproblem solved only once.
Dynamic Programming solution
We can solve this problem using the solution to another similar problem - the Longest Common Subsequence (LCS) problem. The idea is:
- Reverse the given sequence. Let’s call the original sequence
original. Let’s call the reversed sequencereverse. - Use the LCS algorithm to find the longest common subsequence between
originalandreverse. LetLCS(original, reverse)be a function that returns the longest common subsequence between the pair of strings. - The answer from LCS will in fact be the longest palindromic subsequence.
More specifically, consider the illustration below:
The last characters match for both original and reverse in the figure above, hence we recur on both of them excluding the last character.
Next, consider the case illustrated below:
In the case where the last characters do not match, we have to solve two subproblems:
-
Ignore the last character of
reverse, and recur onoriginaland updatedreverseagain. -
Ignore the last character of
originaland recur on the updatedoriginalandreverseagain.
Select the best result from these two subproblems.
Mathematical form
Now, to combine all the above cases into a mathematical equation:
Lets call the original sequence and reverse as . The prefixes of are
and the prefixes of are
.
Let represent the set of longest common subsequence of prefixes and . Then:
If the last characters match, then the sequence is extended by that matching character . Otherwise, the best result from and is used.
Optimal substructure
The solution to this problem satisfies the overlapping subproblem property, as shown in the diagram below. The diagram is a partial LCS of "PXYM" and "PYQX":
The boxes in purple above are the same subproblem - this means we do not need to solve it twice. Using memoization, overlapping subproblems are solved only once and their result is stored to be used if the subproblem shows up again.
For this problem, a two dimensional array LCS can be made as a memo, where LCS[i][j] represents the length of the longest common subsequence between original with length i and reverse, with length j. After populating the memo using the above algorithm, the longest palindromic subsequence can be reconstructed using backtracking on the memo.
Free Resources