The longest palindromic subsequence is the longest sequence of characters in a string that spells the same forwards and backward.
A subsequence differs from a substring since characters in a subsequence are not required to be at consecutive positions in the original string.
Take string "ABBCDABBC"
, for example. Then the longest palindromic subsequence in this string is "BBABB"
.
A naive approach would be to find all possible palindromic subsequences in "ABBCDABBC"
, and filter out the longest one.
Note: However, this approach has exponential time complexity.
A much better solution is to use Dynamic Programming. It is an optimization technique where complex problems are broken down into subproblems, with each subproblem solved only once.
We can solve this problem using the solution to another similar problem - the Longest Common Subsequence (LCS) problem. The idea is:
original
. Let’s call the reversed sequence reverse
.original
and reverse
. Let LCS(original, reverse)
be a function that returns the longest common subsequence between the pair of strings.More specifically, consider the illustration below:
The last characters match for both original
and reverse
in the figure above, hence we recur on both of them excluding the last character.
Next, consider the case illustrated below:
In the case where the last characters do not match, we have to solve two subproblems:
Ignore the last character of reverse
, and recur on original
and updated reverse
again.
Ignore the last character of original
and recur on the updated original
and reverse
again.
Select the best result from these two subproblems.
Now, to combine all the above cases into a mathematical equation:
Lets call the original
sequence $X=(x_1 x_2 \cdots x_m)$ and reverse
as $Y=(y_1 y_2 \cdots y_n)$. The prefixes of $X$ are
$X_{1,2,\cdots,m}$
and the prefixes of $Y$ are
$Y_{1,2,\cdots,n}$.
Let $\mathit{LCS}(X_i,Y_j)$ represent the set of longest common subsequence of prefixes $X_i$ and $Y_j$. Then:
$\small \mathit{LCS}(X_{i},Y_{j}) = \begin{cases} \emptyset & if \:i=0 \:or j=0, \\ \mathit{LCS}(X_{i-1},Y_{j-1}) \hat {} x_{i} & if \: i,j>0 \: \& \: x_{i} = y_{j}\\ \max \big\{ \mathit{LCS}(X_{i},Y_{j-1}), \mathit{LCS}(X_{i-1},Y_{j})\big\} & if \: i,j>0 \: \& \: x_{i} \ne y_{j} \end{cases}$
If the last characters match, then the sequence $\mathit{LCS}(X_{i-1},Y_{j-1})$ is extended by that matching character $x_i$. Otherwise, the best result from $\mathit{LCS}(X_{i},Y_{j-1})$ and $\mathit{LCS}(X_{i-1},Y_{j})$ is used.
The solution to this problem satisfies the overlapping subproblem property, as shown in the diagram below. The diagram is a partial LCS of "PXYM"
and "PYQX"
:
The boxes in purple above are the same subproblem - this means we do not need to solve it twice. Using memoization, overlapping subproblems are solved only once and their result is stored to be used if the subproblem shows up again.
For this problem, a two dimensional array LCS
can be made as a memo, where LCS[i][j]
represents the length of the longest common subsequence between original
with length i
and reverse
, with length j
. After populating the memo using the above algorithm, the longest palindromic subsequence can be reconstructed using backtracking on the memo.