Related Tags

algorithms
python
strings

# Longest palindromic substring in O(n) with Manacher's Algorithm

## Problem statement

A palindrome is a string that reads the same backward or forwards. For example, madam or racecar are some famous palindromic strings. We are interested in finding the length of the longest palindromic substring in a string. For example, although the string abaccab has multiple palindromic substrings, like aba and acca, the longest palindromic substring in it is baccab. This is what we are interested in finding.

## Intuition behind Manacher’s algorithm

Although there is a brute force and dynamic programming algorithm that can this problem in O($n^3$) and O($n^2$) respectively, Manacher’s algorithm brings down the time complexity of the algorithm to an astounding O($n$). The reason for this improvement is that Manacher’s algorithm very smartly leverages properties of a palindromic string to avoid needless computation. Let’s look at an example to understand these properties.

Consider a string that contains a palindromic substring of odd length centered around index c. Half of the characters of the substring are to the right of c and the other half are to the left. For example, in the illustration below, we can see a palindromic string with a center around c highlighted in a darker shade of green. The second array shows the number of matching characters if we were to find a palindrome centered around a specific character. For example, the palindrome around the first a is cac and only one character from either side match.

Number of matching characters around c is 2

If we wanted to find the palindrome centered around the second a (pointed by the arrow), it would be the same as the first a. This is because we already know the above string is a palindrome around c, and thus, all the characters in either direction are symmetric. We call these two a's to be the mirror of each other. Similarly, both c's on either end are also each other’s mirrors.

Let’s update our example a little. Now, we have a string that includes the palindromic string we discussed above.

cacac is a palindromic substring centered around c in the above string

Now, we cannot say that the palindrome around the a, pointed by the arrow, will be the same as its mirror a, since mirror a's palindrome exceeds the range covered by the palindromic substring cacac. We can, however, say with certainty that the palindrome around a, pointed out by the arrow, would be valid as long as it is in the range of cacac. This means, although the palindrome around mirror a is bigger and exceeds the range of cacac, the part that lies within the range would still be in the other a's palindrome.

This was the intuition behind Manacher’s algorithm. Instead of finding the palindrome centered around each character from scratch, we can use the previous results.

## Implementation

We had an assumption that the length of the palindrome would be odd, as we discussed the intuition above. This gives us a nice center of the palindrome, so we can apply the above properties.

What about the cases for even length palindromes? We transform our string by placing a # intermediately before and after each character. For example, deed becomes #d#e#e#d# and madam becomes #m#a#d#a#m#. This also has another benefit; remember the array we were using to count the number of matching characters around a center? It will now have the length of the palindrome around each center. We call this array LPS in our implementation below. C is used for the current center and R is used to quantify the range of the palindrome centered around C.

Next, we iterate over the updated string, and at each iteration, we first use the discussed properties of a palindrome to find the minimum definite length of the palindrome around the i$^{th}$ character (lines 15-16). Then, we build on that value and expand outwards to find the actual length of the palindrome around the i$^{th}$ character (lines 18-19). Lastly, if the current palindrome becomes long enough to exceed the range of the palindrome centered around C, we update the C and R to the current palindrome’s center and range (lines 23-25).

In the end, the max value in the LPS list will denote the length of the longest palindromic substring.

Let’s look at a visualization to understand the algorithm.

Manacher("acaac")
1 of 18

## Code

def UpdatedString(string):
newString = ['#']
for char in string:
newString += [char, '#']
return ''.join(newString)

def Manachen(string):
string = UpdatedString(string)
LPS = [0 for _ in range(len(string))]
C = 0
R = 0

for i in range(len(string)):
iMirror = 2*C - i
if R > i:
LPS[i] = min(R-i, LPS[iMirror])
else:
LPS[i] = 0
try:
while string[i + 1 + LPS[i]] == string[i - 1 - LPS[i]]:
LPS[i] += 1
except:
pass

if i + LPS[i] > R:
C = i
R = i + LPS[i]

r, c = max(LPS), LPS.index(max(LPS))
print (string[c - r : c + r].replace("#",""))
return r

print(Manachen("acaac"))

RELATED TAGS

algorithms
python
strings