Text Segmentation

Optimizing the text segmentation problem

For our next dynamic programming algorithm, let’s consider the text segmentation problem from the previous chapter. We are given a string A[1..n]A[1 .. n] and a subroutine IsWordIsWord that determines whether a given string is a word (whatever that means), and we want to know whether AA can be partitioned into a sequence of words.

We solved this problem by defining a function Splittable(i)Splittable(i) that returns True if and only if the suffix A[i..n]suffix\space A[i .. n] can be partitioned into a sequence of words. We need to compute Splittable(1)Splittable(1). This function satisfies the recurrence

Splittable(i)={ True if i>nj=1n(IsWord(i,j)Splittable(j+1))otherwiseSplittable(i)=\begin{cases} & \text{ True \hspace{5.8cm}} if\space i > n \\ & \overset{n}{\underset{j=1}{\vee}} \left ( IsWord(i,j) \wedge Splittable(j+1) \right ) \hspace{1cm} otherwise \end{cases}

where IsWord(i,j)IsWord(i, j) is shorthand for IsWord(A[i..j])IsWord(A[i .. j]). This recurrence translates directly into a recursive backtracking algorithm that calls the IsWordIsWord subroutine O(2n)O(2^n) times in the worst case.

But for any fixed string A[1..n]A[1..n], there are only nn different ways to call the recursive function Splittable(i)Splittable(i)—one for each value of ii between 11 and n+1n + 1—and only O(n2)O(n^2) different ways to call IsWord(i,j)IsWord(i, j)—one for each pair (i,j)(i, j) such that 1ijn1 ≤ i ≤ j ≤ n.

Why are we spending exponential time computing only a polynomial amount of stuff?

Each recursive subproblem is specified by an integer between 11 and n+1n + 1, so we can memoize the function SplittableSplittable into an array SplitTable[1..n+1]SplitTable[1 .. n + 1]. Each subproblem Splittable(i)Splittable(i) depends only on the results of subproblems Splittable(j)Splittable(j), where j>ij > i, so the memoized recursive algorithm fills the array in decreasing index order. If we fill the array in this order deliberately, we obtain the dynamic programming algorithm shown below.

Create a free account to access the full course.

By signing up, you agree to Educative's Terms of Service and Privacy Policy