Pumping Theorem for Regular Languages

Infinite regular languages

We know that a decision algorithm can be developed to detect whether a DFA’s language is infinite. If a DFA accepts a string of length greater than or equal to $p$, where $p$ is the number of states in the DFA, we know there is a cycle in an accepting path, so the language is infinite. Therefore, we could test all strings in $\Sigma^*$ of length $p, p+1, p+2, \ldots$ for acceptance. The question is: when can we stop and give an answer? Since no cycle can contain more than $p$ states, we only have to test strings of the following $p$ lengths: $p, p+1, p+2, \ldots, 2p-2, 2p-1$.

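If the DFA accepts a string of length $2p$ or more, we can remove one traversal of one of its cycles, which is no longer than $p$, and the shorter string is still accepted. Each removal shortens the string by at least $1$ and at most $p$, so repeating the step while the length is still $2p$ or more must eventually produce a length in the range $[p, 2p-1]$. This means that a string whose length is in the range $[p, 2p-1]$ must also be accepted.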

Remember: To determine by computer whether the language of a finite automaton is infinite, it is sufficient to test all strings in $\Sigma^+$ with lengths in the range $[p, 2p-1]$ for acceptance.
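The test described above can be sketched in a few lines of Python. This is only a minimal illustration, not a definitive implementation: the dictionary layout for the DFA, the helper names `accepts` and `language_is_infinite`, and the two example automata are all assumptions made for this sketch. It also assumes the transition function is total (every state has a transition on every symbol).

```python
from itertools import product

def accepts(dfa, s):
    """Run the DFA on string s and report whether it ends in an accepting state."""
    state = dfa["start"]
    for ch in s:
        state = dfa["delta"][(state, ch)]
    return state in dfa["accepting"]

def language_is_infinite(dfa):
    """Test every string whose length lies in [p, 2p-1], where p = number of states."""
    p = len(dfa["states"])
    for n in range(p, 2 * p):  # lengths p, p+1, ..., 2p-1
        for letters in product(dfa["alphabet"], repeat=n):
            if accepts(dfa, "".join(letters)):
                return True
    return False

# Hypothetical DFA for a*b over {a, b} (an infinite language).
A_STAR_B = {
    "states": {"q0", "q1", "dead"},
    "alphabet": ["a", "b"],
    "start": "q0",
    "accepting": {"q1"},
    "delta": {
        ("q0", "a"): "q0", ("q0", "b"): "q1",
        ("q1", "a"): "dead", ("q1", "b"): "dead",
        ("dead", "a"): "dead", ("dead", "b"): "dead",
    },
}

# Hypothetical DFA accepting only the string "ab" (a finite language).
ONLY_AB = {
    "states": {"q0", "q1", "q2", "dead"},
    "alphabet": ["a", "b"],
    "start": "q0",
    "accepting": {"q2"},
    "delta": {
        ("q0", "a"): "q1", ("q0", "b"): "dead",
        ("q1", "a"): "dead", ("q1", "b"): "q2",
        ("q2", "a"): "dead", ("q2", "b"): "dead",
        ("dead", "a"): "dead", ("dead", "b"): "dead",
    },
}

print(language_is_infinite(A_STAR_B))  # True
print(language_is_infinite(ONLY_AB))   # False
```

The brute-force search examines a number of strings that grows exponentially with $p$, so it is useful for showing that the problem is decidable rather than as a practical method.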

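As stated earlier, we could also convert the language’s automaton to a regular expression. If the expression contains no $\emptyset$ and has a Kleene star applied to a subexpression that matches at least one nonempty string, then the language is infinite. (A star applied only to $\varepsilon$ or $\emptyset$ adds no new strings.)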

Note: To determine whether the language of a finite automaton is infinite, convert the automaton to a regular expression and look for a Kleene star applied to a subexpression that matches at least one nonempty string.

Ideas behind the pumping theorem for infinite regular languages

Suppose a regular language, $L$, contains a string, $s$, where $|s| \geq p$, and where $p$ is the number of states in its associated minimal DFA. We know from the pigeonhole principle that by the time we have read the $p$-th symbol of $s$, a cycle has been found in an accepting path: reading $p$ symbols visits $p+1$ states, so at least one state must repeat. Let’s call the substring representing the first cycle found $y$ (traversing the cycle once), and the substring leading from the initial state to the first state of the cycle $x$ (which could be empty). Then we can represent $s$ as the concatenation of three strings, $xyz$, where $x$ and $y$ are just as described, and $z$ represents whatever is left in the string after $y$ all the way to where the string ends in an accepting state. The string $z$ may contain other traversals of $y$’s cycle and any other edges and cycles that may occur in the string after $y$. Like $x$, $z$ may also be empty.
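To make the decomposition concrete, the following Python sketch simulates a DFA on $s$ and cuts the string at the first repeated state. It is an illustration under stated assumptions rather than a canonical algorithm: the dictionary layout for the DFA, the function names `accepts` and `split_xyz`, and the example automaton for the language $ba^*b$ are all hypothetical choices made here.

```python
def accepts(dfa, s):
    """Run the DFA on string s and report whether it ends in an accepting state."""
    state = dfa["start"]
    for ch in s:
        state = dfa["delta"][(state, ch)]
    return state in dfa["accepting"]

def split_xyz(dfa, s):
    """Split s into (x, y, z), where y spells out the first cycle found in the run."""
    state = dfa["start"]
    # Map each visited state to the number of symbols read when it was first entered.
    first_seen = {state: 0}
    for i, ch in enumerate(s, start=1):
        state = dfa["delta"][(state, ch)]
        if state in first_seen:
            j = first_seen[state]
            # x leads to the cycle, y goes around it once, z is the rest of s.
            return s[:j], s[j:i], s[i:]
        first_seen[state] = i
    raise ValueError("no state repeats: s is shorter than the number of states")

# Hypothetical DFA for b a* b over {a, b}, with p = 4 states.
B_A_STAR_B = {
    "states": {"q0", "q1", "q2", "dead"},
    "alphabet": ["a", "b"],
    "start": "q0",
    "accepting": {"q2"},
    "delta": {
        ("q0", "a"): "dead", ("q0", "b"): "q1",
        ("q1", "a"): "q1", ("q1", "b"): "q2",
        ("q2", "a"): "dead", ("q2", "b"): "dead",
        ("dead", "a"): "dead", ("dead", "b"): "dead",
    },
}

x, y, z = split_xyz(B_A_STAR_B, "baab")
print((x, y, z))  # ('b', 'a', 'ab')
```

In this example the first repeated state is `q1`, so $x$ is the prefix that reaches it, $y$ is one trip around its self-loop, and $z$ still contains a second traversal of the same cycle, as the text above allows. Because $y$ starts and ends in the same state, repeating or dropping it keeps the run on the same path to the accepting state, which can be checked here by calling `accepts` on `x + y * k + z` for a few values of `k`.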
