# Pumping Theorem for Regular Languages

Learn about the pumping theorem for regular languages.

## Infinite regular languages

We know that a decision algorithm can be developed to detect whether a DFA’s language is infinite. If a DFA accepts a string of length greater than or equal to $p$, where $p$ is the number of states in the DFA, we know there is a cycle in an accepting path, so the language is infinite. Therefore, we could test all strings in $\Sigma^*$ of length $p,p+1,p+2...$ for acceptance. The question is: when can we stop and give an answer? Since no cycle can contain more than $p$ states, we only have to test strings of the following $p$ lengths: $p,p+1,p+2,...,2p-2,2p-1$.

If a language accepts a string of length $2p$, we can ignore one of its cycles, which is no longer than length $p$, meaning that a string whose length is in the range $[p,2p-1]$ must also be accepted.

Remember:To determine by computer whether the language of a finite automaton is infinite, it is sufficient to test all strings in $\Sigma^+$ with lengths in the range $[p,2p-1]$ for acceptance.

As stated earlier, we could also convert the language’s automaton to a regular expression. If a Kleene star is present, then the language is infinite.

Note:To determine whether the language of a finite automaton is infinite, convert the automaton to a regular expression and look for a Kleene star.

## Ideas behind the pumping theorem for infinite regular languages

Suppose a regular language, $L$, accepts a string, $s$, where $|s| \geq p$, and where $p$ is the number of states in its associated minimal DFA. We know from the pigeonhole principle that by the time we have read the $p$-th symbol of $s$, a cycle has been found in an accepting path. Let’s call the substring representing the *first* cycle found $y$ (traversing the cycle *once*), and the substring leading from the initial state to the first state of the cycle $x$ (which could be empty). Then we can represent $s$ as the concatenation of three strings, $xyz$, where $x$ and $y$ are just as described, and $z$ represents whatever is left in the string after $y$ all the way to where the string ends in an accepting state. The string $z$ may contain other traversals of $y$’s cycle and any other edges and cycles that may occur in the string after $y$. Like $x$, $z$ may also be empty.

Get hands-on with 1200+ tech skills courses.