Strong Components in Linear Time Continued

Understand how these techniques can improve the efficiency of the algorithm.

Tarjan’s algorithm

An earlier but considerably more subtle linear-time algorithm to compute strong components was published by Bob Tarjan in 1972. Intuitively, Tarjan’s algorithm identifies a source component of GG, “deletes” it, and then “recursively” finds the remaining strong components; however, the entire computation happens during a single depth-first search.

Characterizing sink components in a directed graph

Fix an arbitrary depth-first traversal of some directed graph GG. For each vertex vv, let low(v)low(v) denote the smallest starting time among all vertices reachable from vv by a path of tree edges followed by, at most, one non-tree edge. Trivially, low(v)v.prelow(v) ≤ v.pre, because vv can reach itself through zero tree edges followed by zero non-tree edges. Tarjan observed that sink components could be characterized in terms of this low function.

Lemma: A vertex vv is the root of a sink component of GG if and only if low(v)=v.prelow(v) = v.pre and low(w)<w.prelow(w) < w.pre for every proper descendant ww of vv.

Proof: First, let vv be a vertex such that low(v)=v.prelow(v) = v.pre. Then there is no edge wx,w\rightarrow x, where ww is a descendant of vv and x.pre<v.prex.pre < v.pre. On the other hand, vv cannot reach any vertex yy such that y.pre>v.posty.pre > v.post. It follows that vv can reach only its descendants, and therefore any descendant of vv can reach only descendants of vv. In particular, vv cannot reach its parent (if it has one), so vv is the root of its strong component.

Now suppose in addition that low(w)<w.prelow(w) < w.pre for every descendant ww of vv. Then each descendant ww can reach another vertex xx (which must be another descendant of vv) such that x.pre<w.prex.pre < w.pre. Thus, by induction, every descendant of vv can reach vv. It follows that the descendants of vv comprise the strong component CC whose root is vv. Moreover, CC must be a sink component because vv cannot reach any vertex outside of CC.

On the other hand, suppose vv is the root of a sink component CC. Then vv can reach another vertex ww if and only if wCw ∈ C. But vv can reach all of its descendants, and every vertex in CC is a descendant of vv, so vsv’s descendants comprise CC. If low(w)=w.prelow(w) = w.pre for any other node wCw ∈ C, then ww would be another root of CC, which is impossible.

Computing low(v)low(v) for every vertex vv via depth-first search is straightforward; see the algorithm below.

The lemma above implies that after running FindLowFindLow, we can identify the root of every sink component in O(V+E)O(V + E) time (by a global whatever-first search) and then mark and delete those sink components in O(V+E)O(V + E) additional time (by calling whatever-first search at each root) and then recurse. Unfortunately, the resulting algorithm might require VV iterations, each removing only a single vertex, thus naively giving us a total running time of O(VE)O(V E).

Create a free account to access the full course.

By signing up, you agree to Educative's Terms of Service and Privacy Policy