A Lower Bound for Comparison-Based Sorting
Explore the fundamental lower bound for comparison-based sorting algorithms, understanding why no deterministic or randomized method can sort n distinct elements faster than n log n comparisons. Learn the proof concepts using comparison trees and permutation theory, reinforcing the optimality of algorithms like merge-sort and heap-sort.
We'll cover the following...
We have now seen three comparison-based sorting algorithms that each run in time. By now, we should be wondering if faster algorithms exist. The short answer to this question is no. If the only operations allowed on the elements of a are comparisons, then no algorithm can avoid doing roughly comparisons. This is not difficult to prove, but requires a little imagination. Ultimately, it follows from the fact that
We will start by focusing our attention on deterministic algorithms like merge-sort and heap-sort and on a particular fixed value of n. Imagine such an algorithm is being used to sort n distinct elements. The key to proving the lower bound is to observe that, for a deterministic algorithm with a fixed value of n, the first pair of elements that are compared is always the same. For example, in heap_sort(a), when n is even, the first call to trickleDown(i) is with i = n/2 − 1 and the first comparison is between elements a[n/2 − 1] and a[n − 1].
Since all input elements are distinct, this first comparison has only
two possible outcomes. The second comparison done by the algorithm
may depend on the outcome of the first comparison. The third comparison may depend on the results of the first two, and so on. In this way, any deterministic comparison-based sorting algorithm can be viewed as a rooted binary comparison tree. Each internal node, , of this tree is labelled with a pair of indices u.i and u.j. If a[u.i] < a[u.j] the algorithm proceeds to the left subtree, otherwise it proceeds to the right subtree. Each leaf w of this tree is labelled with a permutation ...