Analysis of Merge Sort

Given that the merge function runs in Θ(n)\Theta(n) time when merging nn elements, how do we get to showing that mergeSort() runs in Θ(nlogn)\Theta(n \log n) time? We start by thinking about the three parts of divide-and-conquer and how to account for their running times. We assume that we're sorting a total of nn elements in the entire array.

The divide step takes constant time, regardless of the subarray size. After all, the divide step just computes the midpoint qq of the indices pp and rr. Recall that in big–Θ\Theta notation, we indicate constant time by Θ(1)\Theta(1).

The conquer step, where we recursively sort two subarrays of approximately n/2n/2 elements each, takes some amount of time, but we'll account for that time when we consider the subproblems.

The combine step merges a total of nn elements, taking Θ(n)\Theta(n) time.

If we think about the divide and combine steps together, the Θ(1)\Theta(1) running time for the divide step is a low-order term when compared with the Θ(n)\Theta(n) running time of the combine step. So let's think of the divide and combine steps together as taking Θ(n)\Theta(n) time. To make things more concrete, let's say that the divide and combine steps together take cncn time for some constant cc.

To keep things reasonably simple, let's assume that if n>1n \gt 1, then nn is always even, so that when we need to think about n/2n/2, it's an integer. (Accounting for the case in which nn is odd doesn't change the result in terms of big–Θ\Theta notation.) So now we can think of the running time of mergeSort() on an nn–element subarray as being the sum of twice the running time of mergeSort() on an (n/2n/2)-element subarray (for the conquer step) plus cncn (for the divide and combine steps—really for just the merging).

Now we have to figure out the running time of two recursive calls on n/2n/2 elements. Each of these two recursive calls takes twice of the running time of mergeSort() on an (n/4n/4)–element subarray (because we have to halve n/2n/2) plus cn/2cn/2 to merge. We have two subproblems of size n/2n/2, and each takes cn/2cn/2 time to merge, and so the total time we spend merging for subproblems of size n/2n/2 is 2.cn/2=cn2 . cn/2 = cn.

Let's draw out the merging times in a "tree":

Create a free account to access the full course.

By signing up, you agree to Educative's Terms of Service and Privacy Policy