Related Tags

medians
sorted
arrays
c++

# How to find the median of two sorted arrays in C++

The median value is the middle-most value in a set of numbers. For instance, the median of $1,2,3,4,5$ is $3$ and the median of $1,3,5,7$ is $(\frac{3+5}{2}) = 4$.

There are several ways to find the median of two sorted arrays. The illustration below depicts one of these ways:

As shown in the diagram above, one way to find the median is to merge two arrays into one sorted array and extract the median from there. However, this solution has $O(n)$ time complexity.

Think about how you can merge two arrays into one sorted array in linear time.

## Improved solution for same sized arrays

This problem has a much better solution, with logarithmic time complexity. This solution uses a divide and conquer programming technique to divide the problem into smaller instances of similar problems.

The illustration below highlights the thought process behind it:

1 of 5

The following code snippet shows how the median of two same-sized arrays can be found in $O(log n)$ in C++.

float medianOfArray(int arr[], int size)
{
if (size % 2 == 0) // if size of array is even
return (arr[size/2] + arr[size/2 - 1])/2.0;
else // size is odd
return arr[size/2];
}

float median(int arr1[], int arr2[], int size)
{
// if there is only one element in each array,
// then median is average of both elements
if (size == 1)
{
return (arr1[0] + arr2[0])/2.0;
}
// if each array has 2 elements then use this formula:
else if (size == 2)
{
return (max(arr1[0], arr2[0]) + min(arr1[1], arr2[1]))/2.0;
}
else
{
float m1 = medianOfArray(arr1,size);
float m2 = medianOfArray(arr2,size);

// if the medians are equal, then the median
// of both of them will also remain the same
if (m1 == m2)
{
return m1;
}
// this is implementation of the algorithm described
// in the illustration above
if (m2 > m1)
{
// checking for even and odd number of elements
// to make sure we get the array pointer
// to the correct position of the median
if (size % 2 == 0)
{
return median(arr1 + size/2 - 1, arr2, size - size/2 + 1);
}
else
{
return median(arr1 + size/2, arr2, size - size/2);
}
}
else if (m1 > m2)
{
if (size % 2 == 0)
{
return median(arr1, arr2 + size/2 - 1, size - size/2 + 1);
}
else
{
return median(arr1, arr2 + size/2, size - size/2);
}
}
}
}

// driver code
int main()
{
int arr1[] = {1,3,5};
int arr2[] = {2,4,6};
int size = sizeof(arr1)/sizeof(arr1[0]);
float medianOfTwoArr = median(arr1, arr2, size);
cout << "Median of the two arrays is: " << medianOfTwoArr;
return 0;
}

### Explanation

• Lines 1-7: The medianOfArray is a utility function that returns the median of the array it is passed.
• Line 42 and 46: In the case where m2 > m1, the recursive call is made with the two sub-arrays (arr1[m1...] and arr2[...m2]).
• Line 53 and 57: In the case where m1 > m2, the recursive call is made with the two sub-arrays (arr1[...m1] and arr2[m2...]).

Now remember, that the assumption we made was that the two arrays are of the same size. However, a further optimization to the above algorithm would solve the problem of different sized arrays as well.

## Optimized solution for different sized arrays

Consider the following illustration, it contains the gist of the solution:

1 of 13

The following is a C++ implementation which also has logarithmic time complexity.

#include <limits.h> // for INT_MIN and INT_MAX

float median(int arr1[], int arr2[], int x, int y)
{
if (x > y)
{
return median(arr2, arr1, y, x);
}
int low = 0;
int high = x;
while(low <= high)
{
int partitionX = (low+high)/2;
int partitionY = (x+y+1)/2 - partitionX;

int maxLeftX = (partitionX == 0) ? INT_MIN : arr1[partitionX-1];
int minRightX = (partitionX == x) ? INT_MAX : arr1[partitionX];

int maxLeftY = (partitionY == 0) ? INT_MIN : arr2[partitionY-1];
int minRightY = (partitionY == y) ? INT_MAX : arr2[partitionY];

// if this median condition is met,
// then we have found the perfect partition.
if ((maxLeftX <= minRightY) && (maxLeftY <= minRightX))
{
if ((x + y) % 2 == 0) // if combined size is even
{
return ((max(maxLeftX, maxLeftY)) + min(minRightX, minRightY))/2.0;
}
else // if combined size is odd
{
return max(maxLeftX, maxLeftY);
}
}
// PartitionX is too far to the right. Move the search to the left.
else if (maxLeftX > minRightY)
{
high = partitionX - 1;
}
// PartitionX is too far to the left. Move the search to the right.
else
{
low = partitionX + 1;
}
}
}

// driver code
int main() {
int arr1[] = {1,3,5};
int arr2[] = {2,4,6,8,10};
int size1 = sizeof(arr1)/sizeof(arr1[0]);
int size2 = sizeof(arr2)/sizeof(arr2[0]);
float medianOfTwoArr = median(arr1, arr2, size1, size2);
cout << "Median of the two arrays is: " << medianOfTwoArr;
return 0;
}

### Explanation

• Lines 5-8: Since the algorithm only involves iterating on the smaller of the two arrays, the function is called again so that arr1 is always the smaller array.
• Lines 13-14: The partitions for both the arrays are calculated using the formula described in the earlier illustration.
• Lines 16-20: maxLeftX, minRightX, maxLeftY, and minRightY are calculated. In the case that there are no elements in a partition, the conditional operator “?” sets the previously mentioned variables to the lowest or highest possible integer value. Hence, this corner case is handled in these lines as well.
• Line 24: This condition is satisfied once the perfect partitions for both arrays are found. The perfect partitions divide the arrays in such a way that every element in the left partition is less than, or equal to, every element in the right partition. If the median condition is not satisfied, then partitionX is adjusted accordingly.

RELATED TAGS

medians
sorted
arrays
c++