Introduction
Real-world problems
Here’s a real-world problem that can be solved using the K-way merge pattern:
Merge Tweets In Twitter Feed
We have to implement a module that adds a user’s Tweets into an already populated Twitter feed in chronological order. Let’s assume that userA
just started following userB
. At this point, we want userB
's Tweets to show up in userA
's Twitter feed. We already have a chronologically sorted list of Tweets that will appear on userA
's feed. Our job is to merge it with the list of userB
's Tweets, which are also chronologically sorted.
As input, we will be given two sorted integer arrays, feed
and tweets
. The integers represent the posting time of the Tweets. You are also given the number of elements initialized in both of the arrays, which are m
and n
, respectively.
Note: Assume that
feed
has a size equal tom + n
such that it has enough space to hold additional elements fromtweets
.
After understanding the solution to the K-way merge problem, students should be encouraged to solve the Twitter feed merging problem.
In this guide, we will first look at the solution of the “Kth Smallest Number in M Sorted Lists” problem.
Dry-run templates
This is the implementation of the solution provided for the problem posed in the Kth Smallest Number in M Sorted Lists (Medium) lesson:
from heapq import *def find_Kth_smallest(lists, k):minHeap = []# put the 1st element of each list in the min heapfor i in range(len(lists)):heappush(minHeap, (lists[i][0], 0, lists[i]))# take the smallest(top) element form the min heap, if the running count is equal to k return the numbernumberCount, number = 0, 0while minHeap:number, i, list = heappop(minHeap)numberCount += 1if numberCount == k:break# if the array of the top element has more elements, add the next element to the heapif len(list) > i+1:heappush(minHeap, (list[i+1], i+1, list))return numberdef main():print("Kth smallest number is: " +str(find_Kth_smallest([[2, 6, 8], [3, 6, 7], [1, 3, 4]], 5)))main()
Students may be encouraged to run through the provided solution with the following sample inputs:
Sample Input #1
lists: [2, 6, 8], [3, 6, 7], [1, 3, 4]]
k: 5
Iteration No. | Line No. | minHeap | minHeap Pop: number,i,list | numberCount |
1 | 8-9 | [(2, 0, [2, 6, 8])] | - | - |
2 | 8-9 | [(2, 0, [2, 6, 8]), (3, 0, [3, 6, 7])] | - | - |
3 | 8-9 | [(1, 0, [1, 3, 4]), (3, 0, [3, 6, 7]), (2, 0, [2, 6, 8])] | - | - |
1 | 14-15 | [(3, 0, [3, 6, 7]), (2, 0, [2, 6, 8])] | 1,0,[1, 3, 4] | 1 |
1 | 16-20 | [(2, 0, [2, 6, 8]), (3, 0, [3, 6, 7]), (3, 1, [1, 3, 4])] | 1,0,[1, 3, 4] | 1 |
2 | 14-15 | [(3, 0, [3, 6, 7]), (3, 1, [1, 3, 4])] | 2,0,[2, 6, 8] | 2 |
2 | 16-20 | [(3, 0, [3, 6, 7]), (3, 1, [1, 3, 4]), (6, 1, [2, 6, 8])] | 2,0,[2, 6, 8] | 2 |
3 | 14-15 | [(3, 1, [1, 3, 4]), (6, 1, [2, 6, 8])] | 3,0,[3, 6, 7] | 3 |
3 | 16-20 | [(3, 1, [1, 3, 4]), (6, 1, [2, 6, 8]), (6, 1, [3, 6, 7])] | 3,0,[3, 6, 7] | 3 |
4 | 14-15 | [(6, 1, [2, 6, 8]), (6, 1, [3, 6, 7])] | 3,1,[1, 3, 4] | 4 |
4 | 16-20 | [(4, 2, [1, 3, 4]), (6, 1, [3, 6, 7]), (6, 1, [2, 6, 8])] | 3,1,[1, 3, 4] | 4 |
5 | 14-15 | [(6, 1, [3, 6, 7]), (6, 1, [2, 6, 8])] | 4,2,[1, 3, 4] | 5 |
Sample Input #2
Students can be asked to complete the table below for the following input:
lists: [[2, 2, 2], [2, 2, 2], [2, 2, 2]]
k: 6
Iteration No. | Line No. | minHeap | minHeap Pop: number,i,list | numberCount |
1 | 8-9 | [(2, 0, [2, 2, 2])] | - | - |
2 | 8-9 | [(2, 0, [2, 2, 2]), (2, 0, [2, 2, 2])] | - | - |
3 | 8-9 | [(2, 0, [2, 2, 2]), (2, 0, [2, 2, 2]), (2, 0, [2, 2, 2])] | - | - |
1 | 14-15 | [(2, 0, [2, 2, 2]), (2, 0, [2, 2, 2])] | 2,0,[2, 2, 2] | 1 |
1 | 16-20 | [(2, 0, [2, 2, 2]), (2, 0, [2, 2, 2]), (2, 1, [2, 2, 2])] | 2,0,[2, 2, 2] | 1 |
2 | 14-15 | [(2, 0, [2, 2, 2]), (2, 1, [2, 2, 2])] | 2,0,[2, 2, 2] | 2 |
2 | 16-20 | [(2, 0, [2, 2, 2]), (2, 1, [2, 2, 2]), (2, 1, [2, 2, 2])] | 2,0,[2, 2, 2] | 2 |
Step-by-step solution construction
Initially, we need to add the first element of each list to the minHeap
.
In the following code, on line 12, we take the first element of the list lists[i]
and push it on to minHeap
along with its index, 0
, in the lists[i]
, and the lists[i]
itself.
from heapq import *def find_Kth_smallest(lists, k):print("Looking for the " + str(k) + "th smallest number in " + str(lists))minHeap = []# put the 1st element of each list in the min heapprint("For each list: push the 1st element, its index and the list iteself on to the min heap")for i in range(len(lists)):heappush(minHeap, (lists[i][0], 0, lists[i]))print("\tmin heap: " + str(minHeap))return -1def main():lists = [[[2, 5, 8], [3, 6, 7], [1, 3, 4]],[[2,2,2], [3, 3, 3], [2, 3, 4],[1,2]]]targets = [7,4]for i in range(len(lists)):result = find_Kth_smallest(lists[i], targets[i])print("\nThe "+str(targets[i])+"th smallest number in "+ str(lists[i]) + " is : " + str(result))print("-"*100)main()
The next step is to pop minHeap
to get the lowest element. So far, we have added elements, from each of the input lists.
When we perform the first pop, we will get the first minimum of the provided sorted lists.
We can then push the next element from the same array, to which this smallest element belonged, to the heap. If we continue this pop-push pattern, we can do a sorted traversal of all elements. The code below does this job for us:
from heapq import *def find_Kth_smallest(lists, k):print("Looking for the " + str(k) + "th smallest number in " + str(lists))minHeap = []# put the 1st element of each list in the min heapprint("For each list: push the 1st element, its index and the list iteself on to the min heap")for i in range(len(lists)):heappush(minHeap, (lists[i][0], 0, lists[i]))print("\tMin heap: " + str(minHeap))print("Set-up complete. About to start pop-and-push cycle:\n")# take the smallest(top) element form the min heapnumber = 0while minHeap:number, i, list = heappop(minHeap)print("\tPopped from the heap: ", end="")print("\tnumber, i, list: " + str(number) + ", " + str(i) + ", " + str(list))# if the array of the top element has more elements, add the next element to the heapif len(list) > i+1:heappush(minHeap, (list[i+1], i+1, list))print("\tMin heap after push: " + str(minHeap))return numberdef main():lists = [[[2, 5, 8], [3, 6, 7], [1, 3, 4]],[[2,2,2], [3, 3, 3], [2, 3, 4],[1,2]]]targets = [7,4]for i in range(len(lists)):result = find_Kth_smallest(lists[i], targets[i])print("\nThe "+str(targets[i])+"th smallest number in "+ str(lists[i]) + " is : " + str(result))print("-"*100)main()
Line 17 pops the minimum number, along with its index
in the list, and then the list
itself. Line 22 pushes the next element from the list to which the popped element belonged, on to minHeap
.
The code above returns the largest number in sorted arrays as we keep pushing the next numbers and returns only when we have popped all the elements of the provided arrays. We can modify the code to return when we have popped the element from minHeap
. We can do this by keeping track of the number of pops from minHeap
.
In the code below, on line 21, we increment numberCount
to keep track of the pops so far. We break out of the loop on line 25 if numberCount
equals k
.
from heapq import *def find_Kth_smallest(lists, k):minHeap = []print("Looking for the " + str(k) + "th smallest number in " + str(lists))# push the 1st element of each list on to the min heapprint("For each list: put the 1st element, index and list iteself in the min heap")for i in range(len(lists)):heappush(minHeap, (lists[i][0], 0, lists[i]))print("\tmin heap: " + str(minHeap))print("Set-up complete. About to start pop-and-push cycle:\n")# take the smallest(top) element form the min heap, if the running count is equal to k return the numbernumberCount, number = 0, 0while minHeap:number, i, list = heappop(minHeap)print("Iteration: "+ str(numberCount+1))print("\tPopped from the heap: ", end="")print("\tnumber, i, list: " + str(number) + ", " + str(i) + ", " + str(list))numberCount += 1print("\t\tnumberCount: " + str(numberCount))if numberCount == k:print("Solution FOUND: ", str(number), ". Popped ", str(k), " times.", sep="")break# if the array of the top element has more elements, add the next element to the heapif len(list) > i+1:heappush(minHeap, (list[i+1], i+1, list))print("\tMin heap after push: " + str(minHeap))return numberdef main():lists = [[[2, 5, 8], [3, 6, 7], [1, 3, 4]],[[2,2,2], [3, 3, 3], [2, 3, 4],[1,2]]]targets = [7,4]for i in range(len(lists)):result = find_Kth_smallest(lists[i], targets[i])print("\n"+str(targets[i])+"th smallest number in "+ str(lists[i]) + " is : " + str(result))print("-"*100)main()