Statement▼
You are given two integers, m and k, and a stream of integers. Your task is to design and implement a data structure that efficiently calculates the MK Average for the stream.
To compute the MK Average, follow these steps:
Stream length check: If the stream contains fewer than
melements, return-1as the MK Average.Window selection: Otherwise, copy the last
melements of the stream to a separate container and remove the smallestkelements and the largestkelements from the container.Average calculation: Calculate the average of the remaining elements (rounded down to the nearest integer).
Implement the MKAverage class
MKAverage(int m, int k): Initializes the object with integersmandkand an empty stream.void addElement(int num): Adds the integernumto the stream.int calculateMKAverage(): Returns the current MK Average for the stream as described above, or-1if the stream contains fewer thanmelements.
Constraints:
3<= m<=105 1< k*2<m 1<= num<=105 103 calls will be made toaddElementandcalculateMKAverage, at most.
Solution
The algorithm is designed to maintain a moving window of the last m integers from a data stream and compute the average of the middle m - 2k elements, discarding the smallest and largest k elements. It achieves this by using a deque (window) to store elements in insertion order, and three sorted lists—low, mid, and high—to track the smallest k, middle m - 2k, and largest k elements, respectively. Only the mid values contribute to the final average, and their running total is maintained through a variable called mid sum for constant-time average computation.
When a new element is added and the total count is still less than m, it is placed into mid by inserting it at the correct position to keep mid sorted, contributing to mid sum. Once the window reaches size m, a copy of the elements is taken from the deque, sorted into a new list, and then divided into low, mid, and high segments, with mid sum initialized to the sum of the mid list.
For all subsequent insertions:
The oldest element is removed from the window and also from one of the three lists (low, mid, or high) using binary search to locate its position.
If it is removed from mid, mid sum is updated accordingly.
The new element is then inserted into the correct list based on its value relative to the largest element in low or the smallest in high.
If it lies between these boundaries, it goes into the mid and contributes to the mid sum.
After insertion and deletion, the algorithm rebalances the three lists:
If low exceeds size k, its largest value is moved to mid and added to mid sum. If smaller than k, the smallest value from mid is moved to low and subtracted from mid sum.
The same rebalancing logic is applied between mid and high: if high is too large, its smallest element is moved to mid and added to mid sum; if it is too small, the largest value from mid is moved to high and subtracted from the mid sum.
Finally, the average calculation method returns -1 if fewer than m elements have been added; otherwise, it returns the integer division of mid sum by (m - 2k), providing the required MK Average.
Using three lists is more efficient because each list has a fixed role: one always holds the smallest k elements, one the largest k, and one the middle values that matter for the average. This separation makes it easy to know exactly where an element belongs when adding or removing numbers, without scanning or reshuffling the entire window. It also allows the running sum of the middle values to be updated directly whenever elements move in or out. In contrast, using a single list with boundary pointers would require more work to maintain order and update the middle sum, making the process slower and more complicated.
The steps of the algorithm are as follows:
Constructor: The constructor initializes:
The window size
mand cutoffk.A
deque(window) to hold the most recentmnumbers.Three sorted lists:
low(k smallest),mid(middlem-2k), andhigh(k largest).An integer
mid_sumto quickly compute the sum of the middle section.
addElement(num):
If the window is not yet full (
len(window) < m):Simply insert
numintomidusingbisect.insort.Increment
mid_sumbynum.
If the window just reached full size (
len(window) == m):Copy and sort the
melements of thewindowinnums, then partition into three ranges:low= firstk(smallest),mid= nextm−2k,high= lastk(largest).Calculate
mid_sumas the sum of themidlist.
If the window is already full (
len(window) > m):Removal of the oldest element: Remove the oldest element
oldfrom the window and identify which list (low,mid, orhigh) it belongs to using binary search. Once located, remove it from the respective list and, if it was inmid, subtract its value frommid_sum.Insertion of the new element: Insert a number
numinto the correct list, compare it against the boundary values of the current partitions. If it’s less than or equal to the largest element inlow, insert it intolow; if it’s greater than or equal to the smallest inhigh, insert it intohigh. Otherwise, place it inmidand updatemid_sumaccordingly.Rebalancing: After inserting or removing elements, the lists
low,mid, andhighmight become unbalanced. The goal is to ensure that:lowalways contains exactlyksmallest elements,highalways contains exactlyklargest elements, andmidcontains the remainingm - 2kelements.Rebalancing
low:If
lowcontains more thankelements, remove its largest element (the last item), insert it intomidwhile maintaining sorted order, and add its value tomid_sum.If
lowcontains fewer thankelements, remove the smallest element (the first item) frommid, insert it intolowwhile maintaining sorted order, and subtract its value frommid_sum.
Rebalancing
high:If
highcontains more thankelements, remove its smallest element (the first item), insert it intomidwhile maintaining sorted order, and add its value tomid_sum.If
highcontains fewer thankelements, remove the largest element (the last item) frommid, insert it intohighwhile maintaining sorted order, and subtract its value frommid_sum.
calculateMKAverage():
If there are not yet
melements, return-1.Otherwise, return the integer division of
mid_sumby(m-2k), which is the number of elements in themidlist.
Let’s look at the following illustration to get a better understanding of the solution: