Solution: Top K Frequent Elements

Let's solve the Top K Frequent Elements problem using the Top K Elements pattern.

We'll cover the following

Statement
Solution
- Naive approach
- Optimized approach using top k elements

Statement

Given an array of integers, arr, and an integer, k, return the $k$ most frequent elements.

Note: You can return the answer in any order.

Constraints:

$1$ $\leq$ arr.length $\leq$ $10^{3}$
$-10^{-4}$ $\leq$ arr[i] $\leq$ $10^{4}$
1 $\leq$ k $\leq$ number of unique elements in an array.

Solution

So far, you’ve probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.

Naive approach

The naive approach to finding the k most frequent elements involves iterating through each element in the input array and storing their frequencies in a hash map. For each element in the array, the algorithm checks if the element is already in the hash map. If it is not in the hash map, it is added with a frequency of 1. Otherwise, its frequency is incremented.

After counting the frequencies of each element in the array, the algorithm searches through the hash map for the element with the highest frequency. This involves iterating through the hash map, which takes $O(n)$ time. Once the element with the highest frequency is found, it is removed from the hash map and added to a final list of the k most frequent elements.

The time complexity for this approach is $O(k*n)$ , and the space complexity is $O(n+k)$ , where $n$ is the length of the input array.

Optimized approach using top k elements

We can optimize the previous approach by using the top k elements pattern, which is a useful pattern for finding the top k elements of a sequence or collection based on a certain criterion. We first need to know the frequency of each element. We can do this using a hash map. Once we have the frequency map, the min-heap is the best data structure to keep track of the top k frequencies. After storing the frequencies of the elements, we’ll iterate through the hash map, insert each element into our heap, and find the top k frequent elements.

The hash map will store the element as the key, and its corresponding frequency in the array as the value. When inserting elements from the hash map into the min-heap, the following steps are taken:

We’ll store a pair $(a, b)$ as a tuple (this can be in the form of any data structure like an array or a tuple) in the heap where $a$ will be the frequency of the element, and $b$ will be the element itself. This ensures that the elements in the min-heap are always sorted by frequency.
We’ll make sure that if the size of the min-heap becomes greater than k, that is, there are more than k elements in the min-heap, we pop the pair with the lowest frequency element from the min-heap. This ensures that we always have the k maximum frequent elements in the min-heap.

Once we have added the pairs from the hash map to the min-heap, the min-heap will have the pairs with the top k frequencies. We will traverse the min-heap and add the second element of each pair from the min-heap to this new array. This is done since we only need the top k frequent elements, not their respective frequencies. Finally, we will return this new array.

The illustration below shows the whole process:

Level up your interview prep. Join Educative to access 70+ hands-on prep courses.

Getting Started

Two Pointers

Fast and Slow Pointers

Sliding Window

Merge Intervals

In-place Reversal of a Linked List

Two Heaps

K-way merge

Top K Elements

Modified Binary Search

Subsets

Greedy Techniques

Backtracking

Dynamic Programming

Cyclic Sort

Topological Sort

Matrices

Stacks

Graphs

Tree Depth-First Search

Tree Breadth-First Search

Trie

Hash Maps

Knowing What to Track

Union Find

Custom Data Structures

Bitwise Manipulation

Challenge Yourself

Conclusion

Solution: Top K Frequent Elements

Statement

Solution

Naive approach

Optimized approach using top k elements