Solution: K Closest Points to Origin

Let's solve the K Closest Points to Origin problem using the Top K Elements pattern.

We'll cover the following

Statement
Solution
- Naive approach
- Optimized approach using top K elements:

Statement

Given a list of points on a plane, represented as a 2-D array of $(x, y)$ coordinates, find the $k$ closest points to the origin $(0, 0)$.

Note: Here, the distance of a point $(x, y)$ from the origin is the Euclidean distance: $\sqrt{x^2 + y^2}$

Constraints:

$1 \leq$ k $\leq$ points.length $\leq 10^3$
$-10^4 <$ x[i], y[i] $< 10^4$

Solution

So far, you’ve probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.

Naive approach

When thinking about how to solve this problem, it may help to solve a simpler problem first: find the single point closest to the origin. This involves a linear scan through the unsorted list of points. At each step, we compare the closest point discovered so far with the current point from the list, and whichever is closer to the origin continues as the candidate solution. This has a runtime complexity of $O(n)$, as opposed to the naive solution of sorting the points by distance from the origin and picking the first one, which has a complexity of $O(n \log n)$.
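The simpler problem can be sketched in a few lines of Python. This is only an illustration: the function name `closest_point` and the representation of points as `(x, y)` tuples are assumptions, not part of the original lesson.

```python
import math


def closest_point(points):
    """Return the point nearest the origin with one pass over the list.

    `points` is assumed to be a non-empty list of (x, y) pairs.
    """
    best = points[0]
    for point in points[1:]:
        # Keep whichever of the two points is closer to the origin.
        if math.hypot(point[0], point[1]) < math.hypot(best[0], best[1]):
            best = point
    return best
```

Each point is examined exactly once, which is where the $O(n)$ bound comes from.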

For this reason, when extending the solution to the $k$ closest points to the origin, we’d ideally like to do one scan through the list of points. However, if we have to compare the point under consideration against the entire current set of $k$ closest points at each step, we’ll end up with a time complexity of $O(n \cdot k)$.
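To make the cost of this approach concrete, here is a minimal sketch of the single scan with an unorganized candidate set. The name `k_closest_scan` and the tuple representation of points are hypothetical, chosen only for this example.

```python
import math


def k_closest_scan(points, k):
    """Keep the k closest points in a plain list (assumes 1 <= k <= len(points))."""

    def dist(p):
        return math.hypot(p[0], p[1])

    closest = list(points[:k])
    for point in points[k:]:
        # O(k) work per step: find the farthest of the current candidates.
        farthest_index = max(range(k), key=lambda i: dist(closest[i]))
        if dist(point) < dist(closest[farthest_index]):
            closest[farthest_index] = point
    return closest
```

The inner `max` call walks all $k$ candidates for each of the remaining points, which is the source of the $O(n \cdot k)$ bound.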

Optimized approach using top K elements:

Through a little organization of our set of $k$ closest points, there is a way to reduce the number of comparisons at each step: By storing these points in a max-heap that is ordered by distance from the origin, we get $O(1)$ access to the point, among these $k$ points, that is farthest from the origin.

Now, instead of comparing all $k$ points with the next point from the list, we simply compare the point in the max-heap that is farthest from the origin with the next point from the list. If the next point is closer to the origin, it wins inclusion in the max-heap and ejects the point it was compared with. If not, nothing changes.

In this way, at every step of the scan through the list, the max-heap acts like a sieve, retaining the $k$ points closest to the origin among those seen so far.

The time complexity is also much better than $O(n \cdot k)$: each replacement in the heap costs $O(\log k)$, so the whole scan takes $O(n \log k)$.
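A single step of this sieve might look like the sketch below. It assumes Python's `heapq` module, which only provides a min-heap, so each entry stores the negated squared distance to make the farthest point sit at the root. The function name `offer` and the `(-squared_distance, point)` entry layout are assumptions made for this example.

```python
import heapq


def offer(heap, point):
    """Offer one candidate point to a heap that already holds k entries.

    Each entry is (-squared_distance, point), so heap[0] is the entry for
    the farthest of the k points currently kept.
    """
    x, y = point
    negated = -(x * x + y * y)
    # A larger negated distance means the candidate is closer to the origin
    # than the farthest kept point.
    if negated > heap[0][0]:
        # Pop the farthest point and push the candidate in one O(log k) call.
        heapq.heapreplace(heap, (negated, point))
```

Reading `heap[0]` is the $O(1)$ peek at the farthest point, and the replacement is the only operation that costs $O(\log k)$.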

The Euclidean distance between a point P(x, y) and the origin can be calculated using the following formula:

$\sqrt{x^2 + y^2}$
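As a small sketch, the formula maps directly onto Python's `math.hypot`. Because the square root is monotonic, comparing $x^2 + y^2$ gives the same ordering of points as comparing the distances themselves, so the root can be skipped when only the ranking matters. Both helper names here are hypothetical.

```python
import math


def distance_from_origin(x, y):
    """Euclidean distance of (x, y) from (0, 0)."""
    return math.hypot(x, y)


def squared_distance_from_origin(x, y):
    """Same ordering as the true distance, without the square root."""
    return x * x + y * y
```

For example, the point $(3, 4)$ has distance $5$ and squared distance $25$.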

Given that we can now calculate the distance between $(0, 0)$ and all the points, how will we find the $k$ nearest points?

As discussed above, the heap data structure is ideal for this purpose. We’ll use a custom comparison function to define the order of the elements in the heap. Since we plan to populate the heap with coordinate pairs, we’ll define a class Point and implement a custom less-than comparison in it for use by the heapify process. In this function, we compare the distances of the two points from the origin. The point closer to the origin is considered less than the other point. We first fill the max-heap with the first $k$ points. We then iterate through the remaining points, and whenever we find one that is closer to the origin than the point at the root of the max-heap, we do the following two things:

Pop from the max-heap—that is, remove the point in the heap farthest from the origin.
Push the point that is closer to the origin onto the max-heap.

As we move through the given array of points, this will ensure that we always have the $k$ points in our heap that are the closest to the origin.
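The steps above can be sketched in Python as follows. One caveat: Python's `heapq` is a min-heap, so this sketch deliberately inverts the less-than comparison (the farther point is treated as "smaller") to keep the farthest point at the root. That differs from the ordering described in the text, which suits a library with a true max-heap. The method name `distance_squared` and the function name `k_closest` are assumptions for this example.

```python
import heapq


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def distance_squared(self):
        # Squared distance preserves the ordering of the true distance.
        return self.x * self.x + self.y * self.y

    def __lt__(self, other):
        # Inverted on purpose: heapq is a min-heap, so treating the farther
        # point as "smaller" keeps it at the root, emulating a max-heap.
        return self.distance_squared() > other.distance_squared()


def k_closest(points, k):
    """Return the k points closest to the origin (assumes 1 <= k <= len(points))."""
    heap = points[:k]          # seed the heap with the first k points
    heapq.heapify(heap)
    for point in points[k:]:
        # heap[0] is the farthest of the k points kept so far.
        if point.distance_squared() < heap[0].distance_squared():
            heapq.heappop(heap)          # remove the farthest point
            heapq.heappush(heap, point)  # push the closer point
    return heap
```

The returned list is in heap order rather than sorted by distance. Under these assumptions the scan runs in $O(n \log k)$ time and uses $O(k)$ extra space for the heap.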

Below is an illustration of this process.
