What is the k-nearest neighbor algorithm?

The k-nearest neighbors (KNN) algorithm is a supervised machine learning algorithm.

KNN assumes that similar things exist in close proximity. In data science terms, this means that similar data points lie close to each other. KNN measures similarity by calculating the distance between points in the feature space: the smaller the distance, the more similar the points.
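To make the idea of distance concrete, here is a minimal sketch of two common metrics, Euclidean and Manhattan. The function names and the sample points are illustrative assumptions, not part of any particular library.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two points of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of the absolute differences along each coordinate."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical sample points, chosen only for illustration.
p, q = (1.0, 2.0), (4.0, 6.0)
print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7.0
```

The two metrics can rank neighbors differently, so the choice of metric is part of the model.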

How it works

The algorithm calculates the distance from a new data point to every training data point. Any distance metric can be used, e.g., Euclidean or Manhattan.
The algorithm then sorts the calculated distances in ascending order.
Next, the algorithm selects the k nearest data points, i.e., the first k points in the sorted list, where k is a positive integer. The selection is based purely on proximity, regardless of what the numerical feature values represent.
Finally, the algorithm assigns the new data point to the class that is most common among those k neighbors.
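The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation: it assumes Euclidean distance (via the standard library's math.dist, available in Python 3.8+), and the function name knn_predict and the toy data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Step 1: distance from the query to every training point (Euclidean assumed).
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Step 2: sort the distances in ascending order.
    distances.sort(key=lambda pair: pair[0])
    # Step 3: keep the first k entries, i.e. the k nearest neighbors.
    nearest = distances[:k]
    # Step 4: assign the most common class among those neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters labeled "A" and "B".
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
train_labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_points, train_labels, (2.0, 2.0), k=3))  # A
print(knn_predict(train_points, train_labels, (6.0, 7.0), k=3))  # B
```

In this sketch, a tied vote goes to the tied class whose neighbor appears earliest in the sorted list. Choosing an odd k for two-class problems is a common way to avoid ties altogether.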

Advantages

Very easy to implement.
This algorithm can be used for both classification and regression.
Since it makes no assumptions about the underlying data distribution, it is very useful in cases of nonlinear data.
The algorithm can achieve relatively high accuracy when k and the distance metric suit the data.
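For regression, one common approach is to average the target values of the k nearest neighbors instead of taking a class vote. The sketch below assumes Euclidean distance and an unweighted mean; knn_regress and the sample data are hypothetical names and values used only for illustration.

```python
import math

def knn_regress(train_points, train_targets, query, k=3):
    """Predict a numeric value as the mean target of the k nearest points."""
    # Rank the training examples by their distance to the query.
    ranked = sorted(
        zip(train_points, train_targets),
        key=lambda pair: math.dist(pair[0], query),
    )
    # Average the targets of the k closest examples.
    nearest_targets = [target for _, target in ranked[:k]]
    return sum(nearest_targets) / len(nearest_targets)

# Hypothetical one-feature data set.
train_points = [(1.0,), (2.0,), (3.0,), (10.0,)]
train_targets = [10.0, 20.0, 30.0, 100.0]

print(knn_regress(train_points, train_targets, (2.0,), k=3))  # 20.0
```

The distant point at 10.0 does not affect the prediction here because it is not among the three nearest neighbors.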

Disadvantages

It is computationally expensive at prediction time, since the distance to every stored training point must be calculated.
It has high memory requirements because it stores the entire training data.
Predictions can become less accurate as the number of features (dimensions) grows.
Highly sensitive to the scale of the data.
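The sensitivity to scale can be seen in a small sketch. It assumes two features on very different scales (age in years and income in dollars) and uses min-max scaling fitted on the training data as one possible remedy. All names and numbers here are hypothetical.

```python
import math

def fit_min_max(points):
    """Return the per-feature minimum and maximum of the training points."""
    columns = list(zip(*points))
    return [min(c) for c in columns], [max(c) for c in columns]

def scale(point, lows, highs):
    """Map each feature into [0, 1] using the training minimum and maximum."""
    return tuple(
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, lo, hi in zip(point, lows, highs)
    )

def nearest_index(points, query):
    """Index of the training point closest to the query (Euclidean distance)."""
    return min(range(len(points)), key=lambda i: math.dist(points[i], query))

# Hypothetical (age in years, income in dollars) records.
data = [(25, 50_000), (60, 51_000), (30, 58_000)]
query = (27, 52_000)

# Raw features: income dominates the distance, so the 60-year-old is "nearest".
print(nearest_index(data, query))  # 1

# Scaled features: age and income contribute comparably.
lows, highs = fit_min_max(data)
scaled_data = [scale(p, lows, highs) for p in data]
scaled_query = scale(query, lows, highs)
print(nearest_index(scaled_data, scaled_query))  # 0
```

Without scaling, a difference of 1,000 dollars outweighs a difference of 33 years, which is rarely what is intended.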

Where is KNN used?

The following are some of the areas in which KNN can be applied successfully:

KNN is often used in banking systems to identify if an individual or organization is fit for a grant or a loan based on key characteristics.
KNN can be used in speech recognition, handwriting detection, image recognition, and video recognition.
In election analysis, a potential voter can be classified into categories such as “voter” or “non-voter” based on their characteristics.

RELATED TAGS

machine learning

CONTRIBUTOR

Sarvech Qadir
