The **k-nearest neighbors (KNN)** algorithm is a supervised machine learning algorithm, most commonly used for classification.

KNN assumes that similar things exist in close proximity. In data science terms, this means that similar data points lie near one another. KNN quantifies this similarity by computing the distance between points in feature space.

- The algorithm calculates the distance from a new data point to every training data point. Any distance metric can be used, e.g., Euclidean or Manhattan distance.
- The algorithm then selects the k nearest data points, where **k** can be any positive integer. The selection is based purely on proximity, regardless of which features the numerical values represent.
- Finally, it assigns the new data point to the class most common among those nearest neighbors.
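The steps above can be sketched from scratch in a few lines. This is a minimal illustration, not a production implementation: it assumes NumPy arrays, Euclidean distance, and a simple majority vote (the function name `knn_predict` and the toy data are hypothetical).

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    # Step 1: Euclidean distance from the new point to every training point
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Step 2: indices of the k nearest training points
    nearest = np.argsort(distances)[:k]
    # Step 3: majority vote among the neighbors' labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Toy data: two tight clusters, labeled 0 and 1
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.9]])
y_train = np.array([0, 0, 1, 1])

print(knn_predict(X_train, y_train, np.array([1.1, 0.9])))  # → 0
```

The new point `[1.1, 0.9]` sits inside the first cluster, so its three nearest neighbors vote 0.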

To apply the algorithm, first pick a value for *k* (e.g., 3).
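The choice of *k* affects the result: a very small *k* is sensitive to noise, while a very large *k* smooths over class boundaries. A quick way to compare candidates is to score each on held-out data; the sketch below does this on the *Iris* dataset with scikit-learn (the particular *k* values tried here are an arbitrary choice for illustration).

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Fit one classifier per candidate k and record its test accuracy
scores = []
for k in (1, 3, 5, 7, 9):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    scores.append(knn.score(X_test, y_test))
    print(f"k={k}: accuracy={scores[-1]:.3f}")
```

In practice, cross-validation gives a more reliable estimate than a single train/test split.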

The `KNeighborsClassifier` class can be imported from the `sklearn` library. It takes `n_neighbors` as a parameter, which specifies the value of *k*. The example below demonstrates the algorithm on the *Iris* dataset.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

# Loading data
irisData = load_iris()

# Create feature and target arrays
X = irisData.data
y = irisData.target

# Split into training and test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

knn = KNeighborsClassifier(n_neighbors=7)  # k = 7
knn.fit(X_train, y_train)

# Calculate the accuracy of the model
print("Accuracy:", knn.score(X_test, y_test))
```
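Once fitted, the classifier can label unseen samples with `predict`. The sketch below trains on the full *Iris* data and classifies one flower; the measurements chosen here match a typical *Iris setosa* sample and are only an illustration.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

irisData = load_iris()
knn = KNeighborsClassifier(n_neighbors=7).fit(irisData.data, irisData.target)

# Sepal length/width and petal length/width in cm, setosa-like values
sample = [[5.1, 3.5, 1.4, 0.2]]
print(irisData.target_names[knn.predict(sample)[0]])  # → setosa
```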

Copyright ©2024 Educative, Inc. All rights reserved
