Visualize the Working of K-Nearest Neighbors
Learn to visualize the working principle behind k-nearest neighbors.
Let's move on and put what we have learned so far into practice. As always, we start by importing some basic libraries.
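The lesson does not list the exact imports, but a typical set for this kind of visualization exercise would look like this (NumPy and pandas for data handling, Matplotlib and seaborn for plotting, scikit-learn for the dataset and the KNN model):

```python
# Assumed imports for this lesson; the original code widget is not shown
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import make_biclusters
from sklearn.neighbors import KNeighborsClassifier
```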
Let's generate a dataset with two classes and see how the KNN algorithm assigns a class to any new data point.
The dataset
We can use make_biclusters() from scikit-learn to create a simple dataset with two features (columns) and 50 observations (data points). We can also add Gaussian noise while creating clusters and assign them a class. Let's do this.
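A minimal sketch of this step is below. `make_biclusters()` returns the data matrix plus boolean row- and column-membership masks; here we treat membership in the first row cluster as the class label. The `noise` level and `random_state` are illustrative assumptions, not values from the original lesson:

```python
import numpy as np
from sklearn.datasets import make_biclusters

# 50 observations, 2 features, 2 clusters; `noise` adds Gaussian noise
X, rows, _ = make_biclusters(
    shape=(50, 2), n_clusters=2, noise=5, random_state=42
)

# `rows` is a (2, 50) boolean membership mask;
# use membership in cluster 0 as the class label (assumed labeling scheme)
y = rows[0].astype(int)

print(X.shape, y.shape)  # (50, 2) (50,)
```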
Let's check the class distribution.
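One way to check the distribution, assuming the data and labels are collected into a DataFrame (the column names `feature_1`, `feature_2`, and `target` are assumptions for illustration):

```python
import pandas as pd
from sklearn.datasets import make_biclusters

# Recreate the assumed dataset from the previous step
X, rows, _ = make_biclusters(
    shape=(50, 2), n_clusters=2, noise=5, random_state=42
)

df = pd.DataFrame(X, columns=["feature_1", "feature_2"])
df["target"] = rows[0].astype(int)

# Count how many observations fall in each class
print(df["target"].value_counts())
```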
As seen from the code output above, we now have a dataset with two features and a target column.
Visualize the training and test data
Let's create a scatterplot and visualize the distribution of the data points, using the hue parameter to show the classes in different colors. In a second plot (right side), we can add a test point whose class is unknown and that we want KNN to classify.
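The side-by-side plots described above can be sketched as follows. The location of the unknown test point is not given in the lesson, so here it is placed at the feature means purely for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.datasets import make_biclusters

# Recreate the assumed dataset
X, rows, _ = make_biclusters(
    shape=(50, 2), n_clusters=2, noise=5, random_state=42
)
df = pd.DataFrame(X, columns=["feature_1", "feature_2"])
df["target"] = rows[0].astype(int)

# Illustrative unknown point (assumed position: the feature means)
test_point = (df["feature_1"].mean(), df["feature_2"].mean())

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4), sharex=True, sharey=True)

# Left: training data colored by class via `hue`
sns.scatterplot(data=df, x="feature_1", y="feature_2", hue="target", ax=ax1)
ax1.set_title("Training data")

# Right: same data plus the unknown test point as a red star
sns.scatterplot(data=df, x="feature_1", y="feature_2", hue="target", ax=ax2)
ax2.scatter(*test_point, marker="*", s=300, color="red", label="unknown point")
ax2.legend()
ax2.set_title("Training data with an unknown test point")

fig.savefig("knn_data.png")
```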
The red star is a new, unknown data point whose class we want our KNN algorithm to predict, and for this purpose, we need to perform the following ...