The Essential k-NN Algorithm
Learn about the k-NN algorithm and its implementation using different modules.
We'll cover the following...
We'll cover the following...
Overview
We can summarize the -NN algorithm as having the following steps:
- Create a list of all (distance, training sample) pairs.
- Sort these in ascending order.
- Pick to the first , which will be the k nearest neighbors.
- Chose the mode (the highest frequency) label for the nearest neighbors.
The implementation would look like this:
While clear, this does accumulate a large number of distance values in the distances list object, when only are actually needed. The sorted() function consumes the source generator and creates a (potentially large) list of intermediate values.
One of the high-cost parts of this specific -NN algorithm is sorting the entire set of training data after the distances have been computed. We summarize the complexity with the description as an operation. A way to avoid cost is to avoid sorting the entire set of distance computations.
Steps ...