
Model Training Using Unscaled Data

Explore the process of training a K-Nearest Neighbors model using unscaled features. Learn to separate data, create a training set, fit the model with an initial neighbor value, and evaluate its performance with confusion matrix and classification metrics. Understand the impact of unscaled data on model accuracy and prepare for optimizing parameters.

Let's move on and separate the features and the target into X and y, and then split the data into training (X_train, y_train) and test (X_test, y_test) sets using train_test_split().

Python 3.8
from sklearn.model_selection import train_test_split

# Separating features and the target
X = df.drop('Result', axis=1)  # features in X
y = df['Result']               # target/labels in y
print("features are in X and the target is in y now!")

# Splitting data
test_size = 0.30
random_state = 42
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=test_size, random_state=random_state)
print("train and test data sets are ready with test_size = {} and "
      "random_state = {}".format(test_size, random_state))

Since we have our data ready, let's train a model.

Model training on unscaled data

Our focus is to develop a model that can predict the class in the Result column for any new data point. For the KNN algorithm, the ...
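The training-and-evaluation loop described in this section can be sketched as follows. This is a minimal, illustrative example: the synthetic X and y below stand in for the dataset's features and its Result column (an assumption for self-containment), and 5 is just one possible initial neighbor value, not the tuned choice.

```python
# Hedged sketch: fit a KNN classifier on unscaled data and evaluate it
# with a confusion matrix and classification metrics.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix, classification_report

# Synthetic stand-in for the real features/target (assumption for illustration)
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))             # unscaled features
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # binary target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42)

# Fit KNN with an initial neighbor value (5 here is a placeholder)
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Evaluate the predictions on the held-out test set
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
```

On real, unscaled features with very different ranges, distance-based KNN tends to be dominated by the largest-magnitude columns, which is exactly the effect the next steps set out to measure and then fix with scaling.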