Classification using SVM, KNN, RandomForestClassifier, and PCA
Explore how to build a classification web app using SVM, KNN, and RandomForestClassifier models with Streamlit and sklearn. Learn to load datasets, tune models, apply PCA for 2D and 3D visualization, and evaluate model performance with training and testing accuracy scores.
Helper functions
Let’s create some helper functions to load the datasets and models.
Function to get the dataset
Let’s create a function named return_data() that loads the dataset the user selects and splits it into training and testing sets.
def return_data(dataset):
    if dataset == 'Wine':
        data = load_wine()
    elif dataset == 'Iris':
        data = load_iris()
    else:
        data = load_breast_cancer()
    df = pd.DataFrame(data.data, columns=data.feature_names, index=None)
    df['Type'] = data.target
    X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, random_state=1, test_size=0.2)
    return X_train, X_test, y_train, y_test, df, data.target_names
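To try the function on its own, outside the Streamlit app, a minimal sketch might look like the following. The import lines are an assumption, since they are not shown in this section; they are the standard pandas and sklearn locations for the names the function uses. The variable name names is only illustrative.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer, load_iris, load_wine
from sklearn.model_selection import train_test_split


def return_data(dataset):
    # Same logic as the function above; any name other than
    # 'Wine' or 'Iris' falls through to the breast cancer dataset.
    if dataset == 'Wine':
        data = load_wine()
    elif dataset == 'Iris':
        data = load_iris()
    else:
        data = load_breast_cancer()
    df = pd.DataFrame(data.data, columns=data.feature_names)
    df['Type'] = data.target
    X_train, X_test, y_train, y_test = train_test_split(
        data.data, data.target, random_state=1, test_size=0.2
    )
    return X_train, X_test, y_train, y_test, df, data.target_names


# Quick check on Iris (150 samples, 4 features): 80% train, 20% test.
X_train, X_test, y_train, y_test, df, names = return_data('Iris')
print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)
print(df.shape)                     # (150, 5): four features plus 'Type'
print(list(names))                  # class labels for the UI
```

Because random_state is fixed at 1, the split is the same on every run, which keeps the accuracy scores comparable when the user switches between models.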
- The function return_data(dataset) takes a string that contains the name of the dataset the user selects.
- It loads the relevant dataset.
- We create a DataFrame df that we can show in our UI.
- We use sklearn’s train_test_split()