Classification using SVM, KNN, RandomForestClassifier, and PCA

Explore how to build a classification web app using SVM, KNN, and RandomForestClassifier models with Streamlit and sklearn. Learn to load datasets, tune models, apply PCA for 2D and 3D visualization, and evaluate model performance with training and testing accuracy scores.

Helper functions

Let’s create some helper functions to load the datasets and models.

Function to get the dataset

Let’s create a function named return_data() that loads the dataset the user selects and splits it into training and test sets.

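import pandas as pd
from sklearn.datasets import load_breast_cancer, load_iris, load_wine
from sklearn.model_selection import train_test_split

def return_data(dataset):
    # Load the dataset that matches the user's selection
    if dataset == 'Wine':
        data = load_wine()
    elif dataset == 'Iris':
        data = load_iris()
    else:
        data = load_breast_cancer()
    # Build a DataFrame of the features plus the class label, for display in the UI
    df = pd.DataFrame(data.data, columns=data.feature_names)
    df['Type'] = data.target
    # Hold out 20% of the samples for testing; random_state makes the split reproducible
    X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, random_state=1, test_size=0.2)
    return X_train, X_test, y_train, y_test, df, data.target_names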
  • The function return_data(dataset) takes a string that contains the name of the dataset the user selects.
  • It loads the relevant dataset.
  • We create a DataFrame df that we can show in our UI.
  • We use sklearn’s train_test_split()
...
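To see what the helper returns, here is a minimal sketch that calls it once per dataset and prints the resulting shapes. The function body is repeated only so the snippet runs on its own, and the label 'Breast Cancer' is an assumption: the function treats any string other than 'Wine' or 'Iris' as the breast cancer dataset.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer, load_iris, load_wine
from sklearn.model_selection import train_test_split


def return_data(dataset):
    # Same logic as the helper described above, repeated so this sketch is self-contained.
    if dataset == 'Wine':
        data = load_wine()
    elif dataset == 'Iris':
        data = load_iris()
    else:
        data = load_breast_cancer()
    df = pd.DataFrame(data.data, columns=data.feature_names)
    df['Type'] = data.target
    X_train, X_test, y_train, y_test = train_test_split(
        data.data, data.target, random_state=1, test_size=0.2
    )
    return X_train, X_test, y_train, y_test, df, data.target_names


# 'Breast Cancer' is an assumed label; any string other than 'Wine' or 'Iris'
# falls through to the else branch.
shapes = {}
for name in ['Wine', 'Iris', 'Breast Cancer']:
    X_train, X_test, y_train, y_test, df, target_names = return_data(name)
    shapes[name] = (X_train.shape, X_test.shape, df.shape)
    print(name, X_train.shape, X_test.shape, df.shape, list(target_names))
```

For Iris, which has 150 samples and 4 features, the 80/20 split gives 120 training rows and 30 test rows, and df has one extra column, Type, holding the class label.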