A Support Vector Machine (SVM) is a simple, supervised machine learning algorithm that can be used for both classification and regression problems.
In this shot, we will implement an SVM classifier using the scikit-learn library.
We will use the digits dataset to train the SVM classifier model from scikit-learn. We split the data into train and test sets (a 70-30 split) to check that the classification algorithm generalizes well to unseen data.
The trained model uses its learned parameters to classify each sample into one of ten classes, that is, the digits 0 to 9.
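To make the dataset and the 70-30 split concrete, here is a minimal sketch that only loads the data and inspects it before any training. The variable name digits is chosen for illustration; the split arguments are the same ones used in the full program.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

# Load the digits dataset: 8x8 grayscale images flattened to 64 features each
digits = load_digits()
print(digits.data.shape)          # (1797, 64)
print(np.unique(digits.target))   # the ten class labels, 0 through 9

# Same 70-30 split as in the full program (test_size=0.30, random_state=4)
x_train, x_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.30, random_state=4
)
print(len(x_train), len(x_test))  # 1257 540
```

Fixing random_state makes the split reproducible, so repeated runs train and evaluate on the same samples.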
# Importing the necessary libraries
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score

# Importing the dataset from the sklearn library into a local variable called dataset
dataset = load_digits()

# Splitting the dataset into train (70%) and test (30%) sets.
# x_train, y_train are training data and labels respectively
# x_test, y_test are testing data and labels respectively
x_train, x_test, y_train, y_test = train_test_split(dataset.data, dataset.target, test_size=0.30, random_state=4)

# Making the SVM classifier
Classifier = SVC(kernel="linear")

# Training the model on the training data and labels
Classifier.fit(x_train, y_train)

# Using the model to predict the labels of the test data
y_pred = Classifier.predict(x_test)

# Evaluating the accuracy of the model using the sklearn functions
accuracy = accuracy_score(y_test, y_pred) * 100
confusion_mat = confusion_matrix(y_test, y_pred)

# Printing the results
print("Accuracy for SVM is:", accuracy)
print("Confusion Matrix")
print(confusion_mat)
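The code above works as follows:
We import the load_digits dataset loader from the sklearn library, the train_test_split function from sklearn to split the data into train and test samples, the SVC classifier from sklearn, and the sklearn-provided confusion_matrix and accuracy_score functions.
We load the digits dataset into a local variable called dataset.
We split the data so that x_train and y_train contain the training data and labels respectively, while x_test and y_test contain the testing data and labels.
We create an SVM classifier called Classifier using a linear kernel.
We train Classifier on the training data and use it to predict the labels of the test data.
We use the accuracy_score function with the true and predicted labels to find the accuracy of the model. We multiply by 100 to express the accuracy as a percentage.
The SVM classifier we defined above gives an accuracy of about 98% on the digits dataset. In the confusion matrix, nearly all of the counts fall on the main diagonal, which means that most test digits are assigned to their true class.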
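As a rough illustration of how the confusion matrix can be read, the sketch below retrains the same linear-kernel classifier and derives two numbers from the matrix: the overall accuracy (the sum of the diagonal divided by the total) and the fraction of each digit that was classified correctly. The names model, cm, overall_accuracy, and per_class_recall are chosen for illustration and are not part of the original program.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Same data, split, and classifier settings as the program above
dataset = load_digits()
x_train, x_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.30, random_state=4
)
model = SVC(kernel="linear").fit(x_train, y_train)
y_pred = model.predict(x_test)

# Rows are the true digits, columns are the predicted digits
cm = confusion_matrix(y_test, y_pred)

# Correct predictions lie on the diagonal, so diagonal sum / total = accuracy
overall_accuracy = np.trace(cm) / cm.sum()
print(f"Accuracy from confusion matrix: {overall_accuracy:.2%}")

# Fraction of each true digit that was predicted correctly (per-class recall)
per_class_recall = cm.diagonal() / cm.sum(axis=1)
for digit, recall in enumerate(per_class_recall):
    print(f"digit {digit}: {recall:.2%} classified correctly")
```

Any off-diagonal entry cm[i, j] counts test images of digit i that were predicted as digit j, which shows which pairs of digits the model confuses.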