Trusted answers to developer questions

Related Tags

svm
python
machine learning
communitycreator

How to implement SVM in Python using Scikit-learn

Zain Ali Babar

Overview

A Support Vector Machine (SVM) is a simple supervised machine learning algorithm that is used for both classification and regression problems.

Note: You can learn more about SVMs here.

In this shot, we will implement an SVM classifier using the scikit-learn library.

We will use the digits dataset that ships with scikit-learn to train the SVM classifier. We split the data into train and test sets (a 70-30 split) so that we can measure how well the classifier generalizes to data it has not seen during training.

The trained model uses its learned parameters to classify each sample into one of ten classes, that is, the digits 0 to 9.
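
Before training anything, it can help to look at what the dataset and the split actually contain. The following is a minimal sketch, not part of the original example: it loads the same digits dataset (each sample is an 8x8 image flattened into 64 features), lists the class labels, and checks the sizes produced by the 70-30 split. The variable names `n_samples`, `n_features`, `classes`, and `test_fraction` are introduced here only for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

# Load the digits dataset: one row per image, one column per pixel
dataset = load_digits()
n_samples, n_features = dataset.data.shape

# Collect the distinct class labels found in the targets
classes = sorted(set(dataset.target.tolist()))

# Same split settings as the main example: 30% of the samples are held out
x_train, x_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.30, random_state=4
)
test_fraction = len(x_test) / n_samples

print("Samples:", n_samples, "Features per sample:", n_features)
print("Classes:", classes)
print("Train size:", len(x_train), "Test size:", len(x_test))
print("Fraction held out for testing:", round(test_fraction, 3))
```

Because `random_state` is fixed, the same samples land in the train and test sets on every run, which keeps the reported accuracy reproducible.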

Code example

# Importing the necessary libraries 
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score 

# Importing the dataset from the sklearn library into a local variable called dataset
dataset = load_digits()

# Splitting the dataset into train (70%) and test (30%) sets.
# x_train, y_train are training data and labels respectively 
# x_test, y_test are testing data and labels respectively 
x_train, x_test, y_train, y_test = train_test_split(dataset.data, dataset.target, test_size=0.30, random_state=4)

# Making the SVM classifier
Classifier = SVC(kernel="linear")

# Training the model on the training data and labels
Classifier.fit(x_train, y_train)

# Using the model to predict the labels of the test data
y_pred = Classifier.predict(x_test)

# Evaluating the accuracy of the model using the sklearn functions
accuracy = accuracy_score(y_test,y_pred)*100
confusion_mat = confusion_matrix(y_test,y_pred)

# Printing the results
print("Accuracy for SVM is:",accuracy)
print("Confusion Matrix")
print(confusion_mat)
Implementing an SVM classifier

Explanation

  • Line 3: We import the load_digits function, which loads the digits dataset, from the sklearn library.
  • Line 4: We import the train_test_split function from sklearn to split the data into train and test samples.
  • Line 5: We import the SVC classifier from sklearn.
  • Line 6: We import the confusion_matrix and accuracy_score functions from sklearn.
  • Line 9: We load the dataset into a local variable called dataset.
  • Line 14: We split the data into train and test datasets. We use a 70-30 split, where 70% of the data is used for training and 30% for testing. x_train and y_train contain the training data and labels respectively, while x_test and y_test contain the testing data and labels.
  • Line 17: We define an SVM classifier called Classifier using a linear kernel.
  • Line 20: We train the model on the training data and labels.
  • Line 23: We use the trained model to predict the labels of the test data.
  • Line 26: We use the accuracy_score function to compare the predicted labels with the true test labels. We multiply the result by 100 to express the accuracy as a percentage.
  • Line 27: We use the predicted labels to find the confusion matrix.
  • Lines 30 to 32: We print the evaluation scores for the model.
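
To make the prediction step above more concrete, here is a minimal sketch that trains the same linear SVM and compares its predictions with the true labels for a handful of test samples. It is an illustration rather than part of the original example: the lowercase name `classifier` and the helper variables `first_five`, `predicted`, and `actual` are assumptions introduced here.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Same data and split settings as the main example
dataset = load_digits()
x_train, x_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.30, random_state=4
)

# Train a linear-kernel SVM on the training split
classifier = SVC(kernel="linear")
classifier.fit(x_train, y_train)

# predict expects a 2D array of shape (n_samples, n_features),
# so we take a slice of rows rather than indexing a single row
first_five = x_test[:5]
predicted = classifier.predict(first_five)
actual = y_test[:5]

# Show each prediction next to the true label
for p, a in zip(predicted, actual):
    print(f"predicted: {p}, actual: {a}")
```

Passing a slice such as `x_test[:1]` is also the simplest way to classify a single sample, since it keeps the input two-dimensional.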

Model performance

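The SVM classifier we defined above reaches about 98% accuracy on the test split of the digits dataset. In the confusion matrix, each row corresponds to a true digit and each column to a predicted digit, so the entries on the diagonal count correct predictions and the off-diagonal entries count misclassifications. A matrix dominated by its diagonal indicates that the model performs well on every class, not just on average.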
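
The confusion matrix can also be turned into a per-digit score. The sketch below is one possible way to do this, not something the original example computes: it divides each diagonal entry by its row total to get the fraction of each true digit that was classified correctly (the per-class recall). The names `model`, `correct_per_class`, `samples_per_class`, `per_class_recall`, and `overall_accuracy` are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Same data, split, and model settings as the main example
dataset = load_digits()
x_train, x_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.30, random_state=4
)
model = SVC(kernel="linear").fit(x_train, y_train)

# Rows are true labels, columns are predicted labels
confusion_mat = confusion_matrix(y_test, model.predict(x_test))

# Diagonal entries are correct predictions; row sums are samples per true class
correct_per_class = np.diag(confusion_mat)
samples_per_class = confusion_mat.sum(axis=1)
per_class_recall = correct_per_class / samples_per_class

# Summing the diagonal and dividing by the total recovers the overall accuracy
overall_accuracy = correct_per_class.sum() / confusion_mat.sum()

for digit, recall in enumerate(per_class_recall):
    print(f"digit {digit}: {recall:.2%} of test samples classified correctly")
print(f"overall accuracy: {overall_accuracy:.2%}")
```

A digit with noticeably lower recall than the others is a good place to start looking at which classes the model confuses with each other.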
