Trusted answers to developer questions


How to implement SVM in Python using Scikit-learn

Zain Ali Babar


Support Vector Machine (SVM) is a simple, supervised machine learning algorithm. SVMs are used for both classification and regression problems.
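As a rough illustration of the two use cases, the sketch below fits scikit-learn's SVC (classification) and SVR (regression) on a tiny made-up dataset. The toy values and variable names are assumptions chosen only for this example.

```python
from sklearn.svm import SVC, SVR

# Made-up one-dimensional toy data, for illustration only
X = [[0.0], [1.0], [2.0], [3.0]]
y_class = [0, 0, 1, 1]          # discrete labels -> classification
y_value = [0.0, 2.0, 4.0, 6.0]  # continuous targets -> regression

classifier = SVC(kernel="linear").fit(X, y_class)
regressor = SVR(kernel="linear").fit(X, y_value)

print(classifier.predict([[0.2], [2.8]]))  # predicted class labels
print(regressor.predict([[0.2], [2.8]]))   # predicted real values
```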

Note: You can learn more about SVMs here.

In this shot, we will implement an SVM classifier using the Scikit-learn toolkit.

We will use the digits dataset that ships with scikit-learn to train the SVM classifier. We split the data into training and test sets (a 70-30 split) so that we can check how well the classifier generalizes to unseen data.

The trained model classifies each input image into one of ten classes, the digits 0 to 9.
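Before training, it can help to look at the data itself. The following is a minimal sketch that inspects the dataset and the 70-30 split. The names X and y are introduced here for illustration, and random_state=4 simply makes the split reproducible.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

dataset = load_digits()
X, y = dataset.data, dataset.target  # 8x8 images flattened into 64 features

print("Samples, features:", X.shape)
print("Classes:", np.unique(y))

# Hold out 30% of the samples for testing
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=4
)
print("Train size:", len(x_train), "Test size:", len(x_test))
```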

Code example

# Importing the necessary libraries
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score

# Importing the dataset from the sklearn library into a local variable called dataset
dataset = load_digits()

# Splitting the dataset into train (70%) and test (30%) sets.
# x_train, y_train are training data and labels respectively
# x_test, y_test are testing data and labels respectively
x_train, x_test, y_train, y_test = train_test_split(dataset.data, dataset.target, test_size=0.30, random_state=4)

# Making the SVM Classifier
Classifier = SVC(kernel="linear")

# Training the model on the training data and labels
Classifier.fit(x_train, y_train)

# Using the model to predict the labels of the test data
y_pred = Classifier.predict(x_test)

# Evaluating the accuracy of the model using the sklearn functions
accuracy = accuracy_score(y_test,y_pred)*100
confusion_mat = confusion_matrix(y_test,y_pred)

# Printing the results
print("Accuracy for SVM is:",accuracy)
print("Confusion Matrix")
print(confusion_mat)
Implementing an SVM classifier


  • Line 3: We import the load_digits function, which loads the digits dataset, from the sklearn library.
  • Line 4: We import the train_test_split function from sklearn to split the data into train and test samples.
  • Line 5: We import the SVC classifier from sklearn.
  • Line 6: We import the confusion_matrix and accuracy_score functions provided by sklearn.
  • Line 9: We load the dataset into a local variable called dataset.
  • Line 14: We split the data into train and test datasets. We use a 70-30 split, where 70% of the data is used for training and 30% for testing. x_train and y_train contain the training data and labels respectively, while x_test and y_test contain the testing data and labels.
  • Line 17: We define an SVM classifier called Classifier using a linear kernel.
  • Line 20: We train the model on the training data and labels.
  • Line 23: We use the trained model to predict the labels of the test data.
  • Line 26: We use the accuracy_score function to compare the predicted labels with the true test labels. We multiply the result by 100 to express the accuracy as a percentage.
  • Line 27: We use the predicted labels to find the confusion matrix.
  • Lines 30 to 32: We print the evaluation scores for the model.
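To make the two evaluation functions from the steps above more concrete, here is a small sketch on made-up labels (not taken from the digits dataset). The arrays y_true and y_hat are hypothetical and exist only to show what accuracy_score and confusion_matrix compute.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up labels for illustration only (not from the digits dataset)
y_true = np.array([0, 1, 2, 2, 1])
y_hat = np.array([0, 1, 2, 1, 1])

# accuracy_score returns the fraction of predictions that match the true labels
accuracy = accuracy_score(y_true, y_hat)
manual_accuracy = np.mean(y_true == y_hat)
print(accuracy * 100, manual_accuracy * 100)

# Entry [i, j] counts samples whose true label is i and predicted label is j
print(confusion_matrix(y_true, y_hat))
```

Four of the five made-up predictions match, so both accuracy values correspond to 80%, and the single mistake (a true 2 predicted as 1) appears as the only off-diagonal count.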

Model performance

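The SVM classifier we defined above reaches about 98% accuracy on the held-out test set of the digits dataset. In the confusion matrix, nearly all test samples fall on the main diagonal (correct predictions), with only a few off-diagonal misclassifications.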
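One possible way to read the confusion matrix in more detail is sketched below. It retrains the same linear-kernel model so the block is self-contained. The names model, cm, per_class_recall, and errors are introduced here for illustration and are not part of the original listing.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

dataset = load_digits()
x_train, x_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.30, random_state=4
)

model = SVC(kernel="linear").fit(x_train, y_train)
y_pred = model.predict(x_test)

# Rows are true digits, columns are predicted digits
cm = confusion_matrix(y_test, y_pred)

# Overall accuracy: correct predictions (the diagonal) over all test samples
accuracy_from_cm = cm.trace() / cm.sum()
print(f"Accuracy from confusion matrix: {accuracy_from_cm:.3f}")

# Per-class recall: correct predictions for a digit / all true samples of that digit
per_class_recall = cm.diagonal() / cm.sum(axis=1)
for digit, recall in enumerate(per_class_recall):
    print(f"digit {digit}: recall = {recall:.3f}")

# Off-diagonal entries are the misclassifications
errors = cm.copy()
np.fill_diagonal(errors, 0)
if errors.max() > 0:
    true_digit, predicted_digit = np.unravel_index(errors.argmax(), errors.shape)
    print("Most frequent confusion:", true_digit, "predicted as", predicted_digit)
```

Per-class recall shows whether the errors are spread evenly or concentrated on a few digits, which a single accuracy figure hides.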

