How to create a confusion matrix in Python using scikit-learn

A confusion matrix is a tabular summary of the number of correct and incorrect predictions made by a classifier. It can be used to evaluate the performance of a classification model through the calculation of performance metrics like accuracy, precision, recall, and F1-score.
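To make the link between the matrix and these metrics concrete, here is a minimal sketch that derives them by hand with NumPy. The matrix values are illustrative, and the sketch assumes that rows hold actual labels, columns hold predicted labels, and every class is both present and predicted at least once (so no division by zero occurs).

```python
import numpy as np

# Illustrative 3-class confusion matrix (values chosen for this sketch).
# Rows are actual labels, columns are predicted labels.
cm = np.array([
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 1],
])

# Correct predictions for each class sit on the diagonal.
true_positives = np.diag(cm)

# Accuracy: correct predictions divided by all predictions.
accuracy = true_positives.sum() / cm.sum()

# Precision: correct predictions divided by the column total
# (everything predicted as that class).
precision = true_positives / cm.sum(axis=0)

# Recall: correct predictions divided by the row total
# (everything that actually belongs to that class).
recall = true_positives / cm.sum(axis=1)

# F1-score: harmonic mean of precision and recall, per class.
f1 = 2 * precision * recall / (precision + recall)

print("accuracy: ", accuracy)
print("precision:", precision)
print("recall:   ", recall)
print("f1-score: ", f1)
```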

Suppose that a classifier produces a list of predicted labels, and the actual label of each instance is known. The example below compares the two.

Code

The following code snippet shows how to create a confusion matrix and calculate some important metrics using a Python library called scikit-learn (also known as sklearn):

# Importing the dependencies
from sklearn import metrics
# Predicted values
y_pred = ["a", "b", "c", "a", "b"]
# Actual values
y_act = ["a", "b", "c", "c", "a"]
# Printing the confusion matrix
# The rows show the actual instances of each label,
# and the columns show the instances predicted for each label.
print(metrics.confusion_matrix(y_act, y_pred, labels=["a", "b", "c"]))
# Printing the precision and recall, among other metrics
print(metrics.classification_report(y_act, y_pred, labels=["a", "b", "c"]))

Explanation

y_pred is a list that holds the predicted labels. y_act contains the actual labels.

metrics.confusion_matrix() takes in the list of actual labels, the list of predicted labels, and an optional labels argument that specifies the order of the labels (or selects a subset of them). It returns the confusion matrix for the given inputs, with one row per actual label and one column per predicted label.
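As a small sketch of how the labels argument affects the result, the snippet below reuses the same y_act and y_pred lists and builds the matrix three ways. The variable names cm_abc, cm_cba, and cm_ab are made up for this illustration.

```python
from sklearn.metrics import confusion_matrix

# Same data as in the example above.
y_act = ["a", "b", "c", "c", "a"]
y_pred = ["a", "b", "c", "a", "b"]

# Rows and columns follow the order given in `labels`.
cm_abc = confusion_matrix(y_act, y_pred, labels=["a", "b", "c"])

# Reversing the label order reverses both the rows and the columns.
cm_cba = confusion_matrix(y_act, y_pred, labels=["c", "b", "a"])

# Passing a subset keeps only instances whose actual and
# predicted labels are both in the subset.
cm_ab = confusion_matrix(y_act, y_pred, labels=["a", "b"])

print(cm_abc)
print(cm_cba)
print(cm_ab)
```

Because the matrix itself carries no label names, it helps to pass labels explicitly whenever the row and column order matters.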

metrics.classification_report() takes in the list of actual labels, the list of predicted labels, and an optional labels argument that specifies which labels to include and in what order. It returns a text report with the precision, recall, and F1-score of each label, along with its support (the number of actual instances of that label).
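If the metrics are needed as numbers rather than as printed text, one option is the output_dict parameter of classification_report, which returns a nested dictionary instead of a string. The following is a minimal sketch on the same data; the loop and the variable names report and row are only for illustration.

```python
from sklearn.metrics import classification_report

# Same data as in the example above.
y_act = ["a", "b", "c", "c", "a"]
y_pred = ["a", "b", "c", "a", "b"]

# output_dict=True returns a dictionary keyed by label
# instead of a formatted string.
report = classification_report(
    y_act, y_pred, labels=["a", "b", "c"], output_dict=True
)

# Each label maps to its precision, recall, f1-score, and support.
for label in ["a", "b", "c"]:
    row = report[label]
    print(label, row["precision"], row["recall"], row["f1-score"], row["support"])
```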

Copyright ©2024 Educative, Inc. All rights reserved