Trusted answers to developer questions

Related Tags

python
sklearn
confusion matrix
data mining

How to create a confusion matrix in Python using scikit-learn

Educative Answers Team

A confusion matrix is a tabular summary of the number of correct and incorrect predictions made by a classifier. It can be used to evaluate the performance of a classification model through the calculation of performance metrics like accuracy, precision, recall, and F1-score.
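To make the idea of a "tabular summary" concrete, here is a minimal sketch that builds a confusion matrix by hand, without scikit-learn. The labels `spam` and `ham` and the two lists are hypothetical values chosen only for illustration; they are not part of the example used in the rest of this answer.

```python
from collections import Counter

# Hypothetical labels, chosen only for illustration.
actual = ["spam", "spam", "ham", "ham", "ham"]
predicted = ["spam", "ham", "ham", "ham", "spam"]
labels = ["spam", "ham"]

# Count how often each (actual, predicted) pair occurs.
pair_counts = Counter(zip(actual, predicted))

# Rows are actual labels, columns are predicted labels.
matrix = [[pair_counts[(a, p)] for p in labels] for a in labels]

# Correct predictions lie on the diagonal of the matrix.
correct = sum(matrix[i][i] for i in range(len(labels)))
accuracy = correct / len(actual)

for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", accuracy)
```

Each cell counts one combination of actual and predicted label, so the diagonal holds the correct predictions and every off-diagonal cell holds one kind of mistake.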

Suppose that a classifier produces the following results:

[Figure: the classifier's results]

Code

The following code snippet shows how to create a confusion matrix and calculate some important metrics using a Python library called scikit-learn (also known as sklearn):

# Importing the dependencies
from sklearn import metrics
# Predicted values
y_pred = ["a", "b", "c", "a", "b"]
# Actual values
y_act = ["a", "b", "c", "c", "a"]
# Printing the confusion matrix
# The rows show the actual instances of each label,
# and the columns show the instances predicted for each label.
print(metrics.confusion_matrix(y_act, y_pred, labels=["a", "b", "c"]))
# Printing the precision and recall, among other metrics
print(metrics.classification_report(y_act, y_pred, labels=["a", "b", "c"]))

Explanation

y_pred is a list that holds the predicted labels. y_act contains the actual labels.

metrics.confusion_matrix() takes in the list of actual labels, the list of predicted labels, and an optional argument to specify the order of the labels. It calculates the confusion matrix for the given inputs.
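As a sketch of how the metrics relate to the matrix itself, the snippet below reuses the same `y_act` and `y_pred` lists and derives per-label precision and recall from the array returned by `metrics.confusion_matrix()`. The use of NumPy and the variable names are assumptions made for illustration. The divisions assume that every label is predicted at least once and occurs at least once, which holds for this data.

```python
import numpy as np
from sklearn import metrics

y_act = ["a", "b", "c", "c", "a"]
y_pred = ["a", "b", "c", "a", "b"]
labels = ["a", "b", "c"]

# Rows are actual labels, columns are predicted labels.
cm = metrics.confusion_matrix(y_act, y_pred, labels=labels)
print(cm)
# [[1 1 0]
#  [0 1 0]
#  [1 0 1]]

# Correct predictions for each label sit on the diagonal.
true_positives = np.diag(cm)

# Precision divides by the column sums (everything predicted as that label).
precision = true_positives / cm.sum(axis=0)
# Recall divides by the row sums (everything that actually has that label).
recall = true_positives / cm.sum(axis=1)
# Accuracy is the share of all predictions that fall on the diagonal.
accuracy = true_positives.sum() / cm.sum()

for label, p, r in zip(labels, precision, recall):
    print(label, "precision:", p, "recall:", r)
print("accuracy:", accuracy)
```

For example, label "a" is predicted twice but only one of those predictions is correct, so its precision is 0.5.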

metrics.classification_report() takes in the list of actual labels, the list of predicted labels, and an optional argument to specify the order of the labels. It returns a text report of the precision, recall, and F1-score for each label, along with the support (the number of actual instances of that label).
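When the numbers are needed in code rather than as printed text, one option is the `output_dict=True` argument of `metrics.classification_report()`, which returns a nested dictionary instead of a string. The sketch below assumes a scikit-learn version that supports this argument and reuses the same example lists.

```python
from sklearn import metrics

y_act = ["a", "b", "c", "c", "a"]
y_pred = ["a", "b", "c", "a", "b"]
labels = ["a", "b", "c"]

# Ask for a dictionary keyed by label instead of a formatted string.
report = metrics.classification_report(
    y_act, y_pred, labels=labels, output_dict=True
)

# Each label maps to its precision, recall, f1-score, and support.
for label in labels:
    row = report[label]
    print(label, row["precision"], row["recall"], row["f1-score"], row["support"])

# Overall accuracy can be computed separately.
print("accuracy:", metrics.accuracy_score(y_act, y_pred))
```

The dictionary form makes it straightforward to log a single value, such as the recall for one label, without parsing the printed report.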

Copyright ©2022 Educative, Inc. All rights reserved