What is the f1_score function in Sklearn?
Overview
In Python, the f1_score function of the sklearn.metrics module calculates the F1 score by comparing a set of predicted labels against the true labels.
The F1 score is the harmonic mean of precision and recall, as shown below:
F1_score = 2 * (precision * recall) / (precision + recall)
An F1 score can range between 0 and 1, with 0 being the worst score and 1 being the best.
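As a quick illustration of the formula, the sketch below computes precision, recall, and F1 by hand in plain Python. The counts tp, fp, and fn are made-up values chosen only for this example; they do not come from any dataset in this article.

```python
# Hypothetical counts for the positive class (illustrative values only).
tp, fp, fn = 6, 2, 4  # true positives, false positives, false negatives

precision = tp / (tp + fp)  # 6 / 8  = 0.75
recall = tp / (tp + fn)     # 6 / 10 = 0.6

# Harmonic mean of precision and recall, as in the formula above.
f1 = 2 * (precision * recall) / (precision + recall)

# The same quantity written directly in terms of the counts.
f1_from_counts = 2 * tp / (2 * tp + fp + fn)

print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, the F1 score here (about 0.667) is slightly lower than the arithmetic mean of precision and recall (0.675).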
To use the f1_score function, we’ll import it into our program, as shown below:
from sklearn.metrics import f1_score
Syntax
sklearn.metrics.f1_score(y_true, y_pred, *, labels=None, pos_label=1, average='binary', sample_weight=None, zero_division='warn')
Parameters
The f1_score function accepts the following parameters:
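- y_true: These are the true labels.
- y_pred: These are the predicted labels.
- labels: This parameter identifies the labels to be included when there is a multiclass problem.
- pos_label: This is the class to report in case of a binary classification problem (used when average='binary').
- average: This is the type of averaging to be performed in the case of multiclass data, for example 'micro', 'macro', or 'weighted'. If it is set to None, the score of each class is returned.
- sample_weight: These are any sample weights to be used in the calculation of the F1 score.
- zero_division: This is the value to return when the score is undefined because of a division by zero. The default, 'warn', returns 0 and raises a warning.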
Note: A comprehensive list of parameters and their possible values is available in the official scikit-learn documentation for f1_score.
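The following is a minimal sketch of how pos_label and sample_weight affect the result on a binary problem. The "spam"/"ham" labels and the weights are hypothetical values invented for this illustration.

```python
from sklearn.metrics import f1_score

# Hypothetical binary data with string labels (illustrative only).
y_true = ["spam", "ham", "spam", "spam", "ham"]
y_pred = ["spam", "spam", "spam", "ham", "ham"]

# With string labels, pos_label must name the class treated as positive,
# because the default pos_label=1 is not one of the labels.
spam_f1 = f1_score(y_true, y_pred, pos_label="spam")
ham_f1 = f1_score(y_true, y_pred, pos_label="ham")

# sample_weight makes some samples count more than others.
# Here the fourth sample (a missed "spam") counts three times.
weights = [1, 1, 1, 3, 1]
weighted_spam_f1 = f1_score(
    y_true, y_pred, pos_label="spam", sample_weight=weights
)

print("F1 for spam:", spam_f1)
print("F1 for ham:", ham_f1)
print("F1 for spam with sample weights:", weighted_spam_f1)
```

Giving extra weight to a misclassified sample lowers the score for that class, which can be useful when some errors are more costly than others.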
Return value
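This function returns a single float: the F1 score of the positive class for binary classification problems, or the average of the per-class F1 scores, computed according to the average parameter, for multiclass problems. If average=None, it returns an array with one F1 score per class.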
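The sketch below illustrates the different return values on small, made-up label lists. The variable names are chosen for this illustration only.

```python
from sklearn.metrics import f1_score

# Binary problem: the default average='binary' returns one float
# for the positive class (pos_label=1).
binary_score = f1_score([0, 1, 1, 0], [0, 1, 0, 0])

# Multiclass problem with three classes: 0, 1, and 2.
y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

# average=None returns an array with one score per class,
# ordered by sorted label.
per_class = f1_score(y_true, y_pred, average=None)

# Any other averaging mode collapses the per-class scores to one float.
macro = f1_score(y_true, y_pred, average="macro")

# The default average='binary' is rejected for multiclass targets.
raised = False
try:
    f1_score(y_true, y_pred)
except ValueError:
    raised = True

print("Binary:", binary_score)
print("Per class:", per_class)
print("Macro:", macro)
print("Default average raised ValueError on multiclass data:", raised)
```

For multiclass data, the average parameter therefore has to be set explicitly.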
Example
from sklearn.metrics import f1_score

# define true labels
true_labels = ["a", "c", "b", "a"]

# define corresponding predicted labels
pred_labels = ["c", "c", "b", "a"]

# find f1 scores for different weighted averages

score = f1_score(true_labels, pred_labels, average="macro")
print("Macro F1-Score: ", score)

score = f1_score(true_labels, pred_labels, average="micro")
print("Micro F1-Score: ", score)

score = f1_score(true_labels, pred_labels, average="weighted")
print("Weighted F1-Score: ", score)
Explanation
- Line 1: We import the f1_score function from the sklearn.metrics module.
- Lines 4–7: We define the true labels and predicted labels. As there are 3 classes (a, b, c), this is a multiclass problem.
- Line 11: We calculate the macro-averaged F1 score, which is the unweighted mean of the per-class F1 scores, through the f1_score function. The calculated score is then printed.
- Line 14: We calculate the micro-averaged F1 score, which is computed from the total true positives, false positives, and false negatives across all classes, through the f1_score function. The calculated score is then printed.
- Line 17: We calculate the weighted-average F1 score, in which each per-class F1 score is weighted by the number of true instances of that class, through the f1_score function. The calculated score is then printed.
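To see where the three averages come from, the sketch below recomputes them by hand in plain Python for the same labels used in the example. The helper functions counts and f1 are hypothetical names written for this illustration and are not part of scikit-learn. Returning 0.0 for an undefined score is an assumption that mirrors the value scikit-learn uses with its default zero_division setting.

```python
true_labels = ["a", "c", "b", "a"]
pred_labels = ["c", "c", "b", "a"]
classes = sorted(set(true_labels) | set(pred_labels))
pairs = list(zip(true_labels, pred_labels))


def counts(cls):
    """Return (tp, fp, fn) for one class, treating it as the positive class."""
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    return tp, fp, fn


def f1(tp, fp, fn):
    """F1 from counts; 0.0 when undefined (assumption, see lead-in)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


per_class = {c: f1(*counts(c)) for c in classes}
support = {c: true_labels.count(c) for c in classes}

# Macro: unweighted mean of the per-class scores.
macro = sum(per_class.values()) / len(classes)

# Weighted: per-class scores weighted by the number of true instances.
weighted = sum(per_class[c] * support[c] for c in classes) / len(true_labels)

# Micro: pool the counts over all classes, then compute a single F1.
tp_all, fp_all, fn_all = (sum(x) for x in zip(*(counts(c) for c in classes)))
micro = f1(tp_all, fp_all, fn_all)

print("Per-class F1:", per_class)
print("Macro:", macro, "Micro:", micro, "Weighted:", weighted)
```

Classes a and c each score about 0.667 and class b scores 1.0, so the macro average is about 0.778, while the micro and weighted averages both come to 0.75 for this data.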