# Classification Accuracy

Learn how to assess the quality of the prediction of the binary classification model.

## Binary classification metrics with logistic regression and near-default options

Now we proceed to fit an example model to illustrate binary classification metrics. We will continue to use logistic regression with near-default options. The following code loads the model class and creates a model object.

```
from sklearn.linear_model import LogisticRegression

example_lr = LogisticRegression(C=0.1, class_weight=None,
                                dual=False, fit_intercept=True,
                                intercept_scaling=1, max_iter=100,
                                multi_class='auto', n_jobs=None,
                                penalty='l2', random_state=None,
                                solver='liblinear', tol=0.0001,
                                verbose=0, warm_start=False)
```
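
Most of the options spelled out above are in fact scikit-learn's defaults (assuming a recent version, 0.22 or later, where the default solver is `'lbfgs'` and the default `C` is `1.0`). A sketch of an equivalent, more concise construction that passes only the two settings that differ from the defaults:

```python
from sklearn.linear_model import LogisticRegression

# Only C and solver differ from the defaults in recent scikit-learn
# versions; every other keyword above restates a default value.
concise_lr = LogisticRegression(C=0.1, solver='liblinear')
```

Spelling out every option, as the longer version does, can still be useful for making the model configuration explicit and reproducible.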

Now we train the model, as you might imagine, using the labeled data from our training set. We then immediately use the trained model to make predictions on the features of the samples from the held-out test set:

```
example_lr.fit(X_train, y_train)
```

```
# LogisticRegression(C=0.1, solver='liblinear')
```

```
y_pred = example_lr.predict(X_test)
```

## Understanding the limitations of accuracy

We've stored the model-predicted labels of the test set in a variable called `y_pred`. How should we now assess the quality of these predictions? We have the true labels in the `y_test` variable. First, we will compute what is probably the simplest of all binary classification metrics: **accuracy**. Accuracy is defined as the proportion of samples that were correctly classified.

One way to calculate accuracy is to create a logical mask that is `True` whenever the predicted label is equal to the actual label, and `False` otherwise. We can then take the average of this mask, which will interpret `True` as `1` and `False` as `0`, giving us the proportion of correct classifications:

```
import numpy as np

is_correct = y_pred == y_test
np.mean(is_correct)
```

```
# 0.7834239639977498
```
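
The boolean-to-number coercion behind this average can be sketched in plain Python with a small hypothetical mask:

```python
# Booleans behave as 1 and 0 in arithmetic, so averaging a
# correctness mask yields the fraction of correct predictions.
mask = [True, True, False, True]  # e.g. 3 of 4 predictions correct
accuracy = sum(mask) / len(mask)
print(accuracy)  # 0.75
```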

This indicates that the model is correct about 78% of the time. While this is a pretty straightforward calculation, there are actually easier ways to calculate accuracy using the conveniences of scikit-learn. One way is to use the trained model's `.score` method, passing the features of the test data to make predictions on, as well as the test labels. This method makes the predictions and then does the same calculation we performed previously, all in one step. Alternatively, we could import scikit-learn's `metrics` library, which includes many model performance metrics, such as `accuracy_score`. For this, we pass the true labels and the predicted labels:

```
example_lr.score(X_test, y_test)
```

```
# 0.7834239639977498
```

```
from sklearn import metrics
metrics.accuracy_score(y_test, y_pred)
```

```
# 0.7834239639977498
```
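
Accuracy has an important limitation, however: on imbalanced data it can look deceptively good. A minimal sketch with hypothetical labels, where a trivial "model" that always predicts the majority class and learns nothing still scores 90%:

```python
# Hypothetical imbalanced labels: 90% negative class, 10% positive.
y_true = [0] * 90 + [1] * 10

# A trivial baseline that always predicts the majority class.
y_majority = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_majority)) / len(y_true)
print(accuracy)  # 0.9 despite never identifying a positive sample
```

This is why accuracy alone is rarely enough, and why the metrics that follow in this course examine performance on each class separately.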
