Model Evaluation Part 2

In this lesson, we look into some advanced evaluation measures for classification models that also help in troubleshooting the model’s performance.

We'll cover the following...

Balanced accuracy
ROC Curve
PR Curve

Balanced accuracy

This measure is intended for datasets where the class distribution is skewed, i.e., one class label (e.g., 1) has more instances than the other (e.g., 0). The class label with more instances is called the Majority Class, and the class label with fewer instances is called the Minority Class.

A model trained on an imbalanced dataset and evaluated only with the Accuracy measure cannot be trusted.
Such a model tends to predict the majority class all the time. If, for example, 98% of the instances belong to the majority class, the accuracy comes out to be 98%, which seems good but is misleading: the model has not learned the minority class at all.
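The effect can be reproduced with a minimal sketch. The 98/2 class split, the label values, and the helper names `accuracy` and `recall` are assumptions made for illustration; they are not part of the lesson.

```python
# Hypothetical toy dataset: 98 majority-class (1) and 2 minority-class (0) labels.
y_true = [1] * 98 + [0] * 2

# A "model" that ignores its input and always predicts the majority class.
y_pred = [1] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, label):
    """Fraction of the instances of `label` that were predicted as `label`."""
    total = sum(t == label for t in y_true)
    hits = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    return hits / total


acc = accuracy(y_true, y_pred)
minority_recall = recall(y_true, y_pred, label=0)

print(f"accuracy        = {acc:.2f}")              # high, looks good
print(f"minority recall = {minority_recall:.2f}")  # zero, minority class never found
```

The accuracy is high only because the majority class dominates the count. The recall on the minority class shows that the model never identifies a single minority instance.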

To evaluate such models, other evaluation measures come to the rescue, and Balanced Accuracy is one of them. Balanced Accuracy is accuracy in which each instance is weighted by the inverse prevalence of its true class. It is computed by averaging the recall obtained on each class. For balanced datasets, this score equals the ordinary Accuracy score.

For binary classification, balanced accuracy is given as:

Balanced\_Accuracy = \frac{1}{2}\left(\frac{TP}{TP + FN} + \frac{TN}{TN + FP}\right)
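As a rough illustration, the binary formula can be computed directly from the confusion-matrix counts. The sketch below uses plain Python. The helper names `confusion_counts` and `balanced_accuracy`, the 98/2 toy labels, and the two sets of predictions are hypothetical and chosen only for this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FN, TN, FP for a binary problem."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def balanced_accuracy(tp, fn, tn, fp):
    """Average of the recall on the positive and on the negative class."""
    recall_positive = tp / (tp + fn)
    recall_negative = tn / (tn + fp)
    return 0.5 * (recall_positive + recall_negative)


# Hypothetical imbalanced labels: 98 instances of class 1, 2 of class 0.
y_true = [1] * 98 + [0] * 2

# Predictions that always pick the majority class.
always_majority = [1] * 100
score_majority = balanced_accuracy(*confusion_counts(y_true, always_majority))

# Predictions that recover 90 of the 98 positives and 1 of the 2 negatives.
some_model = [1] * 90 + [0] * 8 + [0] * 1 + [1] * 1
score_model = balanced_accuracy(*confusion_counts(y_true, some_model))

print(f"always-majority balanced accuracy = {score_majority:.3f}")
print(f"some-model balanced accuracy      = {score_model:.3f}")
```

The always-majority predictions have 98% plain accuracy but score only 0.5 here, because the recall on the minority class is zero. In practice, scikit-learn provides `sklearn.metrics.balanced_accuracy_score(y_true, y_pred)`, which computes the same quantity when that library is available.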
