Mean absolute error in sklearn

Scikit-learn is a Python library for machine learning tasks such as classification, regression, clustering, and model selection.

Scikit-learn model training

When we train a model we've built from scratch, it is crucial to measure how accurate it is. Various metrics can quantify how well a model has been trained. In this Answer, we cover one such metric: the mean absolute error.

Model validation

We can confirm the accuracy of a trained model by passing validation data to it and observing the results. Here, we compare our predicted and actual values for the data, and this is precisely where the mean absolute error comes in handy.
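The validation step described above can be sketched as follows. This is a minimal illustration rather than a prescribed workflow: the synthetic data, the choice of `LinearRegression`, the 80/20 split, and the variable names are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): y is roughly 3x + 2 plus noise.
rng = np.random.default_rng(seed=0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X[:, 0] + 2 + rng.normal(0, 1, size=100)

# Hold out 20% of the data for validation.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Train on the training split only.
model = LinearRegression()
model.fit(X_train, y_train)

# Pass the validation data to the trained model and compare
# its predictions with the actual validation targets.
val_predictions = model.predict(X_val)
val_mae = mean_absolute_error(y_val, val_predictions)
print("Validation MAE =", val_mae)
```

Because the validation rows were not seen during training, the resulting MAE gives a fairer picture of how the model behaves on new data.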

Mean absolute error

Mean absolute error, abbreviated as MAE, is a metric that measures the average absolute difference between the predicted and actual values in a regression problem, i.e., a problem that models the relationship between variables. In deep learning, MAE is also used as a loss function that quantifies how well the model's predictions match the actual values during model training.

Simply put, MAE tells us how far off, on average, our predictions are from the actual values in the dataset. It helps us understand the accuracy of our model by measuring the absolute errors between predicted and true values.

Interpreting our results

A lower MAE means our model's predictions are closer to the actual values. This indicates better performance.
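To make this comparison concrete, the sketch below scores two sets of predictions against the same actual values. The helper `mean_absolute_error_manual`, the two prediction lists, and all the numbers are hypothetical and chosen only for illustration.

```python
def mean_absolute_error_manual(actual, predicted):
    """Average of the absolute differences between paired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical actual values and predictions from two hypothetical models.
actual_values = [25, 30, 20, 35, 41]
model_a_predictions = [23, 28, 19, 33, 38]
model_b_predictions = [20, 36, 25, 30, 47]

mae_a = mean_absolute_error_manual(actual_values, model_a_predictions)
mae_b = mean_absolute_error_manual(actual_values, model_b_predictions)

# The model with the lower MAE is, on average, closer to the actual values.
better_model = "model_a" if mae_a < mae_b else "model_b"

print("MAE of model A =", mae_a)
print("MAE of model B =", mae_b)
print("Closer to the actual values:", better_model)
```

Here model A's predictions are off by 2 on average and model B's by more than 5, so model A performs better on this data.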

Mathematical representation

To calculate MAE, we take the absolute difference between each predicted value and its corresponding actual value. Then, we add up all these absolute differences and divide the sum by the total number of data points.

Formula

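$\mathrm{MAE} = \frac{\sum_{i=1}^{n} |y_i - x_i|}{n}$

Here, $y_i$ and $x_i$ are the predicted and actual values for the $i$-th data point, and $n$ is the total number of data points.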
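As a worked instance of this formula, take the actual values 25, 30, 20, 35, 41 and the predictions 23, 28, 19, 33, 38 (the same sample data used in the code under Mechanism):

$\frac{|23-25| + |28-30| + |19-20| + |33-35| + |38-41|}{5} = \frac{2 + 2 + 1 + 2 + 3}{5} = \frac{10}{5} = 2$

On average, each prediction is off by 2 units.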

Mechanism

The code below calculates the mean absolute error manually.


# Sample data: actual targets and the model's predictions
actual_values = [25, 30, 20, 35, 41]
predicted_values = [23, 28, 19, 33, 38]

# Number of data points
n = len(actual_values)

# Absolute difference for each (actual, predicted) pair
absolute_differences = [abs(a - p) for a, p in zip(actual_values, predicted_values)]

# MAE = sum of the absolute differences / number of data points
sum_of_absolute_differences = sum(absolute_differences)
mae = sum_of_absolute_differences / n

print("Actual values =", actual_values)
print("Predicted values =", predicted_values)
print("Absolute differences =", absolute_differences)
print("Sum of absolute differences =", sum_of_absolute_differences)
print("The MAE =", mae)

First, we define sample data in actual_values and predicted_values.
Next, we calculate the number of data points n in our dataset.
We then calculate the absolute difference between the actual and predicted value for each data point using a list comprehension. abs() returns the absolute value of each difference.
Then, we add up these differences using sum().
Finally, we calculate the MAE by dividing the sum by the number of data points n.
We print each variable so the intermediate results are easy to follow.
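The same result can be cross-checked against scikit-learn's `mean_absolute_error` function from `sklearn.metrics`, which takes the true values first and the predictions second. The snippet below is a minimal sketch that reuses the sample data above; the names `manual_mae` and `sklearn_mae` are chosen for illustration.

```python
from sklearn.metrics import mean_absolute_error

actual_values = [25, 30, 20, 35, 41]
predicted_values = [23, 28, 19, 33, 38]

# Manual calculation, as in the steps above.
manual_mae = sum(
    abs(a - p) for a, p in zip(actual_values, predicted_values)
) / len(actual_values)

# Library calculation: mean_absolute_error(y_true, y_pred).
sklearn_mae = mean_absolute_error(actual_values, predicted_values)

print("Manual MAE =", manual_mae)
print("scikit-learn MAE =", sklearn_mae)
```

Both approaches should report the same MAE for this data, so the library call can replace the manual loop in practice.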