Mean absolute error in sklearn
Scikit-learn is a Python library used mainly for machine learning tasks, including classification, regression, clustering, and model selection.
Scikit-learn model training
When training a model we've built from scratch, it is crucial to check how accurate its predictions are. Various metrics can quantify how well a model has been trained. In this Answer, we cover one such metric: the mean absolute error.
Model validation
We can confirm the accuracy of a trained model by passing validation data to it and observing the results. Here, we compare our predicted and actual values for the data, and this is precisely where the mean absolute error comes in handy.
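As a rough sketch of this validation step, the example below holds out part of a small synthetic dataset, fits a model on the rest, and compares its predictions with the held-out values. The data, the 75/25 split, and the choice of LinearRegression are assumptions made purely for illustration.

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic, noise-free data following y = 2x + 1 (illustrative only)
X = [[x] for x in range(20)]
y = [2 * x + 1 for x in range(20)]

# Hold out 25% of the rows as validation data
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression()
model.fit(X_train, y_train)

# Compare predictions on the validation rows with their actual values
y_val_pred = model.predict(X_val)
val_mae = mean_absolute_error(y_val, y_val_pred)
print("Validation MAE =", val_mae)
```

Because this toy data is perfectly linear, the validation MAE should be very close to zero. Real data would normally give a larger value.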
Mean absolute error
Mean absolute error, abbreviated as MAE, is a metric that measures the average absolute difference between the predicted and actual values in a regression problem, i.e., a problem that models the relationship between variables. In deep learning, MAE is also used as a loss function.
Simply put, MAE tells us how much our predictions are off from the actual values in the dataset. It helps us understand the accuracy of our model by measuring the absolute errors between predicted and true values.
Interpreting our results
A lower MAE means our model's predictions are closer to the actual values. This indicates better performance.
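To illustrate, the sketch below compares two hypothetical sets of predictions against the same actual values. The helper mean_absolute_error_manual, the names model_a_predictions and model_b_predictions, and their numbers are invented for this example.

```python
def mean_absolute_error_manual(actual, predicted):
    """Average of the absolute differences between paired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual_values = [25, 30, 20, 35, 41]
model_a_predictions = [23, 28, 19, 33, 38]  # hypothetical model A
model_b_predictions = [20, 36, 15, 30, 50]  # hypothetical model B

mae_a = mean_absolute_error_manual(actual_values, model_a_predictions)
mae_b = mean_absolute_error_manual(actual_values, model_b_predictions)

print("MAE of model A =", mae_a)
print("MAE of model B =", mae_b)

# The model with the lower MAE is, on average, closer to the actual values
better = "A" if mae_a < mae_b else "B"
print("Closer on average: model", better)
```

Here model A is off by 2.0 on average and model B by 6.0, so model A fits these values better. MAE is expressed in the same units as the target variable.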
Mathematical representation
To calculate MAE, we take the absolute difference between each predicted value and its corresponding actual value. Then, we add up all these absolute differences and divide the sum by the total number of data points.
Formula
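Written out, the calculation described above corresponds to the standard definition of MAE. The symbols are notation chosen here: n is the number of data points, y_i is the i-th actual value, and \hat{y}_i is the i-th predicted value.

```latex
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|
```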
Mechanism
The manual calculation of the mean absolute error is represented by the code below.
actual_values = [25, 30, 20, 35, 41]
predicted_values = [23, 28, 19, 33, 38]

n = len(actual_values)
absolute_differences = [abs(actual_values[i] - predicted_values[i]) for i in range(n)]
sum_of_absolute_differences = sum(absolute_differences)
mae = sum_of_absolute_differences / n

print("Actual values = ", actual_values)
print("Predicted values = ", predicted_values)
print("Absolute differences = ", absolute_differences)
print("Sum of absolute difference = ", sum_of_absolute_differences)
print("The MAE = ", mae)
First, we define sample data in actual_values and predicted_values.
Next, we store the number of data points in our dataset in n.
We then calculate the absolute difference between the actual and predicted value for each data point using a list comprehension. abs() is used to get the absolute value of each difference.
Then, we add up these differences using sum().
Finally, we calculate the MAE by dividing the sum by the number of data points n.
We print our variables to make the steps easier to follow.
Code sample
from sklearn.metrics import mean_absolute_error

# Actual and predicted target values
y_true = [3.5, 2.1, 5.2, 7.8, 4.6]
y_pred = [3.0, 2.5, 4.8, 8.0, 5.2]

# Calculate the mean absolute error
mae = mean_absolute_error(y_true, y_pred)
print("MAE = ", mae)
Code explanation
Line 1: We import the mean_absolute_error function from the sklearn.metrics module.
Lines 4–5: We define the actual (y_true) and predicted (y_pred) target values.
Line 8: We call the pre-built mean_absolute_error function to calculate the MAE, which keeps the code compact.
Line 9: Finally, we print the calculated error.
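As a quick sanity check, the sketch below recomputes the MAE for the same y_true and y_pred values by hand and compares it with the value returned by mean_absolute_error. The variable names manual_mae and sklearn_mae are introduced only for this example.

```python
import math

from sklearn.metrics import mean_absolute_error

y_true = [3.5, 2.1, 5.2, 7.8, 4.6]
y_pred = [3.0, 2.5, 4.8, 8.0, 5.2]

# Manual calculation: mean of the absolute differences
manual_mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Library calculation
sklearn_mae = mean_absolute_error(y_true, y_pred)

print("Manual MAE  =", manual_mae)
print("sklearn MAE =", sklearn_mae)

# Compare with a tolerance, since floating-point rounding may differ slightly
print("Match:", math.isclose(manual_mae, sklearn_mae))
```

Both approaches should give approximately 0.42 for these values, since the absolute differences (0.5, 0.4, 0.4, 0.2, 0.6) add up to 2.1 across 5 data points.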