Metrics To Evaluate Machine Learning Algorithms

Evaluating a model is a key part of building a successful machine learning model. The purpose of evaluation is to compare the trained model's predictions with the actual data. It helps you understand how well your model performs and makes it easier to present your model to an audience. Different error metrics are used for different kinds of machine learning models. Three evaluation metrics are frequently used to assess the performance of a regression model:
  1. Mean Squared Error (MSE)
  2. Root Mean Squared Error (RMSE)
  3. Mean Absolute Error (MAE)

Mean Squared Error (MSE)

Mean Squared Error (MSE) is the mean, or average, of the squared differences between actual and estimated values. It is calculated by squaring the difference between each predicted and actual target value, summing the squares, and dividing by the number of data points. MSE is always non-negative, and values closer to zero are better.
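The definition can be written compactly as a formula. The symbols are our own notation rather than the article's: y_i is the actual value, \hat{y}_i the predicted value, and n the number of data points.

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```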

To understand it better, let us take an example of actual demand and predicted demand for a brand of Mineral Water in a shop.


[Table: actual and predicted demand for 10 data points]
Sum of the squared differences between predicted and actual values = 102

Number of data points = 10

Mean Squared Error (MSE) = 102/10 = 10.2
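To make the arithmetic traceable, here is a minimal plain-Python sketch that prints the error and squared error for every data point and then reproduces the two totals. It uses the same demand figures as the code in this article; the variable names are illustrative.

```python
# Demand figures taken from the example used throughout this article.
actual = [56, 45, 68, 49, 26, 40, 52, 38, 30, 48]
predicted = [58, 42, 65, 47, 29, 46, 50, 33, 31, 47]

# Error and squared error for every data point.
errors = [a - p for a, p in zip(actual, predicted)]
squared_errors = [e ** 2 for e in errors]

# Print one row per data point.
print(f"{'Actual':>6} {'Predicted':>9} {'Error':>5} {'Squared':>7}")
for a, p, e, s in zip(actual, predicted, errors, squared_errors):
    print(f"{a:>6} {p:>9} {e:>5} {s:>7}")

total_squared_error = sum(squared_errors)   # sum of the last column
mse = total_squared_error / len(actual)     # divide by the number of points
print("Sum of squared errors:", total_squared_error)
print("Mean Squared Error   :", mse)
```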

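An ideal Mean Squared Error (MSE) value is 0.0, which means that all predicted values matched the expected values exactly. Because each error is squared, MSE penalizes large errors much more heavily than small ones. This makes it most useful when large errors, such as those caused by outliers or unexpected values (too high or too low), are especially undesirable.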
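The effect of a single large error on MSE can be seen with a small experiment. The sketch below is illustrative only: the helper `mean_squared_error` is our own plain-Python function, and the "bad" prediction of 78 is a hypothetical value chosen to create one large error.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

actual = [56, 45, 68, 49, 26, 40, 52, 38, 30, 48]
predicted = [58, 42, 65, 47, 29, 46, 50, 33, 31, 47]

# Hypothetical scenario: the last prediction is badly wrong (78 instead of 47),
# so its error grows from 1 to 30 while the other nine errors stay the same.
predicted_with_outlier = predicted[:-1] + [78]

mse_base = mean_squared_error(actual, predicted)
mse_outlier = mean_squared_error(actual, predicted_with_outlier)

print("MSE without the large error:", mse_base)
print("MSE with the large error   :", mse_outlier)
```

One error out of ten dominates the second result, which is why MSE is a good choice when large misses matter most and a poor one when outliers should be played down.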

Mean Squared Error manual calculation

    import numpy as np

    actual = np.array([56, 45, 68, 49, 26, 40, 52, 38, 30, 48])
    predicted = np.array([58, 42, 65, 47, 29, 46, 50, 33, 31, 47])

    # Element-wise error, then the mean of the squared errors
    diff = actual - predicted
    mse_m = np.mean(diff**2)
    print("Mean Squared Error :", mse_m)

Mean Squared Error : 10.2

Mean Squared Error using sklearn

    import numpy as np
    import sklearn.metrics as metrics

    actual = np.array([56, 45, 68, 49, 26, 40, 52, 38, 30, 48])
    predicted = np.array([58, 42, 65, 47, 29, 46, 50, 33, 31, 47])

    # sklearn computes the same mean of squared errors
    mse_sk = metrics.mean_squared_error(actual, predicted)
    print("Mean Squared Error :", mse_sk)

Mean Squared Error : 10.2


[Figure: fitted line (red) with actual values (small blue circles)]

In the above diagram, forecasted values are points on the red line and actual values are shown by small blue circles. The error in each prediction is the vertical distance between the data point and the fitted line. Mean Squared Error for the line is the average of the squared errors over all data points.
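The relationship between a fitted line and its MSE can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the `x` and `y` values are hypothetical, and the line is fitted with ordinary least squares, which the article does not specify.

```python
# Hypothetical data points (not from the article's example).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Ordinary least squares for a straight line: y = slope * x + intercept.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Forecasted values lie on the line; each residual is the vertical distance
# between an actual point and the line.
fitted = [slope * xi + intercept for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# MSE of the line: the average of the squared residuals.
mse_line = sum(r ** 2 for r in residuals) / n
print("slope:", slope, "intercept:", intercept)
print("Mean Squared Error of the fitted line:", mse_line)
```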

Root Mean Squared Error

Root Mean Squared Error (RMSE) is also used as a measure for model evaluation. It is the square root of Mean Squared Error (MSE). Taking the square root brings the error back to the same units as the target variable, which makes RMSE easier to interpret than MSE.

RMSE = sqrt(MSE)

An ideal Root Mean Squared Error (RMSE) value is 0.0, which means that all predicted values matched the expected values exactly.
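One practical consequence of taking the square root is that RMSE scales with the data in the same way the data itself does, while MSE scales with the square. The sketch below illustrates this under an assumed, hypothetical rescaling of the demand figures by a factor of 100. The helper functions are our own and use only the standard library.

```python
import math

def mse(actual, predicted):
    """Mean of the squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mse(actual, predicted))

actual = [56, 45, 68, 49, 26, 40, 52, 38, 30, 48]
predicted = [58, 42, 65, 47, 29, 46, 50, 33, 31, 47]

# Hypothetical rescaling: express every value in a unit 100 times smaller.
scale = 100
actual_scaled = [a * scale for a in actual]
predicted_scaled = [p * scale for p in predicted]

mse_ratio = mse(actual_scaled, predicted_scaled) / mse(actual, predicted)
rmse_ratio = rmse(actual_scaled, predicted_scaled) / rmse(actual, predicted)

print("MSE grows by a factor of :", mse_ratio)   # scale squared
print("RMSE grows by a factor of:", rmse_ratio)  # scale
```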

Root Mean Squared Error manual calculation

    import numpy as np

    actual = np.array([56, 45, 68, 49, 26, 40, 52, 38, 30, 48])
    predicted = np.array([58, 42, 65, 47, 29, 46, 50, 33, 31, 47])

    # MSE first, then its square root
    diff = actual - predicted
    mse_m = np.mean(diff**2)
    rmse_m = np.sqrt(mse_m)
    print("Root Mean Square Error :", rmse_m)

Root Mean Square Error : 3.1937438845342623

Root Mean Squared Error using sklearn

    import numpy as np
    import sklearn.metrics as metrics

    actual = np.array([56, 45, 68, 49, 26, 40, 52, 38, 30, 48])
    predicted = np.array([58, 42, 65, 47, 29, 46, 50, 33, 31, 47])

    # Take the square root of the MSE returned by sklearn
    mse_sk = metrics.mean_squared_error(actual, predicted)
    rmse_sk = np.sqrt(mse_sk)
    print("Root Mean Square Error :", rmse_sk)

Root Mean Square Error : 3.1937438845342623

Mean Absolute Error (MAE)

Mean Absolute Error (MAE) is the average of the absolute differences between actual and predicted values. Taking the absolute difference means that a negative sign on the result is ignored.

MAE = (1/n) * sum(|actual - predicted|)
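Stated precisely, MAE averages the absolute differences over all data points. In the formula below the symbols are our own notation: y_i is the actual value, \hat{y}_i the predicted value, and n the number of data points.

```latex
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|
```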

Mean Absolute Error manual calculation

    import numpy as np

    actual = np.array([56, 45, 68, 49, 26, 40, 52, 38, 30, 48])
    predicted = np.array([58, 42, 65, 47, 29, 46, 50, 33, 31, 47])

    # Mean of the absolute errors
    diff = actual - predicted
    mae_m = np.mean(np.abs(diff))
    print("Mean Absolute Error :", mae_m)

Mean Absolute Error : 2.8

Mean Absolute Error using sklearn

    import numpy as np
    import sklearn.metrics as metrics

    actual = np.array([56, 45, 68, 49, 26, 40, 52, 38, 30, 48])
    predicted = np.array([58, 42, 65, 47, 29, 46, 50, 33, 31, 47])

    # sklearn computes the same mean of absolute errors
    mae_sk = metrics.mean_absolute_error(actual, predicted)
    print("Mean Absolute Error :", mae_sk)

Mean Absolute Error : 2.8

An ideal Mean Absolute Error (MAE) value is 0.0, which means that all predicted values matched the expected values exactly.

Full Source: MSE, RMSE and MAE

    import numpy as np
    import sklearn.metrics as metrics

    actual = np.array([56, 45, 68, 49, 26, 40, 52, 38, 30, 48])
    predicted = np.array([58, 42, 65, 47, 29, 46, 50, 33, 31, 47])

    # Manual calculation
    diff = actual - predicted
    mse_m = np.mean(diff**2)
    rmse_m = np.sqrt(mse_m)
    mae_m = np.mean(np.abs(diff))
    print("Results by Manual Calculation:")
    print("Mean Squared Error :", mse_m)
    print("Root Mean Square Error :", rmse_m)
    print("Mean Absolute Error :", mae_m)

    # sklearn.metrics
    mse_sk = metrics.mean_squared_error(actual, predicted)
    rmse_sk = np.sqrt(mse_sk)
    mae_sk = metrics.mean_absolute_error(actual, predicted)
    print("Results of sklearn.metrics:")
    print("Mean Squared Error :", mse_sk)
    print("Root Mean Square Error :", rmse_sk)
    print("Mean Absolute Error :", mae_sk)

Results by Manual Calculation:
Mean Squared Error : 10.2
Root Mean Square Error : 3.1937438845342623
Mean Absolute Error : 2.8

Results of sklearn.metrics:
Mean Squared Error : 10.2
Root Mean Square Error : 3.1937438845342623
Mean Absolute Error : 2.8

The results of the three evaluation metrics (MSE, RMSE and MAE) are the same with both methods. You can use either method (manual or sklearn) in your regression analysis, whichever is more convenient.
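If you compute these metrics often, it can be convenient to wrap the manual calculation in one reusable function. The following is a minimal sketch using only the standard library. The function name `regression_metrics` and the dictionary keys are our own choices, not part of NumPy or sklearn.

```python
import math

def regression_metrics(actual, predicted):
    """Return MSE, RMSE and MAE for two equal-length sequences of numbers."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one data point is required")

    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n      # mean of squared errors
    mae = sum(abs(e) for e in errors) / n     # mean of absolute errors
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae}

# Usage with the demand example from this article.
actual = [56, 45, 68, 49, 26, 40, 52, 38, 30, 48]
predicted = [58, 42, 65, 47, 29, 46, 50, 33, 31, 47]

results = regression_metrics(actual, predicted)
for name, value in results.items():
    print(f"{name.upper():<4}: {value}")
```

The length checks are there so that a mismatch raises an error instead of `zip` silently truncating the longer sequence.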