What is Adjusted R-Squared?

Adjusted R-squared is a modified form of R-squared whose value increases when a new predictor improves the model's performance and decreases when a new predictor does not improve performance as expected.

R-squared compares the residual sum of squares (SSres) with the total sum of squares (SStot). It is calculated by dividing the sum of squared residuals from the regression model by the total sum of squares of errors from the average model, and subtracting the result from 1:

R-squared = 1 - (SSres / SStot)

Unlike R-squared, Adjusted R-squared penalizes you for adding features that are not useful for predicting the target, because it takes into account the number of independent variables used:

Adjusted R-squared = 1 - [(1 - R-squared) * (N - 1) / (N - p - 1)]
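To see the penalty in action, here is a small sketch (the data, seed, and helper function are made up for illustration) that fits ordinary least squares with NumPy, then keeps appending pure-noise predictors. R-squared can only stay the same or rise as columns are added, while Adjusted R-squared is pulled back down:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50
x = rng.normal(size=(N, 1))
# One genuinely useful predictor plus noise in the target
y = 3 * x[:, 0] + rng.normal(scale=0.5, size=N)

def r2_and_adjusted(X, y):
    # Ordinary least squares with an intercept column
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    r2 = 1 - np.sum(resid**2) / np.sum((y - y.mean())**2)
    p = X.shape[1]
    adj = 1 - (1 - r2) * (len(y) - 1) / (len(y) - p - 1)
    return r2, adj

results = []
X = x
for extra in range(4):
    r2, adj = r2_and_adjusted(X, y)
    results.append((r2, adj))
    print(f"p={X.shape[1]}  R2={r2:.4f}  Adjusted-R2={adj:.4f}")
    X = np.column_stack([X, rng.normal(size=N)])  # append a pure-noise predictor
```

Each added noise column nudges R-squared up slightly, but the (N - p - 1) term in the denominator charges the model for the extra parameter.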

where,

  1. N = number of records in the data set.
  2. p = number of independent variables.

For a simple representation, you can rewrite the above formula like the following:

Adjusted R-squared = 1 - (x * y)

where,

  1. x = 1 - R-squared
  2. y = (N-1) / (N-p-1)

Adjusted R-squared can be negative when R-squared is close to zero.
The Adjusted R-squared value is always less than or equal to the R-squared value.
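Both points are easy to verify with a toy example. The data below is made up, and a weak model is simulated by shrinking predictions toward the mean; with a hypothetical p = 2 predictors, a small positive R-squared turns into a negative Adjusted R-squared:

```python
import numpy as np

actual = np.array([10.0, 12.0, 9.0, 11.0, 10.5, 9.5])
# Hypothetical weak model: predictions only slightly better than the mean
predicted = actual.mean() + 0.1 * (actual - actual.mean())

r2 = 1 - np.sum((actual - predicted)**2) / np.sum((actual - actual.mean())**2)
N, p = len(actual), 2  # suppose 2 independent variables were used
adj_rsquared = 1 - (1 - r2) * (N - 1) / (N - p - 1)
print("R2 :", r2)                    # ~0.19, close to zero
print("Adjusted-R2 :", adj_rsquared)  # ~-0.35, negative
```

The multiplier (N-1)/(N-p-1) is at least 1 whenever p >= 1, so 1 - Adjusted R-squared is always at least 1 - R-squared, which is why the adjusted value can never exceed the plain one.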

Adjusted R-squared manual calculation

import numpy as np

actual = np.array([56,45,68,49,26,40,52,38,30,48])
predicted = np.array([58,42,65,47,29,46,50,33,31,47])

# calculate r-squared
r2 = 1 - np.sum((predicted - actual)**2) / np.sum((actual - np.mean(actual))**2)

N = actual.shape[0]  # number of records
p = 3                # number of independent variables
x = 1 - r2
y = (N - 1) / (N - p - 1)
adj_rsquared = 1 - (x * y)
print("Adjusted-R2 : ", adj_rsquared)
Adjusted-R2 : 0.8894189071986123

Adjusted R-squared using sklearn.metrics

import numpy as np
import sklearn.metrics as metrics

actual = np.array([56,45,68,49,26,40,52,38,30,48])
predicted = np.array([58,42,65,47,29,46,50,33,31,47])

# calculate r-squared with sklearn
r2_sk = metrics.r2_score(actual, predicted)

N = actual.shape[0]  # number of records
p = 3                # number of independent variables
x = 1 - r2_sk
y = (N - 1) / (N - p - 1)
adj_rsquared = 1 - (x * y)
print("Adjusted-R2 : ", adj_rsquared)
Adjusted-R2 : 0.8894189071986123