How to calculate Inter-Quartile Range (IQR)

The Inter-Quartile Range (IQR) measures the spread of the middle 50% of a dataset. It is the difference between the 75th percentile, Q3 (the 0.75 quantile), and the 25th percentile, Q1 (the 0.25 quantile), of the dataset. It can also be used to detect outliers in the data.
IQR = Q3 - Q1

Interquartile Range of a single array

import numpy as np
# define data
data = np.array([18, 22, 32, 38, 41, 46, 53, 58, 67, 71, 78, 84, 91, 98])
# find the third quartile (Q3) and the first quartile (Q1)
q3, q1 = np.percentile(data, [75, 25])
# calculate the interquartile range
iqr = q3 - q1
print("Interquartile Range :", iqr)
Interquartile Range : 37.5
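The introduction notes that the IQR can be used to detect outliers. One common convention (an assumption here, since the article does not name a rule) is to flag values more than 1.5 × IQR below Q1 or above Q3. The sketch below reuses the array from the example and appends one hypothetical extreme value, 250, purely for illustration:

```python
import numpy as np

# Values from the example above, plus one hypothetical extreme value (250)
data = np.array([18, 22, 32, 38, 41, 46, 53, 58, 67, 71, 78, 84, 91, 98, 250])

q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1

# 1.5 * IQR rule of thumb: anything outside these fences is flagged
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = data[(data < lower_fence) | (data > upper_fence)]
print("Fences :", lower_fence, upper_fence)
print("Outliers :", outliers)
```

The multiplier 1.5 is only a convention; a larger value such as 3 flags fewer, more extreme points.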

Interquartile Range of a single column in a DataFrame

import pandas as pd
import numpy as np
df = pd.DataFrame([[32, 24, 30, 40],
                   [17, 24, 21, 28],
                   [50, 25, 28, 32],
                   [25, 34, 21, 48],
                   [17, 31, 18, 28],
                   [35, 24, 19, 42]],
                  columns=['Physics', 'Chemistry', 'Biology', 'Maths'],
                  index=['Student-1', 'Student-2', 'Student-3',
                         'Student-4', 'Student-5', 'Student-6'])
# show the DataFrame
print(df)
# find the third quartile (Q3) and the first quartile (Q1) of one column
q3, q1 = np.percentile(df['Chemistry'], [75, 25])
# calculate the interquartile range
iqr = q3 - q1
print("Interquartile Range :", iqr)
           Physics  Chemistry  Biology  Maths
Student-1       32         24       30     40
Student-2       17         24       21     28
Student-3       50         25       28     32
Student-4       25         34       21     48
Student-5       17         31       18     28
Student-6       35         24       19     42
Interquartile Range : 5.5
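As an alternative to np.percentile, the same value can be obtained with pandas alone. This is a minimal sketch, assuming the default linear interpolation of Series.quantile, which matches the default of np.percentile; the name chem_iqr is illustrative only:

```python
import pandas as pd

df = pd.DataFrame([[32, 24, 30, 40], [17, 24, 21, 28], [50, 25, 28, 32],
                   [25, 34, 21, 48], [17, 31, 18, 28], [35, 24, 19, 42]],
                  columns=['Physics', 'Chemistry', 'Biology', 'Maths'],
                  index=['Student-1', 'Student-2', 'Student-3',
                         'Student-4', 'Student-5', 'Student-6'])

# Series.quantile takes a fraction in [0, 1], not a percentage
q1 = df['Chemistry'].quantile(0.25)
q3 = df['Chemistry'].quantile(0.75)
chem_iqr = q3 - q1
print("Interquartile Range :", chem_iqr)
```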

Interquartile Range of multiple columns in a DataFrame

To find the Inter-Quartile Range of multiple columns in a DataFrame, define a function that calculates the interquartile range of a single column, then apply that function to the selected columns.
# define a function to calculate the interquartile range of a single column
def single_iqr(x):
    return np.subtract(*np.percentile(x, [75, 25]))
# calculate the IQR for the 'Physics' and 'Chemistry' columns
df[['Physics', 'Chemistry']].apply(single_iqr)
Physics      15.25
Chemistry     5.50
To find the Inter-Quartile Range of all columns in a DataFrame:
# calculate the IQR for all columns
df.apply(single_iqr)
Physics      15.25
Chemistry     5.50
Biology       6.75
Maths        12.50
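The per-column calculation can also be written without a helper function, since DataFrame.quantile computes a quantile for every column at once. A minimal sketch, assuming all columns are numeric (all_iqr is an illustrative name):

```python
import pandas as pd

df = pd.DataFrame([[32, 24, 30, 40], [17, 24, 21, 28], [50, 25, 28, 32],
                   [25, 34, 21, 48], [17, 31, 18, 28], [35, 24, 19, 42]],
                  columns=['Physics', 'Chemistry', 'Biology', 'Maths'],
                  index=['Student-1', 'Student-2', 'Student-3',
                         'Student-4', 'Student-5', 'Student-6'])

# Each quantile call returns a Series indexed by column name,
# so the subtraction is aligned column by column
all_iqr = df.quantile(0.75) - df.quantile(0.25)
print(all_iqr)
```

This should give the same four values as df.apply(single_iqr), because both approaches default to linear interpolation between data points.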

How to Validate?

The code above calculates the IQR from scratch. To save time, or to check the result, you can use the iqr() function from scipy.stats.
from scipy.stats import iqr
iqr(df['Physics'])
15.25
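The check can be extended to every column at once. The sketch below assumes SciPy is installed and imports the function under the alias scipy_iqr (an illustrative name, chosen so it does not overwrite the earlier iqr variable), then compares it against the NumPy calculation:

```python
import numpy as np
import pandas as pd
from scipy.stats import iqr as scipy_iqr

df = pd.DataFrame([[32, 24, 30, 40], [17, 24, 21, 28], [50, 25, 28, 32],
                   [25, 34, 21, 48], [17, 31, 18, 28], [35, 24, 19, 42]],
                  columns=['Physics', 'Chemistry', 'Biology', 'Maths'])

values = df.to_numpy()

# axis=0 computes one IQR per column
scipy_result = scipy_iqr(values, axis=0)

# the same calculation from scratch: Q3 minus Q1, per column
numpy_result = np.subtract(*np.percentile(values, [75, 25], axis=0))

print(dict(zip(df.columns, scipy_result)))
print("Results match :", np.allclose(scipy_result, numpy_result))
```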

Visualization

Let's draw a box plot that shows the 25th percentile, the 50th percentile (median) and the 75th percentile of each column in the DataFrame.
import matplotlib.pyplot as plt
# draw one box per column; showmeans=True also marks each column's mean
ax = df.plot.box(figsize=(8, 6), showmeans=True)
ax.grid()
plt.show()

[Figure: box plot of each DataFrame column, showing its interquartile range]
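To read off the numbers behind each box rather than estimating them from the figure, matplotlib's boxplot_stats helper can be called directly. This is a rough sketch, assuming the default settings (whiskers at 1.5 × IQR); it returns one dictionary of statistics per column and needs no plotting window:

```python
import pandas as pd
from matplotlib.cbook import boxplot_stats

df = pd.DataFrame([[32, 24, 30, 40], [17, 24, 21, 28], [50, 25, 28, 32],
                   [25, 34, 21, 48], [17, 31, 18, 28], [35, 24, 19, 42]],
                  columns=['Physics', 'Chemistry', 'Biology', 'Maths'])

# For a 2D array, each column is treated as a separate dataset
stats = boxplot_stats(df.to_numpy())

for name, s in zip(df.columns, stats):
    # q1, med and q3 are the bottom, middle line and top of the box
    print(name, "Q1 =", s['q1'], "median =", s['med'],
          "Q3 =", s['q3'], "IQR =", s['iqr'])
```

The height of each box in the plot is the IQR of that column, so the 'iqr' values here should agree with the figures computed earlier.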