Measures of the Spread of Data

Summarising a dataset can help you understand the data, especially when the dataset is large. A Measure of Central Tendency by itself is not enough, though, to describe a distribution. A Measure of Spread, sometimes also called a measure of dispersion, describes how varied the values in the dataset are. It summarises the dataset in a way that shows how scattered the values are and how much they differ from the mean value.

There are several basic measures of spread used in statistics. The most common are:

  1. Range
  2. Inter-Quartile Range (IQR)
  3. Variance
  4. Standard Deviation

Range

The Range is the simplest measure of spread. It is the difference between the largest value in a dataset (Maximum) and the smallest one (Minimum). Range = maximum - minimum

Suppose you have a dataset of some values:

12, 48, 32, 21, 32, 36, 54, 21, 78, 32, 18, 94.


Minimum value is : 12

Maximum value is : 94

Range of dataset is : 94 - 12 = 82
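
The same calculation can be sketched in Python. This is a minimal illustration using only built-in functions; the function name value_range is a hypothetical helper chosen for this example, not something from the text above.

```python
# Dataset from the Range example above.
data = [12, 48, 32, 21, 32, 36, 54, 21, 78, 32, 18, 94]


def value_range(values):
    """Return maximum - minimum for a non-empty sequence of numbers."""
    return max(values) - min(values)


print(min(data))          # 12
print(max(data))          # 94
print(value_range(data))  # 82
```

Because the range depends only on the two most extreme values, a single unusually large or small value changes it directly.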

Quartile and Inter-Quartile Range (IQR)

Quartile

Quartiles are the values that divide an ordered dataset into quarters.

Inter-Quartile Range (IQR)

The IQR (Inter-Quartile Range) is a measure of variability based on dividing a dataset into quartiles. The first quartile (Q1) is known as the lower quartile, the second quartile (Q2) is the median, and the third quartile (Q3) is known as the upper quartile. The interquartile range is found by subtracting the Q1 value from the Q3 value. IQR = Q3 - Q1

How to find Inter-Quartile Range?

Suppose you have a dataset:

60, 110, 30, 10, 40, 20, 100, 90, 70, 80, 50.

Put the numbers in order.

10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110.

Find the median of the dataset

There are eleven numbers in the dataset, so the median is the sixth value. Median value is : 60

Separate the numbers above and below the median.

(10, 20, 30, 40, 50) 60 (70, 80, 90, 100, 110)
In the above case Q2 is 60. Next, find Q1 and Q3.

Q1 is the median of the first half of the dataset and Q3 is the median of the second half of the dataset.

(10, 20, [30], 40, 50) 60 (70, 80, [90], 100, 110)
Q1 = 30 and Q3 = 90.

Inter-Quartile Range (IQR) = Q3 - Q1

90-30 = 60


Inter-Quartile Range (IQR) = 60
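
The steps above can be sketched in Python as follows. This is a minimal illustration of the "median of each half" method used in this example, where the median itself is left out of both halves when the number of values is odd. The function name quartiles is a hypothetical helper. Other tools and libraries may define quartiles by interpolation and can return slightly different values for the same data.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumes at least two values. When the count is odd, the middle
    value is excluded from both halves, as in the worked example.
    """
    ordered = sorted(values)          # put the numbers in order
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # values below the median
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


data = [60, 110, 30, 10, 40, 20, 100, 90, 70, 80, 50]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)  # 30 60 90
print(q3 - q1)     # 60
```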

Variance

The Variance measures the average squared distance of the data points from the mean. To find the variance, first calculate the difference between each point and the mean, square each difference, and then average the squared differences. Dividing by the number of values, as done here, gives the population variance.

How to calculate the Variance?

  1. Step-1 : Find the mean of the dataset.
  2. Step-2 : Calculate difference from Mean.
  3. Step-3 : Square each value.
  4. Step-4 : Average it.

Suppose you have a dataset:

600, 470, 170, 430, 300

Step-1 : Find the mean of the dataset

First, you have to find the mean of the dataset.
Mean = (600 + 470 + 170 + 430 + 300) / 5
= 1970 / 5
= 394

Mean of the above dataset is : 394

Step-2 : Calculate difference from Mean

The next step is to calculate the difference between each value in the dataset and the Mean value:

Dataset : 600, 470, 170, 430, 300

600 - 394 = 206
470 - 394 = 76
170 - 394 = -224
430 - 394 = 36
300 - 394 = -94

Step-3 : Square each value

Square each difference.
206 * 206 = 42436
76 * 76 = 5776
(-224) * (-224) = 50176
36 * 36 = 1296
(-94) * (-94) = 8836

Step-4 : Average it

Average the squared differences.
Variance = (42436 + 5776 + 50176 + 1296 + 8836) / 5
= 108520 / 5
= 21704

Variance of the above dataset is : 21704
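
The four steps can be sketched in Python as shown below. This is a minimal illustration and the function name population_variance is a hypothetical helper. It divides by the number of values, as the worked example does. The standard library function statistics.pvariance uses the same definition, while statistics.variance divides by one less than the number of values (the sample variance) and would give a different result for this data.

```python
from statistics import mean, pvariance

data = [600, 470, 170, 430, 300]


def population_variance(values):
    """Average of squared differences from the mean (divides by n)."""
    m = mean(values)                           # Step-1: mean
    diffs = [x - m for x in values]            # Step-2: differences
    squared = [d * d for d in diffs]           # Step-3: squares
    return sum(squared) / len(values)          # Step-4: average


print(mean(data))                 # 394
print(population_variance(data))  # 21704.0

# The standard library's population variance should agree.
print(pvariance(data) == population_variance(data))  # True
```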

Standard Deviation

The Standard Deviation is a measure of how spread out the numbers are around the mean. It is calculated as the square root of the variance, so it is expressed in the same units as the data. Standard deviation (S) = square root of the variance. Calculate the standard deviation of the following dataset:
600, 470, 170, 430, 300

Variance of the above dataset is : 21704

Standard deviation : square root of (21704) ≈ 147.32
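
As a quick check, the square root can be computed in Python. This is a minimal sketch that starts from the variance value found above and compares it with statistics.pstdev, the standard library's population standard deviation; the variable names are chosen only for this example.

```python
from math import sqrt
from statistics import pstdev

data = [600, 470, 170, 430, 300]

variance = 21704            # population variance from the Variance section
std_dev = sqrt(variance)    # standard deviation = square root of variance

print(round(std_dev, 2))    # 147.32

# pstdev computes the same quantity directly from the data.
print(abs(pstdev(data) - std_dev) < 1e-9)  # True
```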