Measures of the Spread of Data
Summarising a dataset can help you understand the data, especially when the dataset is large. A measure of central tendency by itself is not enough, though, to describe a distribution. A measure of spread, sometimes also called a measure of dispersion, describes how widely the values in the dataset vary. It summarises the dataset in a way that shows how scattered the values are and how much they differ from the mean value.
There are several basic measures of spread used in statistics. The most common are:
- Range
- Inter-Quartile Range (IQR)
- Variance
- Standard Deviation
Range
The Range is the difference between the largest value in a dataset (Maximum) and the smallest one (Minimum).
Range = maximum - minimum
Suppose you have a dataset of some values:
12, 48, 32, 21, 32, 36, 54, 21, 78, 32, 18, 94.
Minimum value is : 12
Maximum value is : 94
Range of dataset is : 94 - 12 = 82
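The same calculation can be sketched in a few lines of Python. The function name `value_range` is chosen here for illustration and is not part of any library.

```python
# Dataset from the range example above.
data = [12, 48, 32, 21, 32, 36, 54, 21, 78, 32, 18, 94]


def value_range(values):
    """Return the difference between the largest and smallest value."""
    return max(values) - min(values)


print(min(data))          # 12
print(max(data))          # 94
print(value_range(data))  # 82
```

Because the range depends only on the two most extreme values, a single unusually large or small value can change it considerably.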
Quartile and Inter-Quartile Range (IQR)
Quartile
Quartiles are the values that divide an ordered dataset into quarters.
Inter-Quartile Range (IQR)
The IQR (Inter-Quartile Range) is a measure of variability based on dividing a dataset into quartiles. The first quartile (Q1) is known as the lower quartile, the second quartile (Q2) is the median, and the third quartile (Q3) is known as the upper quartile. The interquartile range is found by subtracting the Q1 value from the Q3 value.
IQR = Q3 - Q1
How to find Inter-Quartile Range?
Suppose you have a dataset:
60, 110, 30, 10, 40, 20, 100, 90, 70, 80, 50.
Put the numbers in order.
10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110.
Find the median of the dataset.
There are eleven numbers in the dataset, so the median is the sixth value.
Median value is : 60
Separate the numbers above and below the median.
(10, 20, 30, 40, 50) 60 (70, 80, 90, 100, 110)
In the above case Q2 is 60. Next, find Q1 and Q3.
Q1 is the median of the first half of the dataset and Q3 is the median of the second half of the dataset.
(10, 20, [30], 40, 50) 60 (70, 80, [90], 100, 110)
Q1 = 30 and Q3 = 90.
Inter-Quartile Range (IQR) = Q3 - Q1
90 - 30 = 60
Inter-Quartile Range (IQR) = 60
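The procedure above (sort, find the median, then take the median of each half) can be sketched in Python as follows. The helper name `quartiles` is an assumption made for this example. It implements the median-of-halves method used in this walkthrough; other quartile conventions, such as interpolation-based ones, may return slightly different values for the same data.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Expects at least two values; with fewer, the halves are empty
    and statistics.median raises StatisticsError.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]          # values before the median position
    upper = ordered[half + n % 2:]  # values after it (the median itself is skipped when n is odd)
    return median(lower), median(ordered), median(upper)


data = [60, 110, 30, 10, 40, 20, 100, 90, 70, 80, 50]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)  # 30 60 90
print(q3 - q1)     # 60
```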
Variance
The Variance measures the average squared distance of each data point from the mean. To find the variance, first calculate the difference between each point and the mean, square each difference, and then average the results.
How to calculate the Variance?
- Step-1 : Find the mean of the dataset.
- Step-2 : Calculate difference from Mean.
- Step-3 : Square each value.
- Step-4 : Average it.
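The four steps can also be written as a single formula. The symbols are notation assumed here for illustration: x_i is the i-th value, N is the number of values, \mu is the mean, and \sigma^2 is the variance.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} \left(x_i - \mu\right)^2
```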
Suppose you have a dataset:
600, 470, 170, 430, 300
Step-1 : Find the mean of the dataset
First, you have to find the mean of the dataset.
Mean = (600 + 470 + 170 + 430 + 300) / 5
= 1970 / 5
= 394
Mean of the above dataset is : 394
Step-2 : Calculate difference from Mean
The next step is to calculate the difference between each value in the dataset and the Mean value:
Dataset : 600, 470, 170, 430, 300
600 - 394 = 206
470 - 394 = 76
170 - 394 = -224
430 - 394 = 36
300 - 394 = -94
Step-3 : Square each value
Square each of the differences from Step-2.
206 * 206 = 42436
76 * 76 = 5776
(-224) * (-224) = 50176
36 * 36 = 1296
(-94) * (-94) = 8836
Step-4 : Average it
Average the squared differences.
Variance = (42436 + 5776 + 50176 + 1296 + 8836) / 5
= 108520 / 5
= 21704
Variance of the above dataset is : 21704
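The four steps map directly onto a short Python function. This is a minimal sketch; the name `population_variance` is chosen for illustration. It divides by the number of values, as the walkthrough does, which is the population variance. The sample variance divides by one less than the number of values and would give a different result.

```python
from statistics import mean, pvariance

data = [600, 470, 170, 430, 300]


def population_variance(values):
    """Average of the squared differences from the mean."""
    m = mean(values)                       # Step-1: mean of the dataset
    deviations = [x - m for x in values]   # Step-2: difference from the mean
    squared = [d * d for d in deviations]  # Step-3: square each difference
    return sum(squared) / len(squared)     # Step-4: average the squares


print(mean(data))                 # 394
print(population_variance(data))  # 21704.0

# The standard library's pvariance computes the same quantity.
assert population_variance(data) == pvariance(data)
```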
Standard Deviation
The Standard Deviation is a measure of how spread out numbers are. It is calculated as the square root of the variance, which expresses the spread in the same units as the data.
Standard deviation (S) = square root of the variance
Calculate the Standard deviation of the following dataset:
600, 470, 170, 430, 300
Variance of the above dataset is : 21704
Standard deviation : square root of (21704) ≈ 147.32, or about 147
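As a quick check, the square root can be taken in Python. This sketch starts from the variance computed earlier and compares the result against the standard library's `pstdev`, which computes the population standard deviation directly from the data.

```python
import math
from statistics import pstdev

data = [600, 470, 170, 430, 300]
variance = 21704  # population variance from the worked example

std_dev = math.sqrt(variance)
print(round(std_dev, 2))       # 147.32
print(round(pstdev(data), 2))  # 147.32
```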