Python - Mean(), Median(), Mode()
Central Tendency serves as a crucial concept, representing a central or prototypical value within a probability distribution. This fundamental notion enables analysts to identify the core, average, or most frequently occurring value in a given dataset. Measures of central tendency, therefore, play an essential role in unraveling the essence of a dataset and understanding its central characteristics.
- Mean : The mean value is the average value.
- Median : The median value is the value in the middle, after you have sorted all the values.
- Mode : The Mode value is the value that appears the most number of times.
Find the mean, median and mode from a Pandas column using Python
Lets create a DataFrame...
Calculate Mean
The Mean stands out as one of the most prevalent measures of central tendency. Calculated as the sum of all data points divided by the total number of observations, the Mean presents a comprehensive overview of the data's average value. Its mathematical elegance and simplicity make it a widely favored tool for capturing the typical magnitude of a dataset.
If you want to find out the mean of a single column in a DataFrame:
df['column_name'].mean()By default, mean is calculated every single column (axis=0) in the DataFrame. If you Pass the argument of (axis=1) will return the mean of every single row in the DataFrame .
Calculate Median
The Median is the middle value when the dataset is ordered from the smallest to the largest (or vice versa). Unlike the Mean, the Median is less affected by extreme outliers, making it a robust representation of the central value, particularly in datasets with skewed distributions.
If you want to find out the median of a single column in a DataFrame:
df['column_name'].median()By default, median is calculated every single column (axis=0) in the DataFrame. If you Pass the argument of (axis=1) will return the median of every single row in the DataFrame .
Calculate Mode
The Mode deserves special mention as yet another essential measure of central tendency. The Mode refers to the value that appears most frequently within the dataset. It stands as a valuable indicator of the most prevalent occurrence, providing insights into the dominant characteristic of the data.
Summary Statistics
This function gives you several useful things all at the same time.
Conclusion
Using these measures of central tendency, analysts can discern the core characteristics of a dataset, thereby illuminating critical patterns and trends. Understanding the central tendencies of data is a foundational step in statistical analysis, paving the way for further exploration and decision-making processes across various fields, including finance, economics, social sciences, and more.