Relationship Between Variables

In data science, you frequently need to determine the strength of the association between two or more variables. Statistical analysis can be performed on a single variable (univariate statistics), two variables (bivariate statistics), or multiple variables (multivariate statistics).

Univariate Statistics

Analysis performed on a single variable is called univariate statistics. Most descriptive statistics fall into this category.

Bivariate Statistics

Analysis performed on two variables is called bivariate statistics. This is generally the case in inferential statistics, where you are trying to assess the relationship between two samples.

Multivariate Statistics

Analysis that assesses the relationships among more than two variables simultaneously is called multivariate statistics.


Correlation

Correlation is a statistical technique that can show whether and how strongly pairs of variables are related. If two variables move in the same direction, they are said to have a positive correlation. If they move in opposite directions, they have a negative correlation.

Correlation Coefficient

The correlation coefficient is a statistical measure of the strength of the linear association between the relative movements of two variables. It always takes a value between -1 and 1:
  1. A value of 1 indicates a perfect positive linear relationship.
  2. A value of -1 indicates a perfect negative linear relationship.
  3. A value of 0 indicates no linear relationship.
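
The three cases in the list can be illustrated with a small sketch. The function name `correlation` and the sample data are made up for illustration. The third data set is chosen so that y clearly depends on x, yet the coefficient is still zero because the dependence is not linear.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient of two equal-length sequences (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]                          # hypothetical sample data
r_up = correlation(x, [2, 4, 6, 8, 10])      # y rises in step with x: close to 1
r_down = correlation(x, [10, 8, 6, 4, 2])    # y falls as x rises: close to -1
r_flat = correlation(x, [2, 1, 0, 1, 2])     # V-shaped, no linear trend: close to 0

print(r_up, r_down, r_flat)
```

Note that this helper divides by zero if either sequence is constant, so real code would need to handle that case.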


Covariance

In statistics, covariance is a measure of how two random variables vary together. A positive covariance means the variables tend to move in the same direction, and a negative covariance means they tend to move in opposite directions. Unlike the correlation coefficient, covariance is not unitless: its units are the product of the units of the two variables, so its magnitude depends on the scale of measurement.
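
To make the point about units concrete, here is a minimal sketch. It assumes the sample covariance (dividing by n - 1), and the heights and weights are invented. Converting heights from metres to centimetres scales the covariance by 100, while the correlation coefficient stays the same.

```python
import math

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance rescaled by both standard deviations, which removes the units."""
    # The variance of a variable is its covariance with itself
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical measurements, for illustration only
heights_m = [1.60, 1.70, 1.80, 1.90]
weights_kg = [55.0, 65.0, 72.0, 84.0]
heights_cm = [h * 100 for h in heights_m]

cov_m = covariance(heights_m, weights_kg)     # units: m * kg
cov_cm = covariance(heights_cm, weights_kg)   # units: cm * kg, about 100 times larger
r_m = correlation(heights_m, weights_kg)      # unitless
r_cm = correlation(heights_cm, weights_kg)    # same value up to rounding

print(cov_m, cov_cm, r_m, r_cm)
```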


Causation

Causation indicates a relationship between two events where one event is affected by the other. It applies specifically to cases where action A causes outcome B, i.e. there is a causal relationship between the two events. This is also referred to as cause and effect. For example, when the value of one event or variable increases or decreases as a result of another event, there is said to be causation.

Pearson's Correlation

The Pearson correlation coefficient, r, measures the strength of the linear relationship between two continuous variables. It takes a value between -1 and 1, where 0 is no linear correlation, 1 is total positive correlation, and -1 is total negative correlation. Pearson correlations are only suitable for quantitative variables (including dichotomous variables).
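
For reference, the usual sample formula for r can be written as follows. The notation is an assumption, not something defined earlier in this text: x_i and y_i are the n paired observations, and \bar{x} and \bar{y} are their sample means.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when the two variables tend to be above or below their means together and negative when they tend to deviate in opposite directions. The denominator rescales the result into the range from -1 to 1.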

Spearman's Correlation

Spearman's correlation is a statistical measure of the strength and direction of the association between two variables measured on at least an ordinal scale.
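
Spearman's coefficient is commonly computed by replacing each value with its rank and then applying the Pearson formula to the ranks. The sketch below takes that approach, using average ranks for ties. The helper names (`average_ranks`, `pearson`, `spearman`) and the data are illustrative assumptions, not part of any library.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1   # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]    # y = x**3: always increasing, but not linear

rho = spearman(x, y)       # the ranks agree perfectly
r = pearson(x, y)          # below 1 because the relationship is curved

print(rho, r)
```

Because only the ordering of the values matters, Spearman's coefficient reaches 1 for any strictly increasing relationship, whereas Pearson's r reaches 1 only when the relationship is exactly linear.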