Karl Pearson's Coefficient of Correlation

Pearson’s Correlation Coefficient is one of the most widely used statistical measures for determining the strength and direction of the relationship between two variables. Also known as the Product Moment Correlation, it measures how closely two variables move together. It is represented by r, a dimensionless value ranging from −1 to +1, where +1 indicates a perfect positive correlation, −1 indicates a perfect negative correlation and 0 represents no linear relationship between the variables.

Formula

Karl~Pearson's~Coefficient~of~Correlation =\frac{Sum~of~Products~of~Deviations~from~their~respective~means}{Number~of~Pairs\times{Standard~Deviations~of~both~Series}}

r=\frac{\sum{xy}}{N\times{\sigma_x}\times{\sigma_y}}

Where:

N = Number of Pairs of Observations
x = Deviation of X series from Mean (X-\bar{X})
y = Deviation of Y series from Mean (Y-\bar{Y})
\sigma_x= Standard Deviation of X series (\sqrt{\frac{\sum{x^2}}{N}})
\sigma_y= Standard Deviation of Y series (\sqrt{\frac{\sum{y^2}}{N}})
r = Coefficient of Correlation

Example of Using Pearson’s Correlation

X	12	16	20	24	28	32	36
Y	6	9	12	15	18	21	24

For this data:

N=7
∑xy=336
σx=8
σy=6

\begin{aligned}\sigma_x &= \sqrt{\frac{\sum x^2}{N}} \\\sigma_x &= \sqrt{\frac{448}{7}} = \sqrt{64} = 8 \\\\\sigma_y &= \sqrt{\frac{\sum y^2}{N}} \\\sigma_y &= \sqrt{\frac{252}{7}} = \sqrt{36} = 6\end{aligned}

r = \frac{336}{7 \times 8 \times 6} = \frac{336}{336} = 1

The value r=1 indicates a perfect positive correlation: every pair of values lies exactly on an upward-sloping straight line (here Y = 0.75X − 3), so Y rises by a fixed amount for each unit increase in X.
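The sums used in this example can be checked term by term. The working below is a sketch derived from the data table, using the same symbols as the formula (x and y are deviations from the means):

```latex
\begin{aligned}
\bar{X} &= \frac{168}{7} = 24, \qquad \bar{Y} = \frac{105}{7} = 15 \\
x &= -12,\ -8,\ -4,\ 0,\ 4,\ 8,\ 12 \\
y &= -9,\ -6,\ -3,\ 0,\ 3,\ 6,\ 9 \\
\sum xy &= 108 + 48 + 12 + 0 + 12 + 48 + 108 = 336 \\
\sum x^2 &= 144 + 64 + 16 + 0 + 16 + 64 + 144 = 448 \\
\sum y^2 &= 81 + 36 + 9 + 0 + 9 + 36 + 81 = 252
\end{aligned}
```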

Methods of Calculating Karl Pearson's Coefficient of Correlation

Actual Mean Method
Direct Method
Short-Cut Method/Assumed Mean Method/Indirect Method
Step Deviation Method

1. Actual Mean Method

This method calculates correlation using deviations from the actual means of both series.

Formula:

r=\frac{\sum{xy}}{\sqrt{\sum{x^2}\times{\sum{y^2}}}}
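Because N × σx × σy equals the square root of ∑x² × ∑y², this formula gives the same value as the general one. A minimal sketch in plain Python, assuming the sample data from the example above; the function name pearson_actual_mean is hypothetical and chosen only for illustration:

```python
import math

def pearson_actual_mean(xs, ys):
    """Pearson's r from deviations about the actual means (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations of each series from its own mean
    dev_x = [v - mean_x for v in xs]
    dev_y = [v - mean_y for v in ys]
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_x2 = sum(a * a for a in dev_x)
    sum_y2 = sum(b * b for b in dev_y)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)

X = [12, 16, 20, 24, 28, 32, 36]
Y = [6, 9, 12, 15, 18, 21, 24]
r_actual = pearson_actual_mean(X, Y)
print("r (actual mean method):", r_actual)
```

For this data the intermediate sums are ∑xy = 336, ∑x² = 448 and ∑y² = 252, matching the worked example.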

2. Direct Method

The Direct Method calculates correlation using the original values of the series without finding deviations separately.

Formula:

r=\frac{N\sum{XY}-\sum{X}\cdot\sum{Y}}{\sqrt{N\sum{X^2}-(\sum{X})^2}\times\sqrt{N\sum{Y^2}-(\sum{Y})^2}}
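The direct formula needs only five running totals: ∑X, ∑Y, ∑XY, ∑X² and ∑Y². The following is a rough sketch in plain Python, assuming the sample data from the example above; pearson_direct is a hypothetical name used for illustration:

```python
import math

def pearson_direct(xs, ys):
    """Pearson's r from the raw values, without computing deviations (illustrative helper)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    sum_y2 = sum(b * b for b in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

X = [12, 16, 20, 24, 28, 32, 36]
Y = [6, 9, 12, 15, 18, 21, 24]
r_direct = pearson_direct(X, Y)
print("r (direct method):", r_direct)
```

With this data the numerator is 7 × 2856 − 168 × 105 = 2352 and the denominator is 56 × 42 = 2352, so r = 1.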

3. Short-Cut Method/Assumed Mean Method

This method simplifies calculations by taking deviations from assumed means instead of actual means.

Formula:

r=\frac{N\sum{dx\,dy}-\sum{dx}\cdot\sum{dy}}{\sqrt{N\sum{dx^2}-(\sum{dx})^2}\times\sqrt{N\sum{dy^2}-(\sum{dy})^2}}

∑dx = Sum of deviations of X values from assumed mean
∑dy = Sum of deviations of Y values from assumed mean
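The result does not depend on which assumed means are chosen, because the formula corrects for the shift through the ∑dx and ∑dy terms. A minimal sketch in plain Python, assuming the sample data from the example above; the function name and the assumed means 20 and 12 are illustrative choices, not values from the article:

```python
import math

def pearson_assumed_mean(xs, ys, assumed_x, assumed_y):
    """Pearson's r from deviations about assumed means (illustrative helper)."""
    n = len(xs)
    # dx, dy: deviations from the assumed (not actual) means
    dx = [v - assumed_x for v in xs]
    dy = [v - assumed_y for v in ys]
    sum_dx = sum(dx)
    sum_dy = sum(dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = math.sqrt(n * sum_dx2 - sum_dx ** 2) * math.sqrt(n * sum_dy2 - sum_dy ** 2)
    return numerator / denominator

X = [12, 16, 20, 24, 28, 32, 36]
Y = [6, 9, 12, 15, 18, 21, 24]
r_assumed = pearson_assumed_mean(X, Y, assumed_x=20, assumed_y=12)
print("r (assumed mean method):", r_assumed)
```

Calling the function with other assumed means, for example 24 and 15 (the actual means), should return the same coefficient.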

4. Step Deviation Method

The Step Deviation Method further simplifies calculations by taking deviations from an assumed mean and dividing them by a common factor C.

Formula:

r=\frac{N\sum{dx^\prime dy^\prime}-\sum{dx^\prime}\cdot\sum{dy^\prime}}{\sqrt{N\sum{{dx^\prime}^2}-(\sum{dx^\prime})^2}\times\sqrt{N\sum{{dy^\prime}^2}-(\sum{dy^\prime})^2}}

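∑dx′ = Sum of step deviations of X values (deviations from the assumed mean divided by the common factor)
∑dy′ = Sum of step deviations of Y values (deviations from the assumed mean divided by the common factor)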
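Dividing by a positive common factor leaves r unchanged, since the correlation coefficient is unaffected by a positive change of scale. The sketch below, in plain Python, assumes the sample data from the example above and allows a separate common factor for each series; the function name, the assumed means (20 and 12) and the factors (4 and 3) are illustrative assumptions:

```python
import math

def pearson_step_deviation(xs, ys, assumed_x, assumed_y, factor_x, factor_y):
    """Pearson's r from step deviations (illustrative helper)."""
    n = len(xs)
    # Step deviations: deviation from the assumed mean, divided by a common factor
    dxp = [(v - assumed_x) / factor_x for v in xs]
    dyp = [(v - assumed_y) / factor_y for v in ys]
    sum_dx = sum(dxp)
    sum_dy = sum(dyp)
    sum_dxdy = sum(a * b for a, b in zip(dxp, dyp))
    sum_dx2 = sum(a * a for a in dxp)
    sum_dy2 = sum(b * b for b in dyp)
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = math.sqrt(n * sum_dx2 - sum_dx ** 2) * math.sqrt(n * sum_dy2 - sum_dy ** 2)
    return numerator / denominator

X = [12, 16, 20, 24, 28, 32, 36]
Y = [6, 9, 12, 15, 18, 21, 24]
r_step = pearson_step_deviation(X, Y, assumed_x=20, assumed_y=12, factor_x=4, factor_y=3)
print("r (step deviation method):", r_step)
```

With these choices both series reduce to the step deviations −2, −1, 0, 1, 2, 3, 4, which keeps the arithmetic small when working by hand.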

Python Implementation

Here we use NumPy for the numerical computations; np.array() stores the values of X and Y as arrays.
The code calculates the means, the deviations from the means and the standard deviations of both variables using NumPy functions, then applies the formula for r directly.
NumPy also provides np.corrcoef(), which computes the Pearson correlation coefficient between two variables and returns it as part of a correlation matrix. The value ranges from −1 to +1: negative values indicate negative correlation, 0 indicates no linear correlation and positive values indicate positive correlation.

Python

import numpy as np

# Sample data
X = np.array([12, 16, 20, 24, 28, 32, 36])
Y = np.array([6, 9, 12, 15, 18, 21, 24])

# Mean of X and Y
mean_x = np.mean(X)
mean_y = np.mean(Y)

# Deviations from mean
x = X - mean_x
y = Y - mean_y

# Standard deviations
sigma_x = np.sqrt(np.sum(x**2) / len(X))
sigma_y = np.sqrt(np.sum(y**2) / len(Y))

# Pearson correlation coefficient
r = np.sum(x * y) / (len(X) * sigma_x * sigma_y)

print("Pearson Correlation Coefficient:", r)

Output:

Pearson Correlation Coefficient: 1.0
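As a cross-check on the manual calculation, the built-in np.corrcoef() can be applied to the same arrays. This is a short sketch; the variable names corr_matrix and r_builtin are illustrative:

```python
import numpy as np

# Same sample data as in the manual calculation
X = np.array([12, 16, 20, 24, 28, 32, 36])
Y = np.array([6, 9, 12, 15, 18, 21, 24])

# np.corrcoef returns a 2x2 correlation matrix;
# the off-diagonal entry is the coefficient between X and Y
corr_matrix = np.corrcoef(X, Y)
r_builtin = corr_matrix[0, 1]

print("Pearson Correlation Coefficient (np.corrcoef):", r_builtin)
```

The printed value should agree with the manual result up to floating-point rounding.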
