What is the statistics correlation() method in Python?
Overview
The statistics module in Python's standard library provides functions for analyzing numerical data.
The statistics.correlation() method, available in Python 3.10 and later, returns Pearson’s correlation coefficient between two inputs. The coefficient measures the strength and direction of the linear relationship between them.
The Pearson’s correlation formula is:
$$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2 \, \sum_{i}(y_i - \bar{y})^2}}$$
Where:
$r$ = The correlation coefficient. It always lies between -1 (perfect negative correlation) and +1 (perfect positive correlation). A value of zero means that there is no linear correlation between the inputs.
$x_i$ = The values of the x dataset.
$\bar{x}$ = The mean value of the x dataset.
$y_i$ = The values of the y dataset.
$\bar{y}$ = The mean value of the y dataset.
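To make the formula concrete, here is a minimal sketch that computes the coefficient directly from its definition: the sum of the products of paired deviations from the means, divided by the square root of the product of the summed squared deviations. The helper name pearson_r and the small demo datasets are made up for illustration; this is not how the statistics module is implemented internally.

```python
from math import sqrt
from statistics import fmean


def pearson_r(x, y):
    """Pearson's r computed straight from the definition (illustrative helper)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length, with at least two points")
    x_bar = fmean(x)
    y_bar = fmean(y)
    # Numerator: sum of the products of paired deviations from the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator terms: summed squared deviations of each dataset.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    # Note: a constant dataset makes the denominator zero (ZeroDivisionError).
    return s_xy / sqrt(s_xx * s_yy)


# Made-up demo data: y_demo is exactly 2 * x_demo, a perfect linear relationship.
x_demo = [1, 2, 3, 4, 5]
y_demo = [2, 4, 6, 8, 10]

r_up = pearson_r(x_demo, y_demo)          # y rises with x
r_down = pearson_r(x_demo, y_demo[::-1])  # y falls as x rises
print(r_up, r_down)
```

Because the demo data is perfectly linear, the two results should sit at the extremes of the range, close to +1 and -1 respectively.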
Syntax
statistics.correlation(x, y, /)
Parameters
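The statistics.correlation() method takes two parameters, x and y, which hold the values for which the correlation coefficient is to be determined. The trailing / in the syntax means both are positional-only. The two inputs must have the same length, contain at least two values each, and neither may be constant. Otherwise, a StatisticsError is raised.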
Return value
The statistics.correlation() method returns the Pearson’s correlation coefficient for the two given inputs as a float between -1 and +1.
Example
Let’s use the statistics.correlation() method to determine the Pearson’s correlation coefficient for two inputs, x and y:
import statistics
x = [11, 2, 7, 4, 15, 6, 10, 8, 9, 1, 11, 5, 13, 6, 15]
y = [2, 5, 17, 6, 10, 8, 13, 4, 6, 9, 11, 2, 5, 4, 7]

# Compute the Pearson's correlation coefficient of x and y
pearsons_coefficient = statistics.correlation(x, y)
print("The Pearson's coefficient of the x and y inputs is:", pearsons_coefficient)
Explanation
- Line 1: We import the statistics module.
- Lines 2 and 3: We make two datasets, x and y.
- Line 6: We calculate the coefficient using the statistics.correlation(x, y) function and assign the result to a variable, pearsons_coefficient.
- Line 7: We print the result.