The var() function in pandas computes the variance of the values along a specified axis of a given DataFrame.
Mathematically, variance is defined as the measure of the spread between the values of a data set.
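For a sample of n values, it takes the formula below:
$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
Where:
$S^2$ is the sample variance.
$x_i$ is the i-th value in the data set.
$\bar{x}$ is the mean of the values.
$n$ is the number of values.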
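As a quick check of the sample variance formula, the sketch below computes it by hand for a small made-up list and compares the result with pandas. The values and variable names are illustrative only, and the comparison assumes the default ddof=1 (the n - 1 denominator).

```python
import pandas as pd

# Illustrative values, not taken from the article
values = [1, 1, 3]

n = len(values)
mean = sum(values) / n

# Sum of squared deviations from the mean, divided by n - 1
manual_variance = sum((x - mean) ** 2 for x in values) / (n - 1)

# pandas uses the same n - 1 denominator by default (ddof=1)
pandas_variance = pd.Series(values).var()

print(manual_variance, pandas_variance)
```

Both numbers should agree up to floating-point rounding.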
Equivalently, the variance of a dataset is the square of its standard deviation: variance = (standard deviation)². Put the other way around, the standard deviation is the square root of the variance.
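The relationship between variance and standard deviation can be checked directly in pandas: squaring the result of std() should reproduce var(). The DataFrame below is a made-up example, and the tolerance is an arbitrary choice to absorb floating-point rounding.

```python
import pandas as pd

# Illustrative DataFrame; the column names and values are arbitrary
df = pd.DataFrame({"A": [1, 1, 3], "B": [2, 7, 11]})

variance = df.var()
std_squared = df.std() ** 2

# Largest absolute difference between the two results across columns
difference = (variance - std_squared).abs().max()
print(difference < 1e-9)
```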
The var() function takes the following syntax (shown here as it appears in pandas 2.x):
DataFrame.var(axis=0, skipna=True, ddof=1, numeric_only=False, **kwargs)
The var() function takes the following optional parameter values:
axis: This represents the axis along which the variance is computed: the row axis (designated as 0 or 'index') or the column axis (designated as 1 or 'columns').
skipna: This takes a boolean value indicating whether NA or null values are to be excluded.
ddof: This takes an int that represents the delta degrees of freedom. The divisor used in the calculation is N - ddof, where N is the number of values. The default is 1.
numeric_only: This takes a boolean value indicating whether to include only float, int, or boolean columns.
**kwargs: These are additional keyword arguments that can be passed to the function.
The var() function returns a Series holding the variance of each column (or of each row, when axis is 1 or 'columns').
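To make the parameters concrete, here is a small sketch showing the effect of skipna, ddof, and numeric_only. The data, including the missing value and the text column named label, is invented for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data with one missing value and one non-numeric column
df = pd.DataFrame({
    "A": [1.0, 2.0, np.nan, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "label": ["w", "x", "y", "z"],
})

# numeric_only=True restricts the calculation to the numeric columns A and B
default_var = df.var(numeric_only=True)

# skipna=False propagates the missing value, so column A becomes NaN
strict_var = df.var(skipna=False, numeric_only=True)

# ddof=0 divides by N instead of N - 1 (population variance)
population_var = df.var(ddof=0, numeric_only=True)

# The result of DataFrame.var() here is a Series indexed by column name
print(type(default_var).__name__)
print(default_var)
print(strict_var)
print(population_var)
```

With the default ddof=1, column B is divided by 3; with ddof=0 it is divided by 4, so the second figure is smaller.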
# A code to illustrate the var() function in pandas

# Importing the pandas library
import pandas as pd

# Creating a DataFrame
df = pd.DataFrame([[1,2,3,4,5],
                   [1,7,5,9,0.5],
                   [3,11,13,14,12]],
                  columns=list('ABCDE'))

# Printing the DataFrame
print(df)

# Obtaining the variance vertically across rows (one value per column)
print(df.var())

# Obtaining the variance horizontally over columns (one value per row)
print(df.var(axis="columns"))
We import the pandas library.
We create a DataFrame, df.
We print df.
Using the var() function, we obtain the variance of the values that run downwards across the rows (axis 0). We print the result to the console.
Using the var() function, we obtain the variance of the values that run horizontally across the columns (axis 1). We print the result to the console.
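One way to see what the two axis settings do is to compare the index of each result. The sketch below reuses the DataFrame from the example; the variable names by_column and by_row are illustrative, and the transpose comparison is only a sanity check.

```python
import pandas as pd

# Same DataFrame as in the example above
df = pd.DataFrame([[1, 2, 3, 4, 5],
                   [1, 7, 5, 9, 0.5],
                   [3, 11, 13, 14, 12]],
                  columns=list('ABCDE'))

by_column = df.var()              # axis 0: one variance per column
by_row = df.var(axis="columns")   # axis 1: one variance per row

# The result is indexed by column labels in one case and row labels in the other
print(list(by_column.index))
print(list(by_row.index))

# Row-wise variance should match column-wise variance of the transposed frame
print((by_row - df.T.var()).abs().max() < 1e-9)
```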