Trusted answers to developer questions

How to obtain the variance over a specified axis in pandas

Onyejiaku Theophilus Chidalu

Overview

The var() function in pandas computes the variance of the values along a specified axis of a given DataFrame.

Mathematically, variance measures how far the values in a dataset are spread out from their mean.

It takes the formula below:

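$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Where:

  • $S^2$ = the sample variance
  • $x_i$ = the i-th value in the dataset
  • $\bar{x}$ = the mean of the values in the dataset
  • $n$ = the number of values in the dataset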
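
As a quick check of the formula, the sketch below computes the sample variance by hand and compares it with pandas. The data and the helper name sample_variance are illustrative assumptions, not part of the original example.

```python
import pandas as pd


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


values = [2, 7, 11]  # illustrative data

manual = sample_variance(values)
# pandas divides by n - 1 by default (ddof=1), matching the formula
from_pandas = pd.Series(values).var()

print(manual)
print(from_pandas)
```

The two printed values should agree up to floating-point rounding.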

Equivalently, the variance of a dataset is the square of its standard deviation. That is, the standard deviation is the square root of the variance.
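
The relationship between variance and standard deviation can be checked numerically. In this minimal sketch (the data is made up for illustration), squaring the result of std() gives the same value as var(), because both methods use the same default of ddof=1.

```python
import pandas as pd

s = pd.Series([4.0, 9.0, 14.0])  # illustrative data

variance = s.var()  # sample variance
std_dev = s.std()   # sample standard deviation

print(variance)
print(std_dev ** 2)  # matches the variance up to rounding
```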

Syntax

The var() function takes the following syntax:

DataFrame.var(axis=NoDefault.no_default, skipna=True, ddof=1, numeric_only=None, **kwargs)
Syntax for the var() function in Pandas

Parameter values

The var() function takes the following optional parameter values:

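  • axis: This specifies the axis along which the variance is computed. A value of 0 or 'index' gives one variance per column (computed down the rows), and a value of 1 or 'columns' gives one variance per row (computed across the columns).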
  • skipna: This takes a boolean value indicating whether NA or null values are to be excluded.
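  • ddof: This takes an int that represents the delta degrees of freedom. The divisor used in the calculation is N - ddof, where N is the number of values. The default is 1.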
  • numeric_only: This takes a boolean value indicating whether to include only float, int, or boolean columns.
  • **kwargs: These are additional keyword arguments that can be passed to the function.
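
The effect of each parameter is easier to see side by side. The sketch below uses a small made-up DataFrame (the column names and values are assumptions for illustration) containing a missing value and a non-numeric column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1.0, 2.0, np.nan, 4.0],
    "B": [10.0, 20.0, 30.0, 40.0],
    "label": ["w", "x", "y", "z"],  # non-numeric column
})

# numeric_only=True leaves the string column out of the calculation
default_var = df.var(numeric_only=True)

# ddof=0 divides by N instead of N - 1 (population variance)
population_var = df.var(ddof=0, numeric_only=True)

# skipna=False: a column that contains NaN yields NaN
strict_var = df.var(skipna=False, numeric_only=True)

# axis=1 computes one variance per row instead of per column
row_var = df.var(axis=1, numeric_only=True)

print(default_var)
print(population_var)
print(strict_var)
print(row_var)
```

Note that the row holding the missing value has only one usable number left, so its sample variance with ddof=1 is NaN.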

Return value

The var() function returns a Series holding the variance of each column (or of each row, when axis=1).
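
The return type can be checked directly. In this short sketch (the data is illustrative), calling var() on a DataFrame gives a Series indexed by the column labels, while calling it on a single column gives a single number.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1, 3], "B": [2, 7, 11]})  # illustrative data

result = df.var()
print(type(result))           # the container type of the result
print(result.index.tolist())  # the column labels

column_result = df["A"].var()
print(type(column_result))    # a single floating-point value
```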

Example

# Code illustrating the var() function in pandas

# Importing the pandas library
import pandas as pd

# Creating a DataFrame
df = pd.DataFrame([[1,2,3,4,5],
                   [1,7,5,9,0.5],
                   [3,11,13,14,12]],
                   columns=list('ABCDE'))
# Printing the DataFrame
print(df)

# Obtaining the variance of each column (computed down the rows, axis 0)
print(df.var())

# Obtaining the variance of each row (computed across the columns, axis 1)
print(df.var(axis="columns"))
Implementing the var() function

Explanation

  • Line 4: We import the pandas library.
  • Lines 7–10: We create a DataFrame, df.
  • Line 12: We print df.
  • Line 15: Using the var() function with its default axis (axis 0), we obtain the variance of each column, computed down the rows. We print the result to the console.
  • Line 18: Using the var() function with axis="columns" (axis 1), we obtain the variance of each row, computed across the columns. We print the result to the console.
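
One way to sanity-check these results is to compare them with NumPy. This is a minimal sketch that rebuilds the same DataFrame and, as an assumption of this check rather than part of the original example, uses np.var with ddof=1 so that both libraries divide by n - 1.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3, 4, 5],
                   [1, 7, 5, 9, 0.5],
                   [3, 11, 13, 14, 12]],
                  columns=list("ABCDE"))

col_var = df.var()                # one value per column
row_var = df.var(axis="columns")  # one value per row

# NumPy defaults to ddof=0, so ddof=1 is passed to match pandas
np_col = np.var(df.to_numpy(), axis=0, ddof=1)
np_row = np.var(df.to_numpy(), axis=1, ddof=1)

print(np.allclose(col_var.to_numpy(), np_col))
print(np.allclose(row_var.to_numpy(), np_row))
```

Both comparisons should print True, since the two libraries apply the same formula once ddof is aligned.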
