To use the DataFrame.sum() function, we first need to import the polars library:
    import polars as pl
After importing the library, let’s examine in detail how DataFrame.sum() works.
The DataFrame.sum() method
DataFrame.sum() is a method used to compute the sum of the values in each column of a DataFrame. It is called on a DataFrame and returns a new DataFrame with a single row that contains the total for each numeric column. The sum is calculated independently for each column, irrespective of the values in other columns.
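By default, the DataFrame.sum() method ignores missing values (null) during the computation. If a column contains null values, the total is computed from the remaining values. Note that Polars treats NaN as a regular floating-point value rather than a missing value, so a NaN in a column propagates to that column’s sum.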
Note: The DataFrame.sum() method computes totals for numeric columns. A string column is not summed, and the result contains null for it. Boolean columns are an exception among the non-numeric types: Polars sums them by counting their True values.
    import polars as pl

    # Create a DataFrame with mixed data types
    data = {'A': [1, 2, 3], 'B': [4, None, 6],
            'C': [7, 8, 9], 'D': ['foo', 'bar', 'baz']}
    df = pl.DataFrame(data)

    # Compute the total values for each column
    sum_values = df.sum()

    print(sum_values)
Line 1: We import the polars library as pl.
Lines 4–6: We create the DataFrame df, which contains a mix of numeric and non-numeric columns.
Line 9: We call df.sum(), which returns a single-row DataFrame with the sums of the numeric columns A, B (excluding the missing value None), and C. The non-numeric column D is not summed.
Line 11: We print the sum_values DataFrame, which contains the sums [6, 10, 24, null] for the corresponding columns. As D has non-numeric values, null is returned for it.