What is DataFrame.sum in Polars?

Polars is a fast and efficient data manipulation library written in Rust. It is designed for high-performance operations on large tabular datasets and is often faster than pandas for such workloads.

Before we can call the DataFrame.sum() method, we need to import the library.

Import the library

First, let’s import the polars library.

import polars as pl
Importing the polars library

After importing the library, let’s examine in detail how DataFrame.sum() works.

The DataFrame.sum() method

DataFrame.sum() is a method that computes the sum of the values in each column of a DataFrame. It is called on a DataFrame and returns a new DataFrame with a single row that contains the total for each column. The sum is calculated independently for each column, so the total of one column does not depend on the values in any other column.

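By default, the DataFrame.sum() method skips missing values (null) during the computation. If a column contains null entries, the total is computed from the remaining values. A floating-point NaN is not treated as missing in Polars: it is a regular float value, so a column that contains NaN sums to NaN.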

Note: The DataFrame.sum() method produces meaningful totals only for numeric columns. A non-numeric column, such as a string column, is not dropped from the result. It still appears in the output, with null as its value.

Code

import polars as pl

# Create a DataFrame with mixed data types
data = {'A': [1, 2, 3], 'B': [4, None, 6],
        'C': [7, 8, 9], 'D': ['foo', 'bar', 'baz']}
df = pl.DataFrame(data)

# Compute the total values for each column
sum_values = df.sum()

print(sum_values)

Explanation

Line 1: We import the polars library as pl.

Lines 4–6: We create the DataFrame df, which contains a mix of numeric and non-numeric columns.

Line 9: We call the df.sum() method, which returns a one-row DataFrame with the sums of the numeric columns A, B (skipping the missing value None), and C. The string column D cannot be summed, so its entry is null.

Line 11: We print the sum_values DataFrame, which contains the totals [6, 10, 24, null] for columns A, B, C, and D. Because D holds non-numeric values, null is returned for it.


Copyright ©2025 Educative, Inc. All rights reserved