What does the polars.sum_horizontal() function do?
Polars is a fast and efficient data manipulation library written in Rust.
It is designed for high-performance operations on large tabular datasets and is often faster than pandas for such workloads. One of its useful functions is pl.sum_horizontal(), which computes a row-wise sum by adding values horizontally across the specified columns.
This function is especially beneficial when you need to perform row-wise computations, such as summing up related features in a dataset, without manually iterating through rows.
The sum_horizontal() function
The sum_horizontal() function computes the sum of values across columns in a DataFrame horizontally.
Syntax
Below is the syntax of the sum_horizontal() function:
pl.sum_horizontal(*exprs)
Parameters
*exprs: The column(s) to be aggregated, given as one or more expressions. Strings are parsed as column names; other non-expression inputs are parsed as literals.
Return value
It returns an expression (pl.Expr) that, when evaluated in a context such as select() or with_columns(), yields the sum of the values in each row.
Code
To demonstrate the use of the sum_horizontal() function, consider the following example:
import polars as pl

# Creating a DataFrame
data = pl.DataFrame(
    {
        "alpha": [10, 20, 30, 40],
        "beta": [5.0, 15.0, 25.0, 30.0],
        "gamma": [2, 4, 6, 6],
    }
)

# Use sum_horizontal to compute the sum across columns for each row
result_sum = data.select(
    [pl.sum_horizontal(["alpha", "beta", "gamma"]).alias("sum_horizontal")]
)
# Display the result
print(result_sum)
Explanation
Lines 4–10: We create a DataFrame, data, with three columns (alpha, beta, and gamma) and four rows of numeric values.
Lines 13–15: We pass the pl.sum_horizontal() expression to data.select() to compute the sum across the three columns for each row. The resulting column is named "sum_horizontal" using alias("sum_horizontal"), and select() returns a new DataFrame containing only this column.
Line 17: We print the result_sum DataFrame.
Conclusion
The pl.sum_horizontal() function simplifies the process of creating composite metrics or overall scores from multiple data points. It is useful in contexts where data needs to be aggregated across multiple columns for each row, such as in financial analysis, survey scoring, quality control, fitness tracking, and environmental monitoring.