How to get the min and max values of an array in Polars

Polars is a fast DataFrame library implemented in Rust and designed for performance and ease of use. It provides an expression-based API for data manipulation and is particularly efficient on large datasets because it can execute operations in parallel.

The `max` and `min` functions in Polars

In Polars, the Expr.arr.max() and Expr.arr.min() functions compute the maximum and minimum values, respectively, of the subarrays within a column of a DataFrame. These functions are part of the expression API in Polars, which allows us to perform various operations on DataFrame columns.

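Syntax of the `max()` and `min()` functions

Both functions are called through the arr namespace of an expression and take no arguments: Expr.arr.max() and Expr.arr.min().

Parameters

Expr: It represents a Polars expression, typically a column in a DataFrame.
arr: It is the namespace that exposes operations on columns of the array type.
max(): It computes the maximum value of each subarray within the column of a DataFrame.
min(): It computes the minimum value of each subarray within the column of a DataFrame.

The Expr.arr.max() function returns the maximum value within each subarray of a column in a DataFrame, and the Expr.arr.min() function returns the minimum value within each subarray.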

These functions in Polars are essential for extracting key insights and performing aggregations within subarrays in a DataFrame’s column. By utilizing these functions, analysts and data scientists can efficiently compute the maximum and minimum values within each subarray, facilitating statistical analysis, feature engineering, and data cleaning tasks. These functions are particularly valuable in scenarios where data is organized as arrays, such as stock prices over time, measurements at different timestamps, or temperature readings at various locations.

Code examples

Let’s consider a simple example where we have a DataFrame with a column named a, and we want to find the maximum values from the subarrays given in column a.

Explanation

Let’s discuss the code above step by step:

Lines 3–6: We create a DataFrame df using the pl.DataFrame constructor. The DataFrame has one column named a, and the data for a is provided as a list of lists ([[34, 3], [23, 2]]). The schema is explicitly defined with pl.Array(inner=pl.Int64, width=2), which specifies that each value in column a is an array of two integers.
Line 7: We create a new DataFrame Max_val by selecting the a column from the original DataFrame (pl.col("a")) and then finding the maximum value within each array in that column using the arr.max() function.
Line 9: We print the DataFrame Max_val, which contains the maximum value for each array in the a column.

Finding minimum values in arrays

Now, we’ll take minimum values from the subarrays. We have a DataFrame with a column named a.

Finding the maximum values in arrays across multiple columns

We can also take the maximum values from the subarrays of several columns at once. Here, we have a DataFrame with two columns named a and b.

Explanation

Let’s discuss this example step by step:

Lines 2–6: We create a DataFrame df using the pl.DataFrame constructor. The DataFrame has two columns, a and b, and the data for both columns is provided as lists of lists ([[1, 2], [4, 3]] for a and [[34, 3], [23, 2]] for b). The schema is explicitly defined for both columns and specifies that each value in columns a and b is an array of two integers.
Line 8: We create a new DataFrame Max_val by selecting both a and b columns from the original DataFrame (pl.col("a", "b")) and then finding the maximum value within each subarray in these columns using the arr.max() function.
Line 9: We print the DataFrame Max_val, which contains the maximum value for each subarray in both a and b columns.

Finding the minimum values in arrays across multiple columns

Now, we’ll take the minimum values from an array. We have a DataFrame with two columns named a and b.

The code above is essentially the same as the one in which we found the maximum values from subarrays across multiple columns. Here, line 8 takes the minimum values from both columns a and b using the Expr.arr.min() function.

In conclusion, the Expr.arr.min() and Expr.arr.max() functions in Polars are essential tools for data analysis, allowing us to quickly obtain insights into the range of values in our dataset. They are particularly useful when working with large datasets where performance is crucial.
