What is the DataFrame.limit() function in Polars Python?
Polars is a high-performance DataFrame library implemented in Rust with Python bindings. It is designed for processing large datasets and is a common alternative to pandas. It runs operations in parallel across CPU cores, which keeps many workloads fast. It also supports several data sources, including CSV, Parquet, and Arrow, and is best suited to tabular data.
In this Answer, we’ll go through the limit() method of the Polars library.
The limit function
The limit function in the Polars library returns the first n rows of a DataFrame. It lets us retrieve only the rows we need instead of the whole DataFrame. If a negative value is passed to the function, it returns all rows except the last abs(n) rows, where abs(n) is the absolute value of n.
Syntax
Here’s the syntax for the limit function:
DataFrame.limit(n)
Here n refers to the number of rows we want to retrieve from the given DataFrame.
Code
Let’s start by importing the polars library and giving it the alias pl, which makes it easier to refer to in the code:
import polars as pl
Here’s a coding example of the limit function:
import polars as pl

df = pl.DataFrame({
    "key": [
        ["a", "b"],
        ["c", "d"],
        ["e", "f"],
        ["g", "h"],
        ["i", "j"],
        ["k", "l"],
        ["m", "n"],
        ["o", "p"],
        ["q", "r"]
    ],
    "val": [
        [1, 2],
        [3, 4],
        [5, 6],
        [7, 8],
        [9, 10],
        [11, 12],
        [13, 14],
        [15, 16],
        [17, 18]
    ]
})
print("- Display the original DataFrame\n", df)
print("- Display first five rows of DataFrame\n", df.limit(5))
print("- Display all rows of DataFrame except the last two rows\n", df.limit(-2))
Code explanation
Let’s discuss the above code in detail.
- Line 3: We create the DataFrame named df.
- Lines 3–26: We initialize the DataFrame with two columns: "key" and "val". The "key" column contains lists of strings, and the "val" column contains lists of integers.
- Line 27: We display the created DataFrame.
- Line 28: We call the limit function to display the first five records of the given DataFrame.
- Line 29: We call the limit function to display all the records of the given DataFrame, excluding the last two rows.
The limit() function in the Polars library is a simple way to work with only the first rows of a DataFrame. This is particularly useful with large datasets, where inspecting or passing along a small slice is often enough.