What is the polars.from_pandas method in Polars?
Polars is a DataFrame library written in Rust for manipulating tabular data. It is designed for high-performance operations on large datasets and is often faster than pandas on such workloads.
The polars.from_pandas() method converts pandas DataFrames into Polars DataFrames. The conversion itself does not speed anything up, but once the data is in Polars, subsequent operations can benefit from parallel execution and lower memory usage. This helps in data-intensive workflows such as machine learning, ETL pipelines, and real-time analytics. It also eases the transition from pandas to Polars, because an existing pandas-based project can hand off its performance-critical or memory-constrained steps to Polars one at a time.
Import the library
First, let’s import the polars library.
import polars as pl
After importing the library, let’s examine in detail how polars.from_pandas() works.
The polars.from_pandas() method
The polars.from_pandas() method converts a pandas DataFrame or Series into a Polars DataFrame or Series, respectively. It requires both pandas and pyarrow to be installed.
Syntax
polars.from_pandas(data, *, schema_overrides=None, rechunk=True, nan_to_null=True, include_index=False)
Parameters
The parameters are described below. All parameters other than data are keyword-only.
data: The pandas DataFrame, Series, or Index to convert.
schema_overrides: This is an optional parameter. If provided, it is a dictionary that maps column names to Polars data types and overrides the inferred type for those columns. If not provided (the default is None), Polars infers the types from the input pandas data.
rechunk: This is an optional parameter, and its default value is True. If set to True, Polars makes sure that the resulting data is stored in contiguous memory.
nan_to_null: This is an optional parameter, and its default value is True. If set to True, NaN values in the input pandas data are converted to null values in the resulting Polars object. If set to False, NaN values are preserved as NaN.
include_index: This is an optional parameter, and its default value is False. If set to True, the index of the input pandas DataFrame or Series is loaded as one or more columns in the resulting Polars DataFrame. A default, unnamed index that only enumerates the rows is not included. If set to False, the index is dropped.
Return value
The method returns a Polars DataFrame if data is a pandas DataFrame.
The method returns a Polars Series if data is a pandas Series or Index.
Code
import pandas as pd
import polars as pl

# Creating a pandas DataFrame
pd_df = pd.DataFrame([[1, 2, 3], [0, 1, 2]], columns=["A", "B", "C"])

# Printing the pandas DataFrame
print(pd_df)

# Converting pandas DataFrame to a Polars DataFrame
df = pl.from_pandas(pd_df)

# Printing the Polars DataFrame
print(df)
Explanation
Lines 1–2: We import the pandas and polars libraries as pd and pl, respectively.
Line 5: We create the pandas DataFrame named pd_df. The DataFrame is initialized with a 2 x 3 matrix (2 rows, 3 columns) containing numeric values.
Line 8: We print the pandas DataFrame.
Line 11: We use the from_pandas() method to convert the previously created pandas DataFrame (pd_df) to a Polars DataFrame (df).
Line 14: We print the Polars DataFrame.