What is the DataFrame.update function in Polars Python?
Polars is a DataFrame library written in Rust with Python bindings, designed for high-performance processing of large datasets. Its parallel execution and support for various data sources make it a strong choice for managing tabular data, often with better performance than pandas.
Here, we’ll discuss the update() function of the Polars library.
The update() function
The update() function helps us to merge or align two DataFrames, updating values in the target DataFrame with non-null values from the source DataFrame. This function facilitates the integration of new data into an existing DataFrame, allowing for flexible updating strategies, such as inner, outer, and explicit column-based joins.
Syntax
Here’s the syntax for the update() function:
DataFrame.update(
    other: DataFrame,
    on: str | Sequence[str] | None = None,
    how: Literal['left', 'inner', 'outer'] = 'left',
    *,
    left_on: str | Sequence[str] | None = None,
    right_on: str | Sequence[str] | None = None,
    include_nulls: bool = False,
) → DataFrame
In the syntax above:
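other: The DataFrame whose values are used to update the original one.
on: The column name(s) to join on. If it is None (the default), rows are matched by their position in each DataFrame.
how: The merging approach. left retains all rows of the left DataFrame, inner keeps only rows whose keys exist in both DataFrames, and outer updates existing matches while also adding new rows from the right DataFrame. Recent Polars releases name this last strategy full.
left_on: The join column(s) of the left DataFrame.
right_on: The join column(s) of the right DataFrame.
include_nulls: If True, null values in the right DataFrame also overwrite values in the left DataFrame. By default, they are ignored.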
Code example
Let’s discuss a coding example to better understand how this function works:
import polars as pl

df = pl.DataFrame(
    {
        "EmployeeName": ["John", "Smith", "David", "Ronaldo"],
        "Age": [24, 32, 19, 26],
        "Salary": [100, 300, 250, 320]
    }
)

new_df = pl.DataFrame(
    {
        "Salary": [140, 330, None],
        "LastName": ["Dan", "Rohn", "Michel"],
    }
)

# Simply update DataFrame
print(df.update(new_df))

# Update DataFrame by keeping those rows that are common
print(df.update(new_df, how="inner"))

# Update DataFrame containing all rows in both DataFrames
print(df.update(new_df, how="outer"))

# Explicitly joining columns in each Dataframe, including null values
print(df.update(new_df, left_on="EmployeeName", right_on="LastName", how="outer", include_nulls=True))
Code explanation
Let’s discuss the above code in detail.
Line 1: We import the polars library as pl.
Lines 3–9: We create a DataFrame named df containing the EmployeeName, Age, and Salary columns.
Lines 11–16: We create another DataFrame, new_df, to update the previous one.
Line 19: We call the update() function to update the values of df with the non-null values of new_df. Since no join column is given, rows are matched by position.
Line 22: We keep only the rows that are common to both DataFrames by passing the how="inner" argument to the function.
Line 25: We keep all rows from both DataFrames by passing the how="outer" argument to the function.
Line 28: We explicitly join the EmployeeName column of df with the LastName column of new_df using how="outer", and pass the include_nulls=True argument so that null values in new_df also overwrite values in df.