What is the DataFrame.melt() method in Polars?
The DataFrame.melt() method
In Polars, the melt operation transforms a DataFrame from a wide format to a long format by reshaping columns into rows.
It reorganizes tabular data by converting columns that represent measured variables into rows. The result keeps the identifier columns and adds two new columns: one holding the original column names and another holding the corresponding values.
Syntax
The syntax of the DataFrame.melt() method is as follows:
DataFrame.melt(
    id_vars: ColumnNameOrSelector | Sequence[ColumnNameOrSelector] | None = None,
    value_vars: ColumnNameOrSelector | Sequence[ColumnNameOrSelector] | None = None,
    variable_name: str | None = None,
    value_name: str | None = None,
)
Explanation of parameters
Line 2: The id_vars parameter specifies the column(s) or selector(s) to use as identifier variables. If it is omitted, no identifier columns are kept in the result.
Line 3: The value_vars parameter defines the column(s) or selector(s) to use as value variables. If it is omitted, all columns not mentioned in id_vars are melted.
Line 4: The variable_name parameter assigns a name to the variable column. The default is variable.
Line 5: The value_name parameter assigns a name to the value column. The default is value.
Code example
Now let’s take a look at the coding example to understand the DataFrame.melt() method:
import polars as pl
import polars.selectors as cs

# Create a sample DataFrame
df = pl.DataFrame({
    "a": ["aa", "bb", "cc"],
    "b": [2, 4, 6],
    "c": [3, 6, 9],
    "d": [4, 8, 12]
})

# Melt the DataFrame
melted_df = df.melt(id_vars="a", value_vars=cs.numeric())
# Melting some columns
melted_some = df.melt(id_vars="b", value_vars=("c", "d"))

# Display the result
print(melted_df)
print("melted some values are: ", melted_some)
Code explanation
Let’s take a look at the above code step-by-step:
Lines 1–2: We import the Polars library and its selectors module. The pl alias is commonly used for Polars, and cs is used for selectors.
Lines 5–10: We create a DataFrame named df with columns a, b, c, and d, each containing sample data.
Line 13: We apply the melt() method to the DataFrame df with the following arguments:
    id_vars="a": The a column is set as the identifier variable.
    value_vars=cs.numeric(): All numeric columns (b, c, and d) are melted.
Line 15: We apply the melt() method to the DataFrame df again, this time with the following arguments:
    id_vars="b": The b column is set as the identifier variable, so its values are retained as is in the resulting melted DataFrame.
    value_vars=("c", "d"): The columns c and d are melted. Their values are unpivoted into the value column, and a new variable column holds the original column names (c and d).
Line 18: We print the melted DataFrame for all numeric columns.
Line 19: We print the melted DataFrame for the specific columns c and d, with b as the identifier variable.
Output
The output of the above code example shows a new DataFrame returned by the DataFrame.melt() method containing the melted data.
Wrap up
The DataFrame.melt() method in Polars is particularly useful when data needs to be in long format, which many downstream analysis and visualization steps expect. The method provides flexibility through the id_vars and value_vars parameters, as well as customizable names for the resulting variable and value columns, which makes it a practical tool for reshaping data across a wide range of analysis workflows.