Ordinal encoding is a technique used to convert categorical variables into numerical values by assigning an integer to each category.
Consider a scenario where the categorical variable represents colors such as red, green, and blue. These categories can be mapped to numerical values like 1, 2, and 3 using ordinal encoding.
| Colors | Encoded colors |
| Red | 1 |
| Green | 2 |
| Blue | 3 |
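The mapping in the table can be sketched in plain Python before bringing in any library. The dictionary name `color_to_code` and the sample list are assumptions made purely for illustration.

```python
# Hypothetical mapping that mirrors the table above
color_to_code = {"Red": 1, "Green": 2, "Blue": 3}

# Sample values to encode (made up for illustration)
colors = ["Green", "Blue", "Red", "Green"]

# Replace each category with its integer code
encoded = [color_to_code[color] for color in colors]

print(encoded)  # [2, 3, 1, 2]
```

A library encoder automates the same idea: it learns the category-to-integer mapping from the data instead of requiring it to be written by hand.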
To execute ordinal encoding in Python, the following steps are typically followed.
The first step is to install the scikit-learn library, which provides the OrdinalEncoder class:
pip install -U scikit-learn
The -U flag upgrades the package to the latest available version.
The next step is to import the required libraries.
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder
In this step, we create a simple DataFrame, as shown below. Alternatively, we can load our own dataset.
colors = {'Colors': ['Red', 'Green', 'Blue']}
df = pd.DataFrame(colors)
We then initialize an instance of the OrdinalEncoder class and store it in the encoder variable as follows:
encoder = OrdinalEncoder()
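In this step, we pass the Colors column to the fit_transform method to perform ordinal encoding. By default, scikit-learn sorts the categories and numbers them from 0, so Blue, Green, and Red are encoded as 0.0, 1.0, and 2.0 rather than the illustrative 1, 2, and 3 used earlier: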
df['Colors_Encoded'] = encoder.fit_transform(df[['Colors']])
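As a rough sketch of what the fitted encoder has learned, the `categories_` attribute lists the categories of each column in the order of their codes. If a specific order matters, the `categories` parameter accepts it explicitly. The order used below (Red, Green, Blue) is only an assumption chosen to mirror the earlier table, and scikit-learn still numbers from 0 rather than 1.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

df = pd.DataFrame({'Colors': ['Red', 'Green', 'Blue']})

# Default behavior: categories are sorted, then numbered from 0
default_encoder = OrdinalEncoder()
default_codes = default_encoder.fit_transform(df[['Colors']])
print(default_encoder.categories_)  # category order learned for each column
print(default_codes.ravel())        # Red -> 2.0, Green -> 1.0, Blue -> 0.0

# Explicit order (an assumption for illustration): Red, Green, Blue
ordered_encoder = OrdinalEncoder(categories=[['Red', 'Green', 'Blue']])
ordered_codes = ordered_encoder.fit_transform(df[['Colors']])
print(ordered_codes.ravel())        # Red -> 0.0, Green -> 1.0, Blue -> 2.0
```

Supplying the order explicitly is mainly useful when the categories have a genuine ranking, since the integer codes imply one.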
Note: The OrdinalEncoder class can encode multiple columns simultaneously.
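A minimal sketch of encoding two columns in one call might look like the following. The Sizes column and its values are hypothetical and added only to demonstrate the idea.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical data with two categorical columns
df = pd.DataFrame({
    'Colors': ['Red', 'Green', 'Blue'],
    'Sizes': ['Small', 'Large', 'Medium'],
})

encoder = OrdinalEncoder()

# Each column passed in is encoded independently of the others
encoded = encoder.fit_transform(df[['Colors', 'Sizes']])

# The result is a 2D array with one output column per input column
df['Colors_Encoded'] = encoded[:, 0]
df['Sizes_Encoded'] = encoded[:, 1]

print(df)
```

Each column gets its own category list, so the code 0.0 in Colors_Encoded and the code 0.0 in Sizes_Encoded refer to unrelated categories.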
The following code shows how we can use the OrdinalEncoder class in Python:
# Import necessary libraries
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Create a sample DataFrame
colors = {'Colors': ['Red', 'Green', 'Blue']}
df = pd.DataFrame(colors)

# Print the original DataFrame
print("Original DataFrame Before Ordinal Encoding:")
print(df)

# Initialize the OrdinalEncoder
encoder = OrdinalEncoder()

# Fit and transform the 'Colors' column using ordinal encoding
df['Colors_Encoded'] = encoder.fit_transform(df[['Colors']])

# Display the DataFrame with the encoded column
print("\nDataFrame after Ordinal Encoding:")
print(df)
Lines 2–3: We import the required libraries: pandas for data manipulation and the OrdinalEncoder class from the scikit-learn library for ordinal encoding.
Lines 6–7: We create a sample DataFrame (df) with a categorical column named Colors.
Line 14: We initialize the OrdinalEncoder class.
Line 17: We fit and transform the Colors column using ordinal encoding. The transformed values are stored in a new column named Colors_Encoded.
Lines 20–21: We display the DataFrame after applying ordinal encoding to observe the changes.
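To double-check the result, the codes can be mapped back to the original labels. The sketch below assumes the same sample data as above and uses the encoder's `inverse_transform` method. The variable names `encoded` and `decoded` are illustrative.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

df = pd.DataFrame({'Colors': ['Red', 'Green', 'Blue']})

encoder = OrdinalEncoder()
encoded = encoder.fit_transform(df[['Colors']])

# Map the numeric codes back to the original category labels
decoded = encoder.inverse_transform(encoded)

print(decoded.ravel())
```

If the decoded labels match the original column, the encoding round-trips without losing information.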