Ordinal encoding in Python
Ordinal encoding is a technique used to convert categorical data into numerical values.
Consider a scenario where the categorical variable represents colors such as red, green, and blue. These categories can be mapped to numerical values like 1, 2, and 3 using ordinal encoding.
| Colors | Encoded colors |
| Red | 1 |
| Green | 2 |
| Blue | 3 |
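To make the mapping concrete, the table above can be sketched in plain Python with a dictionary. The names `color_to_code` and `encoded` are illustrative only, and this is a hand-written mapping rather than the output of any library encoder.

```python
# Hand-written mapping that mirrors the table above (illustrative names).
color_to_code = {"Red": 1, "Green": 2, "Blue": 3}

colors = ["Red", "Green", "Blue", "Green"]

# Replace each category with its integer code.
encoded = [color_to_code[color] for color in colors]
print(encoded)  # [1, 2, 3, 2]
```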
Steps
To execute ordinal encoding in Python, the following steps are typically followed.
1. Installation
The first step is to install the scikit-learn library, which provides the OrdinalEncoder class:
pip install -U scikit-learn
The -U flag is used to upgrade a package to the latest version available.
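A quick way to confirm that the installation succeeded is to import the library and print its version. This is only a minimal check, and the version string printed depends on the environment.

```python
import sklearn

# Print the installed scikit-learn version to confirm the import works.
installed_version = sklearn.__version__
print(installed_version)
```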
2. Importing the libraries
The next step is to import the required libraries.
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder
3. Creating a simple DataFrame
In this step, we create a simple DataFrame, as shown below. Alternatively, we can load an existing dataset.
colors = {'Colors': ['Red', 'Green', 'Blue']}
df = pd.DataFrame(colors)
4. Initializing the OrdinalEncoder class
We then initialize an instance of the OrdinalEncoder class and store it in the encoder variable as follows:
encoder = OrdinalEncoder()
5. Transforming the categorical data
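In this step, we pass the Colors column to the fit_transform method to perform ordinal encoding, as shown below. The double brackets select the column as a one-column DataFrame, because the encoder expects two-dimensional input: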
df['Colors_Encoded'] = encoder.fit_transform(df[['Colors']])
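It can help to inspect what the encoder actually learned. The sketch below assumes the default settings of OrdinalEncoder, under which the categories are sorted and numbered from 0, so the resulting codes differ from the 1, 2, 3 used in the introductory table. The `categories_` attribute and the `inverse_transform` method belong to scikit-learn; the variable names `encoded`, `learned`, and `restored` are illustrative.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

df = pd.DataFrame({'Colors': ['Red', 'Green', 'Blue']})

encoder = OrdinalEncoder()
encoded = encoder.fit_transform(df[['Colors']])

# With default settings, categories are sorted: Blue -> 0, Green -> 1, Red -> 2.
learned = encoder.categories_[0].tolist()
print(learned)                   # ['Blue', 'Green', 'Red']
print(encoded.ravel().tolist())  # [2.0, 1.0, 0.0]

# inverse_transform maps the codes back to the original labels.
restored = encoder.inverse_transform(encoded).ravel().tolist()
print(restored)                  # ['Red', 'Green', 'Blue']
```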
Note: The OrdinalEncoder class can encode multiple columns simultaneously.
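As a rough illustration of encoding several columns at once, the sketch below adds a hypothetical Sizes column that is not part of the original example. Each column is encoded independently, again using the default sorted order.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical two-column dataset used only for illustration.
df = pd.DataFrame({
    'Colors': ['Red', 'Green', 'Blue'],
    'Sizes': ['Small', 'Large', 'Medium'],
})

encoder = OrdinalEncoder()

# Passing both columns encodes each one independently.
encoded = encoder.fit_transform(df[['Colors', 'Sizes']])

# The result has one output column per input column.
df['Colors_Encoded'] = encoded[:, 0]
df['Sizes_Encoded'] = encoded[:, 1]
print(df)
```

Note that the default order is alphabetical (Large, Medium, Small), which does not reflect the natural ordering of sizes.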
Example
The following code shows how we can use the OrdinalEncoder class in Python:
# Import necessary libraries
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Create a sample DataFrame
colors = {'Colors': ['Red', 'Green', 'Blue']}
df = pd.DataFrame(colors)

# Print the original DataFrame
print("Original DataFrame Before Ordinal Encoding:")
print(df)

# Initialize the OrdinalEncoder
encoder = OrdinalEncoder()

# Fit and transform the 'Colors' column using ordinal encoding
df['Colors_Encoded'] = encoder.fit_transform(df[['Colors']])

# Display the DataFrame with the encoded column
print("\nDataFrame after Ordinal Encoding:")
print(df)
Explanation
Lines 2–3: We import the required libraries, including pandas for data manipulation and the OrdinalEncoder class from the scikit-learn library for ordinal encoding.
Lines 6–7: We create a sample DataFrame (df) with a categorical column named Colors.
Line 14: We initialize the OrdinalEncoder class.
Line 17: We fit and transform the Colors column using ordinal encoding. The transformed values are stored in a new column named Colors_Encoded.
Lines 20–21: We display the DataFrame after applying ordinal encoding to observe the changes.
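When a specific order matters, the `categories` parameter of OrdinalEncoder accepts one list of labels per encoded column. The sketch below assumes we want the order Red, Green, Blue instead of the default sorted order. The codes still start at 0, and the variable name `ordered_codes` is illustrative.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

df = pd.DataFrame({'Colors': ['Red', 'Green', 'Blue']})

# Explicit order: Red -> 0, Green -> 1, Blue -> 2 (one list per encoded column).
encoder = OrdinalEncoder(categories=[['Red', 'Green', 'Blue']])
ordered_codes = encoder.fit_transform(df[['Colors']]).ravel().tolist()
print(ordered_codes)  # [0.0, 1.0, 2.0]
```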