Data Science in 5 Minutes: What is One Hot Encoding?

Learn what one-hot encoding is, when to use it, how it compares to other encoding techniques, and how to implement it with Pandas and Scikit-learn to prepare categorical data for machine learning models.

15 mins read
Jun 01, 2026

If you’re in the field of data science, you’ve probably heard the term “one hot encoding”. Even the Sklearn documentation tells you to “encode categorical integer features using a one-hot scheme”. But, what is one hot encoding, and why do we use it?

Most machine learning tutorials and tools require you to prepare data before it can be fit to a particular ML model. One hot encoding is a process of converting categorical data variables so they can be provided to machine learning algorithms to improve predictions. One hot encoding is a crucial part of feature engineering for machine learning.

In this guide, we will introduce you to one hot encoding and show you when to use it in your ML models. We’ll provide some real-world examples with Sklearn and Pandas.


Start mastering feature engineering for ML with our hands-on course today.

Feature Engineering for Machine Learning

Feature engineering is a crucial stage in any machine learning project. It allows you to use data to define features that enable machine learning algorithms to work properly. In this course, you will learn the techniques that will help you create new features from existing features. You’ll start by diving into label encoding which is crucial for converting categorical features into numerical. You’ll also learn about other various types of encoding such as: one-hot, count, and mean, all of which are important for feature engineering. In the remaining chapters, you’ll learn about feature interaction and datetime features. In all, this course will show you the many different ways you can create features from existing ones.

30mins
Advanced
10 Playgrounds
1 Quiz

What is one hot encoding?#

Categorical data refers to variables that are made up of label values, for example, a “color” variable could have the values “red,” “blue,” and “green.” Think of values like different categories that sometimes have a natural ordering to them.

Some machine learning algorithms, such as decision trees, can work directly with categorical data depending on the implementation, but most require all input and output variables to be numeric. This means that categorical data must be converted to numbers before training.

One hot encoding is one method of converting categorical data to prepare it for an algorithm and get better predictions. With one-hot, we convert each categorical value into a new column and assign a binary value of 1 or 0 to each of those columns. Each category is represented as a binary vector in which every entry is 0 except the one at the category's own index, which is 1.

Take a look at this chart for a better understanding:

[Chart: one hot encoding example]

Let’s apply this to an example. Say we have the values red and blue. We first assign each value an integer index: 0 for red and 1 for blue.

It’s crucial to be consistent when we use these values. This makes it possible to invert our encoding at a later point to get our original categorical values back.

Once we assign numeric values, we create a binary vector that represents our numerical values. In this case, our vector will have 2 as its length since we have 2 values. Thus, the red value can be represented with the binary vector [1,0], and the blue value will be represented as [0,1].
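As a rough illustration of the red/blue example, the sketch below builds the integer mapping, turns each value into a binary vector, and inverts the encoding again. The helper names `one_hot` and `invert` are hypothetical and exist only for this example.

```python
# Fixed category order: keeping it consistent is what makes inversion possible.
categories = ["red", "blue"]
to_int = {cat: i for i, cat in enumerate(categories)}  # red -> 0, blue -> 1
to_cat = {i: cat for cat, i in to_int.items()}         # inverse mapping


def one_hot(value):
    """Return a binary vector with a 1 at the category's index (hypothetical helper)."""
    vec = [0] * len(categories)
    vec[to_int[value]] = 1
    return vec


def invert(vec):
    """Recover the original category from its one-hot vector (hypothetical helper)."""
    return to_cat[vec.index(1)]


red_vec = one_hot("red")
blue_vec = one_hot("blue")
print(red_vec, blue_vec)
print(invert(blue_vec))
```

Because the mapping is stored once and reused, the same dictionary can decode any vector it produced.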


Why use one hot encoding?#

One hot encoding is useful for categories that have no inherent order or ranking among them. Many machine learning algorithms treat the order of numbers as meaningful. In other words, they will read a higher number as greater or more important than a lower number.

While this is helpful for some ordinal situations, some input data does not have any ranking for category values, and this can lead to issues with predictions and poor performance. That’s when one hot encoding saves the day.
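To make the ranking problem concrete, here is a small sketch that compares distances between three unordered colors under integer codes and under one hot vectors. The specific integer assignment is an arbitrary assumption made for the example.

```python
import numpy as np

# Arbitrary integer codes for three unordered colors (assumed for illustration).
int_codes = {"red": 0.0, "blue": 1.0, "green": 2.0}

# One hot vectors for the same three colors.
one_hot = {
    "red": np.array([1.0, 0.0, 0.0]),
    "blue": np.array([0.0, 1.0, 0.0]),
    "green": np.array([0.0, 0.0, 1.0]),
}

# With integer codes, red looks twice as far from green as from blue.
int_red_blue = abs(int_codes["red"] - int_codes["blue"])
int_red_green = abs(int_codes["red"] - int_codes["green"])

# With one hot vectors, every pair of colors is equally far apart.
oh_red_blue = np.linalg.norm(one_hot["red"] - one_hot["blue"])
oh_red_green = np.linalg.norm(one_hot["red"] - one_hot["green"])

print("integer codes:", int_red_blue, int_red_green)
print("one hot:", oh_red_blue, oh_red_green)
```

Distance-based and linear models are the ones most likely to pick up on the artificial ordering in the integer codes.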

One hot encoding makes our training data more useful and expressive, and because every encoded column already lies on a 0-to-1 scale, it fits easily alongside rescaled numeric features. By using numeric values, we can more easily determine a probability for each category. In particular, one hot encoding is often used for output values in classification, since a model can then predict a probability for each class rather than a single label.

[Figure: decision tree for choosing an encoding method]

How to read this decision tree#

  • Use one-hot encoding when your categories are nominal, unordered, and have a small number of unique values, such as red, blue, and green.

  • Use ordinal encoding when the categories have a meaningful order, such as low, medium, and high.

  • Use label encoding carefully. It can work well with tree-based models, but for unordered categories, it may accidentally imply an order that does not exist.

  • Use target encoding when you have high-cardinality features, such as ZIP codes, product IDs, or user segments. It can reduce dimensionality, but you should apply it carefully to avoid data leakage.

The encoding choice is a feature engineering decision, and it can directly affect model accuracy.
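The first two rules can be sketched with Pandas as follows. This is a minimal example on made-up data: an ordered feature is mapped to integer codes that respect its order, while an unordered feature is one hot encoded.

```python
import pandas as pd

# Ordered categories: encode with integers that follow the real order.
sizes = pd.Series(["low", "high", "medium", "low"])
order = ["low", "medium", "high"]
ordinal_codes = pd.Categorical(sizes, categories=order, ordered=True).codes

# Unordered categories: one hot encode so that no order is implied.
colors = pd.Series(["red", "blue", "green", "red"])
color_dummies = pd.get_dummies(colors, dtype=int)

print(list(ordinal_codes))
print(color_dummies)
```

Passing the category order explicitly matters here, since the default alphabetical order would place "high" before "low".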

Handling high-cardinality categorical features#

One-hot encoding works well when a feature has only a few categories. But in real-world machine learning systems, you’ll often encounter features with hundreds or even thousands of unique values. This is called high cardinality, and it can create serious performance and scalability problems if you’re not careful.

What does “high cardinality” mean?#

A categorical feature has high cardinality when it contains many unique categories. There is no fixed threshold, but more than roughly 50 to 100 categories is a common rule of thumb.

Common examples include:

  • Product IDs in e-commerce systems

  • City or ZIP code features

  • User IDs

  • Search queries or keywords

For example:

Feature: City
Unique values: 1,000 cities
One-hot encoding result:
→ 1,000 binary columns

That’s where problems start.

Why one-hot encoding becomes problematic#

With high-cardinality features, one-hot encoding can quickly become inefficient.

Here’s why:

  • It creates too many columns

  • Memory usage increases significantly

  • Most values become 0, creating sparse matrices

  • Training becomes slower

  • Some models struggle with extremely wide datasets

  • It can reduce generalization and hurt performance

For small datasets, this might be manageable. For production-scale ML systems, it often isn’t.

Small vs large category example#

A small categorical feature works well with one-hot encoding:

# 3 categories
Color = ["Red", "Blue", "Green"]
# One-hot encoded result
Red Blue Green
1 0 0
0 1 0
0 0 1

Now imagine a feature with 1,000 product IDs:

# 1,000 unique products
Product_ID = ["P1", "P2", ..., "P1000"]
# One-hot encoding would create:
Product_P1
Product_P2
...
Product_P1000

That means 1,000 separate columns for just one feature.

Better alternatives for high-cardinality features#

Instead of blindly applying one-hot encoding, you can use more scalable encoding techniques.

Target encoding#

Target encoding replaces each category with the average target value associated with it.

Example:

  • City A → average purchase value = 120

  • City B → average purchase value = 85

This works especially well for:

  • Tree-based models

  • Large tabular datasets

  • Kaggle-style ML problems

Warning: If done incorrectly, target encoding can cause data leakage because it uses information from the target variable.
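One common way to reduce this leakage is out-of-fold target encoding, where each row is encoded using category means computed from other rows only. The sketch below is one possible approach on made-up data, not the only way to do it. The column names and fold count are assumptions chosen for the example.

```python
import pandas as pd
from sklearn.model_selection import KFold

# Made-up data: "city" is the categorical feature, "purchase" is the target.
df = pd.DataFrame({
    "city": ["A", "A", "B", "B", "A", "B", "A", "B"],
    "purchase": [100, 140, 80, 90, 120, 85, 130, 75],
})

global_mean = df["purchase"].mean()  # fallback for categories unseen in a fold
df["city_te"] = float("nan")

kf = KFold(n_splits=4, shuffle=True, random_state=0)
for fit_idx, enc_idx in kf.split(df):
    # Category means come only from the rows outside the fold being encoded.
    fold_means = df.iloc[fit_idx].groupby("city")["purchase"].mean()
    encoded = df.iloc[enc_idx]["city"].map(fold_means).fillna(global_mean)
    df.loc[df.index[enc_idx], "city_te"] = encoded

print(df)
```

Each row's encoded value never includes its own target, which is the property that plain per-category averaging lacks.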

Frequency/count encoding#

This technique replaces categories with how often they appear.

Example:

  • “New York” → 12,500

  • “Chicago” → 8,200

Useful when:

  • Category frequency itself carries meaning

  • You want a simple and lightweight solution
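A minimal sketch of count and frequency encoding with Pandas, using a small made-up list of cities:

```python
import pandas as pd

cities = pd.Series(["New York", "Chicago", "New York", "Boston", "New York", "Chicago"])

# Count encoding: replace each category with the number of times it appears.
count_encoded = cities.map(cities.value_counts())

# Frequency encoding: the same idea, expressed as a share of all rows.
freq_encoded = cities.map(cities.value_counts(normalize=True))

print(count_encoded.tolist())
print(freq_encoded.tolist())
```

In a real workflow the counts would be computed on the training data and then reused for new data.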

Hash encoding#

Hash encoding maps categories into a fixed number of columns using a hash function.

Benefits:

  • Controls feature size

  • Works well for very large datasets

  • Useful in streaming systems and NLP pipelines

Trade-off:

  • Different categories can occasionally collide into the same bucket
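The idea can be sketched by hand with Python's hashlib. The `hash_encode` helper below is hypothetical and simplified. Library implementations such as scikit-learn's FeatureHasher differ in their details.

```python
import hashlib

import numpy as np


def hash_encode(values, n_buckets=8):
    """Map each string to one of n_buckets columns (hypothetical helper)."""
    out = np.zeros((len(values), n_buckets), dtype=np.int8)
    for row, value in enumerate(values):
        # md5 is used only because it is stable across runs, unlike Python's hash().
        digest = hashlib.md5(value.encode("utf-8")).hexdigest()
        out[row, int(digest, 16) % n_buckets] = 1
    return out


products = ["P101", "P102", "P103", "P101"]
hashed = hash_encode(products, n_buckets=8)
print(hashed)
```

The output always has 8 columns no matter how many distinct product IDs exist, and a collision shows up as two different IDs sharing a column.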

Embedding layers#

Deep learning models often use embeddings instead of one-hot encoding.

Instead of creating thousands of sparse columns, embeddings learn dense numerical representations for categories.

Common use cases:

  • Recommendation systems

  • NLP models

  • Large-scale deep learning pipelines

This is how systems like YouTube, Netflix, and modern language models handle massive categorical spaces efficiently.
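Conceptually, an embedding is a lookup table with one dense row per category. The NumPy sketch below uses random values as a stand-in for learned weights (an assumption for illustration) and shows that the lookup gives the same result as multiplying a one hot vector by the table.

```python
import numpy as np

rng = np.random.default_rng(0)
n_categories, embedding_dim = 1000, 8

# Stand-in for a learned embedding table: one dense row per category.
embedding_table = rng.normal(size=(n_categories, embedding_dim))

category_ids = np.array([3, 42, 999])

# Embedding lookup: pick the row for each category ID.
dense_vectors = embedding_table[category_ids]

# Equivalent, but wasteful: build one hot vectors and multiply by the table.
one_hot = np.eye(n_categories)[category_ids]
via_matmul = one_hot @ embedding_table

print(dense_vectors.shape)
print(np.allclose(dense_vectors, via_matmul))
```

Each category is described by 8 numbers instead of 1,000, and the large one hot matrix never needs to be built.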

When should you use each approach?#

  • One-hot encoding → Small category sets

  • Target encoding → Tree-based models and tabular ML

  • Frequency encoding → Lightweight preprocessing

  • Hash encoding → Extremely large feature spaces

  • Embeddings → Deep learning and recommendation systems

Practical recommendation#

In practice, there’s no single “best” encoding strategy.

A good rule of thumb is:

  • Use one-hot encoding for low-cardinality features

  • Use alternative encodings when category counts become large

  • Always validate performance using cross-validation

The right encoding strategy can significantly improve both model scalability and feature engineering quality.

Sparse matrices and memory efficiency#

One-hot encoding is simple and powerful, but it comes with a hidden cost: memory usage. As the number of categories grows, the encoded dataset can become extremely large because most values in the matrix are zeros. That’s where sparse matrices become important.

Why one-hot encoding wastes memory#

When you one-hot encode categorical features, each category becomes its own binary column.

For example:

Color = ["Red", "Blue", "Green"]
One-hot encoded:
Red Blue Green
1 0 0
0 1 0
0 0 1

This works fine for small category sets. But imagine a feature with 10,000 unique product IDs.

You would create:

  • 10,000 columns

  • Mostly zeros in every row

  • A very large dense matrix

That means you’re storing huge amounts of unnecessary data.

What is a sparse matrix?#

A sparse matrix stores only the non-zero values instead of storing every single 0.

Conceptually:

Dense representation:
[0, 0, 0, 1, 0, 0]
Sparse representation:
(index=3, value=1)

This is much more memory efficient because the matrix avoids storing thousands or millions of zeros.
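As a small sketch of what gets stored, the example below converts a dense one hot matrix to SciPy's CSR format and prints its three internal arrays.

```python
import numpy as np
from scipy import sparse

dense = np.array([
    [0, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
])

csr = sparse.csr_matrix(dense)

print(csr.data)     # the non-zero values
print(csr.indices)  # the column index of each non-zero value
print(csr.indptr)   # where each row starts and ends within data/indices
print(csr.nnz, "stored values out of", dense.size)
```

Only 3 of the 18 entries are stored as values, and the matrix can be converted back with `csr.toarray()` when a dense array is needed.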

Many machine learning libraries automatically use sparse representations internally for this reason.

How Sklearn handles sparse output#

OneHotEncoder in Scikit-learn returns sparse matrices by default.

Scikit-learn 1.2 and later control this with:

sparse_output=True

Earlier versions used the following parameter, which was deprecated in 1.2 and removed in 1.4:

sparse=True

The output is usually a CSR matrix (Compressed Sparse Row matrix), which is optimized for efficient storage and fast operations.

Dense vs sparse intuition#

A small feature with 10 categories is usually manageable with dense storage.

But with:

  • 10,000 categories

  • Millions of rows

  • Multiple encoded features

…the dense representation can consume massive amounts of memory very quickly.

Sparse matrices solve this by storing only the meaningful values.

Example 1: Dense output with Pandas#

import pandas as pd
df = pd.DataFrame({
"city": ["Lahore", "Karachi", "Islamabad", "Lahore"]
})
dense_encoded = pd.get_dummies(df["city"])
print(dense_encoded)
print("\nShape:", dense_encoded.shape)
print("Type:", type(dense_encoded))

Output type#

<class 'pandas.core.frame.DataFrame'>

This creates a normal dense DataFrame where all values—including zeros—are stored in memory.

Example 2: Sparse output with Sklearn#

from sklearn.preprocessing import OneHotEncoder
import pandas as pd
df = pd.DataFrame({
"city": ["Lahore", "Karachi", "Islamabad", "Lahore"]
})
encoder = OneHotEncoder(sparse_output=True)
sparse_encoded = encoder.fit_transform(df[["city"]])
print("Shape:", sparse_encoded.shape)
print("Type:", type(sparse_encoded))

Output type#

<class 'scipy.sparse._csr.csr_matrix'>

Instead of storing every zero explicitly, the matrix stores only the positions of non-zero values.

Optional memory comparison#

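# sys.getsizeof only measures the Python wrapper object, so sum the underlying buffers instead
dense_size = dense_encoded.memory_usage(deep=True).sum()
sparse_size = (
    sparse_encoded.data.nbytes
    + sparse_encoded.indices.nbytes
    + sparse_encoded.indptr.nbytes
)
print("Dense size (bytes):", dense_size)
print("Sparse size (bytes):", sparse_size)

On this four-row example the sparse matrix may not be smaller, because its index arrays add overhead. On larger, wider datasets, the memory difference becomes dramatic.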

When should you care?#

  • Large datasets

  • NLP systems (bag-of-words, TF-IDF)

  • Recommendation systems

  • High-cardinality categorical features

  • Production ML pipelines

Some machine learning algorithms also work better with sparse input than others, especially linear models and certain tree-based approaches.

Practical recommendation#

Use sparse matrices whenever your feature dimensionality becomes large. They improve both memory efficiency and scalability, which becomes critical in real-world machine learning systems.


How to convert categorical data to numerical data#

Manually converting our data to numerical values includes two basic steps:

  • Integer encoding
  • One hot encoding

For the first step, we need to assign each category value with an integer, or numeric, value. If we had the values red, yellow, and blue, we could assign them 1, 2, and 3 respectively.

When dealing with categorical variables that have no order or relationship, we need to take this one step further. Step two involves applying one-hot encoding to the integers we just assigned. To do this, we remove the integer encoded variable and add a binary variable for each unique category.

Above, we had three categories, or colors, so we use three binary variables. We place the value 1 as the binary variable for each color and the value 0 for the other two colors.

red  yellow  blue
1    0       0
0    1       0
0    0       1

Note: In many other fields, binary variables are referred to as dummy variables.
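The two manual steps can be sketched with NumPy as follows. The integer assignment follows the red, yellow, blue example above, and the sample list of colors is made up.

```python
import numpy as np

colors = ["red", "yellow", "blue", "yellow"]

# Step 1: integer encoding (1, 2, 3 as in the text).
int_map = {"red": 1, "yellow": 2, "blue": 3}
int_encoded = np.array([int_map[c] for c in colors])

# Step 2: one hot encoding. Each integer selects a row of the identity matrix.
# Subtract 1 because the codes start at 1 while array indices start at 0.
one_hot = np.eye(len(int_map), dtype=int)[int_encoded - 1]

print(int_encoded)
print(one_hot)
```

The columns of the result are in the order red, yellow, blue, matching the table above.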

What is the dummy variable trap?#

One-hot encoding is a powerful technique, but it can sometimes introduce an issue known as the dummy variable trap. This occurs when all encoded categories are included in a model, creating perfect multicollinearity because one category can always be inferred from the others.

Category  Red  Blue  Green
Red       1    0     0
Blue      0    1     0
Green     0    0     1

In the example above, knowing the values of two columns automatically reveals the value of the third. For linear regression models, this redundancy can make coefficient estimation unstable and harder to interpret.

To avoid this issue, many machine learning practitioners drop one encoded column and treat it as the baseline category. Libraries such as Pandas and Scikit-learn provide options to automate this behavior. For example, pd.get_dummies(df, drop_first=True) or OneHotEncoder(drop='first') can remove the redundant feature automatically.

It's important to note that the dummy variable trap primarily affects linear models. Tree-based algorithms such as Random Forests and Gradient Boosting are generally unaffected because they do not rely on matrix inversion when learning relationships.
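The redundancy can be checked numerically. In the sketch below, which uses a small made-up color column, a design matrix with an intercept and all three dummy columns has fewer independent columns than total columns, while dropping one dummy restores full column rank.

```python
import numpy as np
import pandas as pd

colors = pd.Series(["Red", "Blue", "Green", "Red", "Green", "Blue"])

full = pd.get_dummies(colors, dtype=float)                      # 3 dummy columns
reduced = pd.get_dummies(colors, drop_first=True, dtype=float)  # 2 dummy columns

# Add an intercept column, as a linear regression would.
intercept = np.ones(len(colors))
X_full = np.column_stack([intercept, full.to_numpy()])
X_reduced = np.column_stack([intercept, reduced.to_numpy()])

rank_full = np.linalg.matrix_rank(X_full)
rank_reduced = np.linalg.matrix_rank(X_reduced)

print("all dummies:", X_full.shape[1], "columns, rank", rank_full)
print("drop_first: ", X_reduced.shape[1], "columns, rank", rank_reduced)
```

With all dummies kept, the three dummy columns sum to the intercept column, which is the perfect multicollinearity described above.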


One hot encoding with Pandas#

We don’t have to one hot encode manually. Many data science tools offer easy ways to encode your data. The Python library Pandas provides a function called get_dummies to enable one-hot encoding.

df_new = pd.get_dummies(df, columns=["col1"], prefix="Planet")

Let’s see this in action.

Python
import pandas as pd
df = pd.DataFrame({"col1": ["Sun", "Sun", "Moon", "Earth", "Moon", "Venus"]})
print("The original data")
print(df)
print("*" * 30)
df_new = pd.get_dummies(df, columns=["col1"], prefix="Planet")
print("The transform data using get_dummies")
print(df_new)
  • Line 6 shows that we’re using get_dummies to do one-hot encoding for a pandas DataFrame object. The parameter prefix indicates the prefix of the new column names.
  • Line 8 prints our output.

Let’s apply this to a practical example. Say we have the following dataset.

import pandas as pd
 
ids = [11, 22, 33, 44, 55]
cities = ['Seattle', 'London', 'Lahore', 'Berlin', 'Abuja']
 
# Pair each ID with its city, one row per pair
df = pd.DataFrame(list(zip(ids, cities)),
                  columns=['Ids', 'Cities'])

Here we have a Pandas DataFrame called df with two columns: Ids and Cities. Let’s call head() to get this result:

   Ids   Cities
0   11  Seattle
1   22   London
2   33   Lahore
3   44   Berlin
4   55    Abuja

We see here that the Cities column contains our categorical values: the names of our cities. We can convert them into one hot encoded columns using the get_dummies() function we discussed above.

y = pd.get_dummies(df.Cities, prefix='City')
print(y.head())

Here, we are passing the value City for the prefix parameter of get_dummies(). If we run the code now, we will print our encoded values:

Python
import pandas as pd
df = pd.DataFrame({"col1": ["Seattle", "London", "Lahore", "Berlin", "Abuja"]})
print("The original data")
print(df)
print("*" * 30)
df_new = pd.get_dummies(df, columns=["col1"], prefix="Cities")
print("The transform data using get_dummies")
print(df_new)

One hot encoding with Sklearn#

We can implement similar functionality with Sklearn, which provides the OneHotEncoder class in its preprocessing module.

Python
import sklearn.preprocessing as preprocessing
import numpy as np
import pandas as pd
targets = np.array(["red", "green", "blue", "yellow", "pink",
"white"])
labelEnc = preprocessing.LabelEncoder()
new_target = labelEnc.fit_transform(targets)
onehotEnc = preprocessing.OneHotEncoder()
onehotEnc.fit(new_target.reshape(-1, 1))
targets_trans = onehotEnc.transform(new_target.reshape(-1, 1))
print("The original data")
print(targets)
print("The transform data using OneHotEncoder")
print(targets_trans.toarray())
  • We use LabelEncoder to convert the strings to integers on line 6 and line 7.
  • Line 8 creates our OneHotEncoder object.
  • Line 9 fits the integer-encoded feature using fit().
  • Line 10 converts the integer-encoded feature to the new feature using one-hot encoding.
  • You can see the new data from the output of line 14.

Note: In scikit-learn 0.20 and later, you don’t need to convert the strings to integers first, as OneHotEncoder accepts string categories directly.

Let’s see the OneHotEncoder class in action with another example. First, here’s how to import the class.

from sklearn.preprocessing import OneHotEncoder 

Like before, we first populate our list of unique values for the encoder.

Python
x = [[11, "Seattle"], [22, "London"], [33, "Lahore"], [44, "Berlin"], [55, "Abuja"]]
y = OneHotEncoder().fit_transform(x).toarray()
print(y)

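When we print this, we get the following encoded values. The first five columns encode the IDs, and the last five encode the city names in alphabetical order:

[[1. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
 [0. 1. 0. 0. 0. 0. 0. 0. 1. 0.]
 [0. 0. 1. 0. 0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 0. 1. 1. 0. 0. 0. 0.]]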

Comparing Pandas, Sklearn, and category_encoders#

Python gives you several ways to encode categorical variables, but they are not all meant for the same workflow. pd.get_dummies() is great for quick exploration, sklearn.OneHotEncoder is better for production ML pipelines, and category_encoders is useful when you need more advanced encoding strategies.

Tool | Best For | Pros | Limitations | Handles train/test consistency? | Pipeline-friendly?
pd.get_dummies() | Quick analysis and notebooks | Simple, readable, easy to use | Can create train/test column mismatches | Not automatically | No
sklearn.OneHotEncoder | Production ML workflows | Works with Sklearn pipelines, handles unseen categories | Slightly more setup | Yes | Yes
category_encoders | Advanced feature engineering | Supports target encoding and high-cardinality features | Requires extra library and careful validation | Yes, when fitted properly | Yes

Example dataset#

import pandas as pd
df = pd.DataFrame({
"city": ["Lahore", "Karachi", "Lahore", "Islamabad"],
"product_category": ["Books", "Electronics", "Books", "Clothing"],
"purchase_amount": [1200, 4500, 1500, 3000]
})
print(df)

1. Pandas get_dummies() example#

encoded_df = pd.get_dummies(
df,
columns=["city", "product_category"]
)
print(encoded_df)

This is the fastest way to one-hot encode categories in a notebook. However, if your test data contains a new city or is missing a category from training, you may need to manually align columns.
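One way to do that manual alignment is to reindex the encoded test frame against the training columns. The sketch below uses made-up train and test frames: columns missing from the test data are filled with 0, and columns for cities never seen in training are dropped.

```python
import pandas as pd

train = pd.DataFrame({"city": ["Lahore", "Karachi", "Islamabad"]})
test = pd.DataFrame({"city": ["Lahore", "Multan"]})  # "Multan" was not in training

train_enc = pd.get_dummies(train, columns=["city"], dtype=int)
test_enc = pd.get_dummies(test, columns=["city"], dtype=int)

# Force the test frame to have exactly the training columns, in the same order.
test_aligned = test_enc.reindex(columns=train_enc.columns, fill_value=0)

print(list(test_enc.columns))
print(test_aligned)
```

After alignment, the unseen city becomes a row of zeros.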

2. Sklearn OneHotEncoder example#

from sklearn.preprocessing import OneHotEncoder
X = df[["city", "product_category"]]
encoder = OneHotEncoder(handle_unknown="ignore", sparse_output=False)
encoded = encoder.fit_transform(X)
encoded_df = pd.DataFrame(
encoded,
columns=encoder.get_feature_names_out()
)
print(encoded_df)

OneHotEncoder is a better fit for machine learning workflows because you can fit it on training data and safely transform test data later. The handle_unknown="ignore" setting prevents errors when new categories appear.
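The fit-then-transform pattern might look like the following sketch, again with made-up train and test frames. The encoder learns its categories from the training data only, and a city it has not seen is encoded as all zeros instead of raising an error.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

train = pd.DataFrame({"city": ["Lahore", "Karachi", "Islamabad"]})
test = pd.DataFrame({"city": ["Karachi", "Multan"]})  # "Multan" is unseen

encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(train)  # categories are learned from the training data only

# transform() returns a sparse matrix by default; convert it for printing.
test_encoded = encoder.transform(test).toarray()

print(encoder.categories_)
print(test_encoded)
```

The column order is fixed at fit time, so training and test matrices always line up.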

3. category_encoders TargetEncoder example#

import pandas as pd
from category_encoders import TargetEncoder
X = df[["city", "product_category"]]
y = df["purchase_amount"]
encoder = TargetEncoder(cols=["city", "product_category"])
encoded = encoder.fit_transform(X, y)
print(encoded)

Target encoding replaces each category with a value based on the target variable. This can be useful for high-cardinality features like city names, product IDs, or user segments.

Warning: Target encoding can cause data leakage if you apply it incorrectly. Always fit encoders on training data only and validate with cross-validation.

When should you use each?#

  • Use pd.get_dummies() when you’re doing quick analysis, exploring data, or building a simple notebook example.

  • Use sklearn.OneHotEncoder when you’re building a real ML pipeline and need consistent behavior across training and test data.

  • Use category_encoders when one-hot encoding creates too many columns or when you need advanced techniques like target encoding, count encoding, or hashing.

In practice, start simple with one-hot encoding, then move to advanced encoders when your dataset or model needs it.
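As a rough sketch of the pipeline-based workflow, the example below wraps OneHotEncoder in a ColumnTransformer and chains it with a model, reusing the example dataset from this section. The choice of LinearRegression and the step names are assumptions made for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({
    "city": ["Lahore", "Karachi", "Lahore", "Islamabad"],
    "product_category": ["Books", "Electronics", "Books", "Clothing"],
    "purchase_amount": [1200, 4500, 1500, 3000],
})
X = df[["city", "product_category"]]
y = df["purchase_amount"]

# Encode the categorical columns inside the pipeline so that the same
# fitted encoder is applied at training time and at prediction time.
categorical = ["city", "product_category"]
preprocess = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), categorical)]
)
model = Pipeline([("encode", preprocess), ("regress", LinearRegression())])
model.fit(X, y)

# New data with an unseen city still works because unknowns are ignored.
new_rows = pd.DataFrame({"city": ["Multan"], "product_category": ["Books"]})
preds = model.predict(new_rows)
n_encoded = model.named_steps["encode"].transform(X).shape[1]

print(n_encoded, "encoded columns")
print(preds)
```

Keeping the encoder inside the pipeline also means that cross-validation refits it on each training fold, which helps avoid leakage.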


One-hot encoding in PyTorch and TensorFlow#

In deep learning, one-hot encoding is commonly used to represent categorical labels in a numerical format that neural networks can understand. You’ll see it frequently in classification tasks where models predict one class out of many possible categories.

Unlike traditional ML preprocessing pipelines, deep learning frameworks often perform one-hot encoding directly on tensors during training.

One-hot encoding in TensorFlow#

TensorFlow provides the tf.one_hot() function for converting integer labels into one-hot encoded tensors.

This is especially useful in:

  • Multi-class classification

  • Image classification labels

  • NLP token processing

TensorFlow example#

import tensorflow as tf

# Integer labels
labels = [0, 2, 1]

# One-hot encode with 3 classes
encoded = tf.one_hot(labels, depth=3)
print(encoded)
print("Shape:", encoded.shape)

Output#

tf.Tensor(
[[1. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]], shape=(3, 3), dtype=float32)

Shape explanation#

  • Input shape: (3,)

  • Output shape: (3, 3)

Each label becomes a vector of length 3, where:

  • 1 marks the correct class

  • 0 marks all other classes

For example:

  • Label 2 → [0, 0, 1]

Common TensorFlow use cases#

  • Image classification

  • Multi-class neural networks

  • Token representation in NLP pipelines

  • Recommendation systems

One-hot encoding in PyTorch#

PyTorch provides torch.nn.functional.one_hot() for the same purpose.

The idea is identical:

  • Integer labels are converted into categorical vectors

  • Each class gets its own position in the vector

PyTorch example#

import torch
import torch.nn.functional as F

# Integer labels
labels = torch.tensor([0, 2, 1])

# One-hot encoding
encoded = F.one_hot(labels, num_classes=3)
print(encoded)
print("Shape:", encoded.shape)

Output#

tensor([[1, 0, 0],
        [0, 0, 1],
        [0, 1, 0]])

Shape explanation#

  • Input tensor shape: (3,)

  • Output tensor shape: (3, 3)

Each row represents one encoded class label.

Common PyTorch use cases#

  • Deep learning classifiers

  • Custom loss functions

  • NLP pipelines

  • Reinforcement learning models

One-hot encoding vs embeddings in deep learning#

One-hot encoding works well when the number of categories is small. But for large vocabularies or high-cardinality features, it becomes inefficient because the vectors grow very large and sparse.

That’s why modern deep learning systems often use embeddings instead.

Embeddings:

  • Learn dense numerical representations

  • Reduce dimensionality

  • Improve scalability and memory efficiency

This is especially important in:

  • NLP systems

  • Recommendation engines

  • Transformer models and modern AI architectures

Practical use cases#

You’ll commonly see one-hot encoding in:

  • Image classification labels (cat, dog, car)

  • NLP token encoding

  • Recommendation systems

  • Multi-class prediction tasks

Important warning#

One-hot encoding very large vocabularies can become memory-intensive because every category creates a new dimension. For large-scale deep learning systems, embeddings are usually the preferred solution.


Next steps for your learning#

Congrats on making it to the end! You should now have a good idea what one hot encoding does and how to implement it in Python. There is still a lot to learn to master machine learning feature engineering. Your next steps are:

  • One hot with Numpy
  • Count encoding
  • Mean encoding
  • Label encoding
  • Weight of evidence encoding

To get introduced to these, check out Educative’s mini course Feature Engineering for Machine Learning. You’ll learn the techniques to create new ML features from existing features. You’ll start by diving into label encoding, which is crucial for converting categorical features into numerical ones. In the remaining chapters, you’ll learn about feature interaction and datetime features.

Happy learning!


Frequently Asked Questions

What is one hot encoding?

One hot encoding converts each category of a categorical variable into its own binary (0 or 1) column. It is a powerful technique for handling categorical data, but it can also increase dimensionality, sparsity, and the risk of overfitting.


Written By:
Amanda Fawcett