Most machine learning algorithms cannot operate directly on categorical data; the categorical data must first be converted to numerical data. One-hot encoding is one of the techniques used to perform this conversion. It is especially common when deep learning techniques are applied to sequence classification problems.
One-hot encoding represents categorical variables as binary vectors. The categorical values are first mapped to integer values, and each integer value is then represented as a binary vector that is all 0s except at the index of that integer, which is set to 1. For example, if the categories are red, green, and blue, then green might map to the integer 1 and be encoded as the vector [0, 1, 0].
Have a look at the example below, which manually converts a categorical list of colors to a numerical list using one-hot encoding:
import numpy as np

### Categorical data to be converted to numeric data
colors = ["red", "green", "yellow", "red", "blue"]

### Universal list of colors
total_colors = ["red", "green", "blue", "black", "yellow"]

### Map each color to an integer
mapping = {}
for x in range(len(total_colors)):
    mapping[total_colors[x]] = x

one_hot_encode = []
for c in colors:
    arr = list(np.zeros(len(total_colors), dtype=int))
    arr[mapping[c]] = 1
    one_hot_encode.append(arr)

print(one_hot_encode)
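With the mapping above (red → 0, green → 1, blue → 2, black → 3, yellow → 4), each color in colors becomes a five-element vector with a single 1 at its mapped index, so the printed list should contain:

[1, 0, 0, 0, 0]   # red
[0, 1, 0, 0, 0]   # green
[0, 0, 0, 0, 1]   # yellow
[1, 0, 0, 0, 0]   # red
[0, 0, 1, 0, 0]   # blue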
Take a look at the example below, which uses the scikit-learn library to perform one-hot encoding:
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder

### Categorical data to be converted to numeric data
colors = ["red", "green", "yellow", "red", "blue"]

### Integer mapping using LabelEncoder
label_encoder = LabelEncoder()
integer_encoded = label_encoder.fit_transform(colors)
print(integer_encoded)

### Reshape to a 2D array: one row per sample, one column per feature
integer_encoded = integer_encoded.reshape(len(integer_encoded), 1)

### One-hot encoding
### Note: on scikit-learn 1.2 and later, use sparse_output=False instead of sparse=False
onehot_encoder = OneHotEncoder(sparse=False)
onehot_encoded = onehot_encoder.fit_transform(integer_encoded)
print(onehot_encoded)
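On recent scikit-learn versions (0.20 and later), OneHotEncoder can encode string categories directly, so the LabelEncoder step can be skipped. Below is a minimal sketch of that approach, assuming scikit-learn 1.2 or later for the sparse_output parameter:

import numpy as np
from sklearn.preprocessing import OneHotEncoder

colors = ["red", "green", "yellow", "red", "blue"]

### OneHotEncoder expects a 2D array: one row per sample, one column per feature
colors_2d = np.array(colors).reshape(-1, 1)

### sparse_output=False returns a dense array instead of a sparse matrix
onehot_encoder = OneHotEncoder(sparse_output=False)
onehot_encoded = onehot_encoder.fit_transform(colors_2d)

print(onehot_encoder.categories_)  # the categories learned from the data, sorted alphabetically
print(onehot_encoded)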