Factorize and Cross-Tabulate

Learn how to factorize categorical variables and perform cross-tabulation on the categories.

Factorizing categorical data

So far, we’ve seen how the astype() method converts a DataFrame column into the category data type while keeping the categories at their original values. If we instead want to encode the column and obtain a numeric representation of its categories, we can use the factorize() method, which returns an array of integer codes together with the unique values those codes refer to.
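
To make the contrast concrete, here is a minimal sketch on a small made-up Series. The `colors` data and the variable names are assumptions for illustration and are not part of the credit card dataset.

```python
import pandas as pd

# Hypothetical toy data; not taken from the credit card dataset.
colors = pd.Series(["red", "green", "red", "blue"])

# astype("category") changes the dtype but keeps the original labels as values.
as_category = colors.astype("category")

# pd.factorize() returns integer codes plus the unique values they index.
codes, uniques = pd.factorize(colors)

print(as_category.tolist())  # the labels are unchanged
print(codes)                 # integer codes, numbered in order of first appearance
print(uniques)               # uniques[code] gives back the original label
```

By default, the codes follow the order in which each value first appears, not alphabetical order.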

For example, we can apply the factorize() method to the Ethnicity column of the credit card dataset to retrieve a numerical representation of the different ethnicities.
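
A sketch of what this might look like is shown here. Because the dataset itself is not loaded, the `credit` DataFrame holds a few hypothetical stand-in rows, and the `Ethnicity_code` column name is an assumption chosen for illustration.

```python
import pandas as pd

# Hypothetical stand-in rows; the real credit card dataset is not loaded here.
credit = pd.DataFrame(
    {"Ethnicity": ["Caucasian", "Asian", "Asian", "African American", "Caucasian"]}
)

# Series.factorize() is the method form of pd.factorize().
codes, ethnicities = credit["Ethnicity"].factorize()

# Store the codes next to the original labels (column name is an assumption).
credit["Ethnicity_code"] = codes

print(credit)
print(ethnicities)  # the position of each label in this Index is its code

# Round trip: indexing the unique labels by the codes recovers the column.
decoded = ethnicities.take(codes)
```

Keeping the returned `ethnicities` index around is useful because it is the only record of which integer stands for which label.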
