Other Features and Properties

Discover important pandas features and properties for handling categorical data. Learn how to combine categories, preserve category dtypes in merges, set valid category values, and export categorical data while maintaining integrity. Understand how missing values are represented and how to efficiently manage categorical columns for effective data analysis.

Introduction

Having covered the essential operations and methods around categorical data, let's wrap up this chapter by going over some other noteworthy features and properties.

Unioning categories

To combine multiple categorical variables with different categories, we must first create a common set of categories for them. We can do so with the union_categoricals() function from pandas.api.types, which generates a union of the categories being combined. It accepts a list of Categorical objects, CategoricalIndex objects, or Series with the category dtype, and the output of the union operation is a Categorical object.
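As a minimal sketch of the union operation described above, the snippet below combines two categorical Series whose categories only partly overlap. The Series names and the values "low", "medium", and "high" are made up for illustration and are not taken from the lesson's dataset.

```python
import pandas as pd
from pandas.api.types import union_categoricals

# Two hypothetical categorical Series with partly different categories
a = pd.Series(["low", "medium"], dtype="category")
b = pd.Series(["medium", "high"], dtype="category")

# Build a single Categorical whose categories are the union of both inputs
combined = union_categoricals([a, b])

print(type(combined).__name__)   # the result is a Categorical, not a Series
print(list(combined))            # values of a followed by values of b
print(set(combined.categories))  # all categories seen in either input

# Wrap the result in a Series if a Series is needed downstream
combined_series = pd.Series(combined)
print(combined_series.dtype)
```

Because the result is a Categorical rather than a Series, wrapping it in pd.Series() as in the last step is one way to get back to a Series that still has the category dtype.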

When we refer to data types, there are two concepts at play:

  • DataFrame columns (or Series objects) can have different dtypes. When dealing with a categorical variable, we can encode it with dtype='category'. For such situations, we’ll use the term dtype.

  • For categorical variables that are already encoded with dtype='category', the ...