Filtering

Learn how to use where() and mask() for filtering and replacing data based on boolean indexing.

Recap of boolean indexing

Before we dive into filtering numerical values with the pandas methods where() and mask(), it’s worth revisiting the concept of boolean indexing. Boolean indexing is the technique of selecting data from a DataFrame based on an array of True/False values: only the elements of the original data whose corresponding element in that array is True are selected.

This array of True/False values is known as a boolean mask and has the same shape as the original data. Each True or False value in the mask is determined by the criteria we define. For example, take the following subset of the credit card dataset and apply the condition that numerical values must be less than 40:
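As a minimal sketch of this idea, the snippet below builds a boolean mask for the "less than 40" condition and uses it to index a DataFrame. The actual subset of the credit card dataset is not reproduced here, so the DataFrame, its column names (age, months_on_book), and its values are hypothetical stand-ins chosen only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for a numeric subset of the credit card dataset;
# the column names and values are illustrative assumptions.
df = pd.DataFrame(
    {
        "age": [45, 38, 51, 29],
        "months_on_book": [39, 44, 36, 21],
    }
)

# An element-wise comparison yields a boolean mask with the same shape as df.
bool_mask = df < 40

# Indexing with the mask keeps values where the mask is True;
# positions where the mask is False become NaN.
selected = df[bool_mask]

print(bool_mask)
print(selected)
```

Note that the result keeps the original shape: values that fail the condition are not dropped but replaced with NaN, which also causes the integer columns to be converted to floats.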
