# What is boolean masking on NumPy arrays in Python?

Educative Team

### Overview

NumPy is a popular Python library for working with arrays. Boolean masking, also called boolean indexing, is a NumPy feature that filters the values of an array based on a condition.

There are two main ways to carry out boolean masking:

• Method one: Returning the result array.
• Method two: Returning a boolean array.

### Method one: Returning the result array

The first method returns an array with the required results. In this method, we pass a condition in the indexing brackets, [], of an array. The condition can be any comparison, like arr > 5, for the array arr.

### Syntax

The code snippet given below shows us how we can use this method.

arr[arr > 5]

### Parameter values

• arr: This is the array that we are querying.
• The condition arr > 5 is the criterion with which values in the arr array will be filtered.

### Return value

This method returns a NumPy array, ndarray, with values that satisfy the given condition. The line in the example given above will return all the values in arr that are greater than 5.
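As a minimal sketch of the return value described above (the sample values here are assumed purely for illustration), the following checks that the filtered result is itself an ndarray holding only the matching elements:

```python
import numpy as np

# Sample data chosen only for illustration
arr = np.array([3, 8, 1, 6, 5, 9])

# Pass the condition directly inside the indexing brackets
result = arr[arr > 5]

print(type(result))  # <class 'numpy.ndarray'>
print(result)        # [8 6 9]
```

Note that 5 itself is excluded, because the condition is a strict comparison.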

### Example

Let's try out this method in the following example:

# importing NumPy
import numpy as np
# Creating a NumPy array
arr = np.arange(15)
# Printing our array to observe
print(arr)
# Using boolean masking to filter elements greater than or equal to 8
print(arr[arr >= 8])
# Using boolean masking to filter elements equal to 12
print(arr[arr == 12])
Apply boolean masking through indexing brackets

### Explanation

• Line 2: We import the numpy library.
• Line 4: We create a NumPy array that contains the integers from 0 to 14 using the arange() function, and then store it in arr.
• Line 6: We print the arr array.
• Line 8: We use boolean masking to return all the elements in arr that are greater than or equal to eight. Then, we print the resulting array.
• Line 10: We use boolean masking to return all the elements in arr that are equal to 12. Then, we print the resulting array.

### Method two: Returning a boolean array

The second method returns a boolean array that has the same shape as the array it is derived from. A boolean array contains only the values True and False. This boolean array is also called a mask array, or simply a mask. We'll discuss boolean arrays in more detail in the "Return value" section.

### Syntax

The code snippet given below shows us how to use this method:

mask = arr > 5

### Parameter values

• arr: This is the array that we are querying.
• arr > 5 is our condition.

### Return value

The line in the code snippet given above will:

• Return an array with the same shape as arr. This array contains only the values True and False. Each True value marks an element at the same position in arr that satisfies our condition, and each False value marks an element at the same position in arr that does not.
• Store this boolean array in the mask variable.

The mask array can be passed in the index brackets of arr to return the values that satisfy our condition. We will see how this works in our coding example.
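To make the properties of the mask concrete, here is a small sketch (the 2×2 sample array is an assumption for illustration, not part of the original example) that inspects the mask's shape and data type before using it as an index:

```python
import numpy as np

# Small 2-D sample array, assumed for illustration
arr = np.array([[1, 7],
                [9, 2]])

# The comparison is applied element-wise and yields a boolean array
mask = arr > 5

print(mask.shape == arr.shape)  # True
print(mask.dtype)               # bool
print(mask)
# Indexing with the mask keeps only the elements where mask is True
print(arr[mask])                # [7 9]
```

Even though arr is two-dimensional, indexing it with a mask of the same shape gives a one-dimensional array of the selected values.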

### Example

Let's try out this method in the following example:

# importing NumPy
import numpy as np
# Creating a NumPy array
arr = np.array([[ 0,  9,  0],
               [ 0,  7,  8],
               [ 6,  0,  1]])
# Printing our array to observe
print(arr)
# Creating the boolean mask
mask = arr > 5
# Printing the mask
print(mask)
# Printing the filtered array using both methods
print(arr[mask])
print(arr[arr > 5])

### Explanation

• Line 2: We import the numpy library.
• Lines 4–6: We create a NumPy array that contains some integers and store it in arr.
• Line 8: We print the arr array.
• Line 10: We use boolean masking to create a boolean array, which marks the elements in arr that are greater than 5. Then, we store this boolean array in the mask variable.
• Line 12: We print the mask array.
• Line 14: We use the mask array to filter the elements in arr that are greater than 5.
• Line 15: We use method one to filter the elements in arr that are greater than 5.

Note: Both methods return the same result.
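One way to check this claim programmatically is to compare the two results with np.array_equal. The sketch below reuses the array from the example; the names via_mask and direct are introduced here only for illustration:

```python
import numpy as np

arr = np.array([[0, 9, 0],
                [0, 7, 8],
                [6, 0, 1]])

# Method two: build the mask first, then index with it
mask = arr > 5
via_mask = arr[mask]

# Method one: put the condition directly in the brackets
direct = arr[arr > 5]

print(via_mask)                          # [9 7 8 6]
print(np.array_equal(via_mask, direct))  # True
```

Storing the mask in a variable is mainly useful when the same condition is needed more than once, since the comparison is then evaluated a single time.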
