What is boolean masking on NumPy arrays in Python?
Overview
NumPy is a popular Python library for working with arrays. Boolean masking, also called boolean indexing, is a NumPy feature that filters the values of an array based on a condition.
There are two main ways to carry out boolean masking:
- Method one: Returning the result array.
- Method two: Returning a boolean array.
Method one: Returning the result array
The first method returns an array with the required results. In this method, we pass a condition in the indexing brackets, [], of an array. The condition can be any comparison, like arr > 5, for the array arr.
Syntax
The code snippet given below shows us how we can use this method.
arr[arr > 5]
Parameter values
- arr: This is the array that we are querying.
- The condition arr > 5 is the criterion with which values in the arr array will be filtered.
Return value
This method returns a NumPy array, ndarray, with values that satisfy the given condition. The line in the example given above will return all the values in arr that are greater than 5.
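As a minimal sketch of this return value, the snippet below checks that the result is a new one-dimensional ndarray, so changing it leaves the source array untouched. The data and the names values and filtered are illustrative assumptions, not part of the original example.

```python
import numpy as np

# Illustrative data (an assumption for this sketch)
values = np.array([3, 7, 1, 9, 5, 6])

# Boolean masking returns a new ndarray holding only the matching values
filtered = values[values > 5]
print(type(filtered).__name__)  # ndarray
print(filtered)                 # [7 9 6]

# The result is a copy, so modifying it does not change the original
filtered[0] = 100
print(values)                   # [3 7 1 9 5 6]
```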
Example
Let's try out this method in the following example:
# importing NumPy
import numpy as np
# Creating a NumPy array
arr = np.arange(15)
# Printing our array to observe
print(arr)
# Using boolean masking to filter elements greater than or equal to 8
print(arr[arr >= 8])
# Using boolean masking to filter elements equal to 12
print(arr[arr == 12])
Explanation
- Line 2: We import the numpy library.
- Line 4: We create a numpy array that contains the integers from 0 to 14 using the arange() function, and then store it in arr.
- Line 6: We print the arr array.
- Line 8: We use boolean masking to return all the elements in arr that are greater than or equal to 8. Then, we print the resulting array.
- Line 10: We use boolean masking to return all the elements in arr that are equal to 12. Then, we print the resulting array.
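The condition does not have to be a single comparison. As a hedged sketch that goes slightly beyond the example above, several comparisons can be combined with NumPy's element-wise operators & (and), | (or), and ~ (not). The names between, outside, and not_even are illustrative.

```python
import numpy as np

arr = np.arange(15)

# Each comparison needs its own parentheses because & and | bind
# more tightly than the comparison operators.
between = arr[(arr >= 8) & (arr < 12)]
print(between)   # [ 8  9 10 11]

outside = arr[(arr < 2) | (arr > 12)]
print(outside)   # [ 0  1 13 14]

# ~ inverts a boolean array, so this keeps the odd numbers
not_even = arr[~(arr % 2 == 0)]
print(not_even)  # [ 1  3  5  7  9 11 13]
```

Python's and, or, and not keywords do not work here, because they try to reduce the whole array to a single truth value.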
Method two: Returning a boolean array
The second method returns a boolean array that has the same shape as the array it represents. A boolean array contains only the boolean values True and False. This boolean array is also called a mask array, or simply a mask. We'll discuss boolean arrays in more detail in the "Return value" section.
Syntax
The code snippet given below shows us how to use this method:
mask = arr > 5
Parameter values
- arr: This is the array that we are querying.
- arr > 5 is our condition.
Return value
The line in the code snippet given above will:
- Return an array with the same size and dimensions as arr. This array will only contain the values True and False. All the True values represent elements in the same position in arr that satisfy our condition, and all the False values represent elements in the same position in arr that do not satisfy our condition.
- Store this boolean array in a mask array.
The mask array can be passed in the index brackets of arr to return the values that satisfy our condition. We will see how this works in our coding example.
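A short sketch of the mask's properties, assuming a small 3x3 array: the mask has the same shape as arr and a boolean dtype, and counting its True values gives the number of elements that satisfy the condition.

```python
import numpy as np

arr = np.array([[0, 9, 0],
                [0, 7, 8],
                [6, 0, 1]])
mask = arr > 5

print(mask.shape == arr.shape)  # True
print(mask.dtype)               # bool
# True counts as 1, so counting non-zero entries gives the number of matches
print(np.count_nonzero(mask))   # 4
```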
Example
Let's try out this method in the following example:
# importing NumPy
import numpy as np
# Creating a NumPy array
arr = np.array([[0, 9, 0],
                [0, 7, 8],
                [6, 0, 1]])
# Printing our array to observe
print(arr)
# Creating a mask array
mask = arr > 5
# Printing the mask array
print(mask)
# Printing the filtered array using both methods
print(arr[mask])
print(arr[arr > 5])
Explanation
- Line 2: We import the numpy library.
- Lines 4–6: We create a numpy array that contains some integers and store it in arr.
- Line 8: We print the arr array.
- Line 10: We use boolean masking to return a boolean array, which represents the corresponding elements in arr that are greater than 5. Then, we store this boolean array in mask.
- Line 12: We print the mask array.
- Line 14: We use the mask array to filter the elements in arr that are greater than 5.
- Line 15: We use method one to filter the elements in arr that are greater than 5.
Note: Both methods return the same result.
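This equivalence can be checked directly. The sketch below compares the two results with np.array_equal() and shows that filtering a two-dimensional array yields a one-dimensional result. The names via_mask and via_condition are illustrative.

```python
import numpy as np

arr = np.array([[0, 9, 0],
                [0, 7, 8],
                [6, 0, 1]])
mask = arr > 5

via_mask = arr[mask]          # method two: index with a stored mask
via_condition = arr[arr > 5]  # method one: condition inside the brackets

print(np.array_equal(via_mask, via_condition))  # True
# Matches are collected row by row into a one-dimensional array
print(via_mask)        # [9 7 8 6]
print(via_mask.shape)  # (4,)
```

Storing the mask in a variable is mainly useful when the same condition is applied more than once.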