Get the subset of the columns of a dataframe based on dtype
Overview
In pandas, the select_dtypes() function returns the subset of a dataframe's columns whose data types match the ones specified.
Common data types (dtypes) in pandas include float64, bool, int64, and more.
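Before selecting columns by dtype, it can help to see which dtype each column actually has. The following is a minimal sketch using the dtypes attribute; the sample dataframe and the variable name dtype_names are illustrative assumptions.

```python
import pandas as pd

# Illustrative sample data (an assumption for this sketch)
df = pd.DataFrame({
    "INTEGERS": [1, 0] * 3,
    "BOOLEAN": [True, False] * 3,
    "FLOAT": [1.0, 2.0] * 3,
})

# Map each column name to the string name of its dtype
dtype_names = df.dtypes.astype(str).to_dict()
print(dtype_names)
```

The dtype names shown here are the same strings that can be passed to select_dtypes().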
Syntax
DataFrame.select_dtypes(include=None, exclude=None)
Syntax for the select_dtypes() function in Pandas
Parameter value
This function takes the following parameter values:
- include: This is used to specify the data type to be included or returned in the output result.
- exclude: This is used to specify the data type to be excluded in the output result.
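Both parameters accept either a single dtype or a list of dtypes, and pandas also recognizes the generic name "number" for all numeric dtypes. The sketch below illustrates these forms on an assumed sample dataframe; the variable names are illustrative only.

```python
import pandas as pd

# Illustrative sample data (an assumption for this sketch)
df = pd.DataFrame({
    "INTEGERS": [1, 0] * 3,
    "BOOLEAN": [True, False] * 3,
    "FLOAT": [1.0, 2.0] * 3,
})

# A list selects several dtypes at once; the original column order is kept
bool_or_float = df.select_dtypes(include=["bool", "float64"])
print(bool_or_float.columns.tolist())  # ['BOOLEAN', 'FLOAT']

# "number" matches integer and float columns; bool is not treated as numeric
numeric = df.select_dtypes(include="number")
print(numeric.columns.tolist())  # ['INTEGERS', 'FLOAT']
```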
Note: At least one of the parameters, include or exclude, must be passed to the select_dtypes() function.
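To illustrate the note above, here is a small sketch of what happens when neither parameter is passed: pandas raises a ValueError. The sample dataframe and the raised flag are assumptions made for this example.

```python
import pandas as pd

# Illustrative sample data (an assumption for this sketch)
df = pd.DataFrame({"BOOLEAN": [True, False], "FLOAT": [1.0, 2.0]})

try:
    # Neither include nor exclude is given, so pandas rejects the call
    df.select_dtypes()
    raised = False
except ValueError as err:
    raised = True
    print(f"ValueError: {err}")
```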
Return value
This function returns a dataframe containing only the columns whose data types are specified in include and are not specified in exclude.
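The two parameters can also be combined in one call. The sketch below, on an assumed sample dataframe, keeps numeric columns while dropping float64 ones, and checks that the result is itself a dataframe.

```python
import pandas as pd

# Illustrative sample data (an assumption for this sketch)
df = pd.DataFrame({
    "INTEGERS": [1, 0] * 3,
    "BOOLEAN": [True, False] * 3,
    "FLOAT": [1.0, 2.0] * 3,
})

# Keep numeric columns, but leave out the float64 ones
result = df.select_dtypes(include="number", exclude="float64")

print(type(result).__name__)    # DataFrame
print(result.columns.tolist())  # ['INTEGERS']
```

The original dataframe, df, still holds all three columns after the call.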
Example
import pandas as pd

# creating a DataFrame
df = pd.DataFrame({'INTEGERS': [1, 0] * 3,
                   'BOOLEAN': [True, False] * 3,
                   'FLOAT': [1.0, 2.0] * 3})

# printing the DataFrame
print(df)

# implementing the select_dtypes() function to include boolean values
print(df.select_dtypes(include="bool"))

# implementing the select_dtypes() function to exclude boolean values
print(df.select_dtypes(exclude="bool"))
Explanation
- Line 1: We import the pandas module.
- Lines 4–6: We create a dataframe, df.
- Line 9: We print the dataframe, df.
- Line 12: We use the select_dtypes() function to include boolean values. We print the results to the console.
- Line 15: We use the select_dtypes() function to exclude boolean values. We print the results to the console.