How to return a subset of a DataFrame's columns based on dtypes
Overview
The select_dtypes() function in pandas returns a subset of a given DataFrame's columns based on the column data types.
Syntax
The select_dtypes() function's syntax is shown below:
DataFrame.select_dtypes(include=None, exclude=None)
Syntax for the select_dtypes() function
Parameters
The select_dtypes() function takes two optional parameters, include and exclude, and at least one of them must be supplied. Each accepts a single data type or a list of data types to be included in or excluded from the result.
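As a minimal sketch of how these parameters behave, the snippet below passes a single data type, then a list of data types, and finally calls the function with neither parameter. The small df DataFrame and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
df = pd.DataFrame({
    "Id": [1, 2, 3],
    "Married": [True, False, True],
    "Score": [1.0, 2.0, 3.0],
})

# A single data type and a list of data types are both accepted
only_bool = df.select_dtypes(include=bool)
bool_or_float = df.select_dtypes(include=[bool, float])

# Supplying neither include nor exclude raises a ValueError
try:
    df.select_dtypes()
    raised = False
except ValueError:
    raised = True

print(list(only_bool.columns))
print(list(bool_or_float.columns))
print(raised)
```

The selected columns keep the order they have in the original DataFrame.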
Return value
The select_dtypes() function returns a new DataFrame containing only the columns whose data types are listed in include and not listed in exclude.
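The following sketch illustrates the return value when include and exclude are combined. It assumes the string aliases "number", "float64", and "datetime", which pandas accepts in place of Python types. The df DataFrame is hypothetical sample data.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
df = pd.DataFrame({
    "Id": [1, 2, 3],
    "Married": [True, False, True],
    "Score": [1.0, 2.0, 3.0],
})

# Keep numeric columns, then drop the floating-point ones
numeric_not_float = df.select_dtypes(include="number", exclude="float64")

# When no column matches, the result is a DataFrame with no columns
no_match = df.select_dtypes(include="datetime")

print(type(numeric_not_float).__name__)
print(list(numeric_not_float.columns))
print(len(no_match.columns))

# The original DataFrame is left unchanged
print(list(df.columns))
```

The result is always a DataFrame, even when only one column, or none, is selected.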
Example
# A code to illustrate the select_dtypes() function in Pandas

# importing the pandas library
from pandas import DataFrame

# creating a dataframe
my_data_frame = DataFrame({'Id': [1, 2, 3, 4, 5, 6],
                           'Married': [True, False] * 3,
                           'Score': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]})

# obtaining the columns by their data types
bool_data = my_data_frame.select_dtypes(include=bool)
int_and_float_data = my_data_frame.select_dtypes(include=[int, float])

# obtaining the columns by excluding data types
data_without_int_values = my_data_frame.select_dtypes(exclude=int)

# printing results
print(bool_data)
print(int_and_float_data)
print(data_without_int_values)
Explanation
- Line 4: We import DataFrame from the pandas library.
- Lines 7–9: We create a DataFrame object, my_data_frame.
- Lines 12–16: We obtain the columns of the DataFrame based on their data types using the select_dtypes() function. We assign the results to the bool_data, int_and_float_data, and data_without_int_values variables.
- Lines 19–21: We print the values of bool_data, int_and_float_data, and data_without_int_values.
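To see why a given column ends up in each result, it can help to inspect the DataFrame's dtypes attribute. The sketch below rebuilds my_data_frame from the example and compares a manual lookup of the boolean columns with the output of select_dtypes(). The variable names manual_bool_columns and selected_bool_columns are introduced here for illustration.

```python
from pandas import DataFrame

# Same data as in the example
my_data_frame = DataFrame({'Id': [1, 2, 3, 4, 5, 6],
                           'Married': [True, False] * 3,
                           'Score': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]})

# dtypes reports the data type of every column
print(my_data_frame.dtypes)

# Collect the boolean columns by hand from the dtypes attribute
manual_bool_columns = [name for name, dtype in my_data_frame.dtypes.items()
                       if dtype == bool]

# Collect the same columns with select_dtypes()
selected_bool_columns = list(my_data_frame.select_dtypes(include=bool).columns)

print(manual_bool_columns)
print(selected_bool_columns)
```

Both approaches should name the same columns. select_dtypes() simply saves the manual filtering step.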