Filtering
Explore how to apply filter conditions to pandas DataFrames to isolate MLB statistics. Learn how relational operations, string filters, and functions like isin select rows. Learn to filter based on missing values and to combine filters for practical data analysis.
Chapter Goals:
- Understand how to filter a DataFrame based on filter conditions
- Write code to filter a dataset of MLB statistics
A. Filter conditions
In the Data Manipulation section, we used relational operations on NumPy arrays to create filter conditions. These filter conditions returned boolean arrays, which marked the locations of the elements that passed the filter.
In pandas, we can also create filter conditions for DataFrames. Specifically, we can apply a relational operation to one of a DataFrame's columns, which returns a boolean Series with one entry per row: True where the row passes the filter and False where it does not.
The code below demonstrates how to use relational operations as filter conditions.
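A minimal sketch of such filter conditions, assuming a small DataFrame named df with the columns 'playerID', 'teamID', and 'HR'. The player IDs and home run counts are placeholders invented for illustration, not real MLB statistics, and the variable names are hypothetical.

```python
import pandas as pd

# Placeholder data (invented for illustration, not real MLB statistics).
df = pd.DataFrame({
    'playerID': ['player01', 'player02', 'player03', 'player04'],
    'teamID': ['BOS', 'SEA', 'SEA', 'BOS'],
    'HR': [31, 39, 43, 38],
})

# Equality on a string column: True only for the matching player.
is_player03 = df['playerID'] == 'player03'
print('{}\n'.format(is_player03))

# Greater-than on a numeric column: True for rows with more than 35 home runs.
hr_over_35 = df['HR'] > 35
print('{}\n'.format(hr_over_35))

# Inequality on a string column: True for rows whose team is not 'BOS'.
not_bos = df['teamID'] != 'BOS'
print('{}\n'.format(not_bos))
```

Each result is a boolean Series that shares the index of df, so its entries line up row by row with the DataFrame.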
In the code above, we created filter conditions for df based on the columns labeled 'playerID', 'HR', and 'teamID'. The boolean Series outputs have True for the rows ...