Pandas DataFrame Operations - Grouping and Sorting
Explore how to perform grouping and sorting operations on Pandas DataFrames to efficiently summarize and analyze data. Understand how to apply aggregation functions selectively, such as sum and mean, to uncover insights like total revenue or average ratings by categories. Develop proficiency in sorting grouped data to identify highest values and trends, enabling clearer data interpretation and decision-making.
6. Grouping
Things get really interesting when we group rows by certain criteria and then aggregate their data.
Say we want to group our dataset by director and see how much revenue (sum) each director earned at the box office, and also look at the average rating (mean) for each director. We can do this by using the groupby operation on the column of interest, followed by the appropriate aggregate (sum/mean), like so:
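A minimal sketch of this pattern, assuming a small made-up DataFrame rather than the lesson's actual dataset. The variable name movies, the column names Title, Director, Revenue and Rating, and all values are hypothetical and only serve to illustrate groupby followed by sum or mean:

```python
import pandas as pd

# Hypothetical stand-in for the lesson's dataset; names and numbers are made up.
movies = pd.DataFrame({
    "Title": ["Film 1", "Film 2", "Film 3", "Film 4"],
    "Director": ["Director A", "Director B", "Director A", "Director B"],
    "Revenue": [100.0, 50.0, 200.0, 150.0],
    "Rating": [8.0, 6.0, 7.0, 9.0],
})

# Group the rows by the column of interest.
by_director = movies.groupby("Director")

# Total revenue per director: sum of each group's Revenue values.
total_revenue = by_director["Revenue"].sum()

# Average rating per director: mean of each group's Rating values.
average_rating = by_director["Rating"].mean()

# Calling sum() on the whole group adds up every numeric column at once;
# numeric_only=True keeps text columns such as Title out of the result.
totals = by_director.sum(numeric_only=True)

print(total_revenue)
print(average_rating)
print(totals)
```

Selecting a single column before aggregating keeps each result focused on one question, while summing the whole group, as in totals, also adds up columns like Rating where a total is rarely meaningful.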
As we can see, Pandas collapsed all rows that share the same ‘Director’ value into a single row per director. And since we used sum() for aggregation, it added together the values in each numerical column, so every column now holds the total of that column for that director.
For example, we can see that the director Aamir Khan ...