Challenge 5: Multiple Aggregations (Difficult)

Explore how to apply multiple aggregation functions simultaneously on grouped DataFrame columns using Pandas. Understand the use of .agg() with dictionaries to compute sum, mean, and max statistics by country, and learn how to flatten MultiIndex columns for easy access to the resulting data. By the end, you'll be able to convert grouped data into a nested dictionary format useful for analysis or reporting.

Solution explanation

The solution takes three steps:

Use .agg() to apply several functions to different columns of the grouped data. .agg() can be called in several ways; this solution passes a dict whose keys are column names and whose values are lists of the names of the operations to apply.
Calling .agg() this way produces MultiIndex columns, with one level for the source column and one for the operation. The line grp.columns = ['_'.join(col) for col in grp.columns.values] flattens them into single-level names of the form <column>_<operation>, e.g., fans_max.
You now have a DataFrame indexed by country with flat, descriptive column names. Calling .to_dict(orient='index') returns a dict whose keys are the countries and whose values are dicts of col_name: value.
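
The three steps can be sketched end to end as follows. This is a minimal illustration, not the course's reference solution: the sample data, the revenue column, and the particular operations chosen per column are assumptions made up for the example. Only the country grouping, the fans column, and the name grp come from the explanation above.

```python
import pandas as pd

# Hypothetical sample data; "revenue" and all values are assumptions for illustration.
df = pd.DataFrame({
    "country": ["US", "US", "FR", "FR"],
    "fans": [10, 30, 5, 15],
    "revenue": [100.0, 300.0, 50.0, 70.0],
})

# Step 1: pass a dict to .agg(); keys are column names, values are lists of operation names.
grp = df.groupby("country").agg({"fans": ["sum", "max"], "revenue": ["mean"]})
# At this point grp.columns is a MultiIndex of (column, operation) tuples.

# Step 2: flatten each (column, operation) tuple into "<column>_<operation>".
grp.columns = ["_".join(col) for col in grp.columns.values]

# Step 3: convert to a nested dict keyed by the index (the country).
result = grp.to_dict(orient="index")

print(list(grp.columns))
print(result["US"])
```

With this sample data, result["US"]["fans_max"] is 30, the larger of the two US fans values, and each country's inner dict has one entry per flattened column name.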
