Solution Review: Group By Aggregations

Explore how to perform group by aggregations in Python using Pandas, focusing on grouping data by a variable and calculating mean statistics. This lesson guides you through analyzing the Auto MPG dataset by cylinders and computing average miles per gallon, helping you understand data grouping and aggregation techniques.

Group by aggregations

Python 3.5
import pandas as pd
# Loading dataset
def read_csv():
    # Define the column names as a list
    names = ["mpg", "cylinders", "displacement", "horsepower",
             "weight", "acceleration", "model_year", "origin", "car_name"]
    # Read in the CSV file using regex for whitespace separation
    df = pd.read_csv("auto-mpg.data", header=None, names=names, sep=r"\s+")
    return df
# Grouping and aggregating data
def group_aggregation(df, group_var, agg_var):
    # Group by group_var and take the mean of agg_var within each group
    grouped_df = df.groupby([group_var])[agg_var].mean()
    return grouped_df
# Calling the function
print(group_aggregation(read_csv(), "cylinders", "mpg"))

According to the problem statement, we need to group the Auto MPG dataset by one column and then calculate the mean of another column within each group. Before doing that, we have to read the data. We covered reading data in detail previously, so we won't explain it again here. The dataset is read from line 4 to line 9. ...
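To see the group-then-average pattern without needing the data file, here is a minimal sketch on a small in-memory table. The values in `toy` are made up for illustration and are not taken from the Auto MPG dataset, and the name `group_mean` is a hypothetical helper that mirrors the solution's function.

```python
import pandas as pd

# Hypothetical toy data (NOT real Auto MPG values), used only to
# illustrate grouping by one column and averaging another.
toy = pd.DataFrame({
    "cylinders": [4, 4, 6, 6, 8],
    "mpg": [30.0, 34.0, 20.0, 22.0, 14.0],
})

def group_mean(df, group_var, agg_var):
    # Split the rows by group_var, select agg_var,
    # and compute the mean within each group
    return df.groupby(group_var)[agg_var].mean()

result = group_mean(toy, "cylinders", "mpg")
print(result)
```

The result is a Series indexed by the grouping column, with one mean per distinct value of `cylinders`. For this toy table, the two 4-cylinder rows average to 32.0, the two 6-cylinder rows to 21.0, and the single 8-cylinder row stays at 14.0.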