Group by Rows

Explore how to group the rows of a data frame by one or more variables using the group_by() function in dplyr. Understand that grouping changes only a data frame's metadata, not its data, until a function such as summarize() is applied. Learn to apply this to datasets like temperature, diamonds, and flights, including counting and summarizing grouped data efficiently.
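As a minimal sketch of the point about metadata, the snippet below groups a data frame and checks what changed. It assumes dplyr is installed and uses R's built-in airquality dataset (columns Month and Temp) as a stand-in, since it is not one of the lesson's own datasets. The names by_month and monthly are illustrative only.

```r
library(dplyr)

# airquality ships with base R; it stands in for the lesson's datasets here
by_month <- airquality %>% group_by(Month)

# Grouping leaves the rows and columns untouched
dim(airquality)
dim(by_month)

# It only records the grouping variable as metadata
group_vars(by_month)
n_groups(by_month)

# The grouping takes effect once summarize() is applied: one row per group
monthly <- by_month %>%
  summarize(mean_temp = mean(Temp, na.rm = TRUE))
monthly
```

Calling ungroup() on by_month would drop the grouping metadata again and return an ordinary, ungrouped data frame.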

Example with the temperature dataset

Instead of a single mean temperature for the whole year, we would like 12 mean temperatures, one for each of the 12 months separately. In other words, we would like to compute the mean temperature split by month. We can do this by grouping temperature observations by the values of another variable, and in this case, by the 12 values of the variable month. Run the following code:

R
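# Assumes dplyr is loaded and `weather` is available (e.g., from the nycflights13 package)
summary_monthly_temp <- weather %>%
  group_by(month) %>%                          # one group per month
  summarize(mean = mean(temp, na.rm = TRUE),   # monthly mean, ignoring missing values
            std_dev = sd(temp, na.rm = TRUE))  # monthly standard deviation
summary_monthly_temp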

This code is identical to the previous code that created summary_temp, but with an extra group_by(month) added before the summarize(). Grouping the weather dataset by month and then applying the summarize() function yields ...