What is the group_by() function in R Programming?
Overview
In R, the group_by() function from the dplyr package groups a data frame or tibble by one or more variables, so that subsequent operations are performed on each group separately. It works similarly to a pivot table in Excel or the GROUP BY clause in SQL.
Syntax
group_by(.data, ..., .add = FALSE, .drop = group_by_drop_default(.data))
Syntax of the group_by() function
Parameters
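The function takes the following arguments:
- .data: The data frame, tibble, or other table to group.
- ...: The variables (or computations on them) to group by.
- .add: Defaults to FALSE, in which case group_by() overrides any groups that already exist. Set .add = TRUE to add the new variables to the existing groups instead.
- .drop: Whether to drop groups formed by factor levels that do not appear in the data. The default, group_by_drop_default(.data), is TRUE unless .data was previously grouped with .drop = FALSE.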
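To make the .add and .drop arguments more concrete, here is a minimal sketch. The small data frame df and the variable names replaced, added, dropped, and kept are made up for illustration; only mtcars and the dplyr functions come from the library itself.

```r
library(dplyr, warn.conflicts = FALSE)

# Start with a data frame that is already grouped by vs
by_vs <- mtcars %>% group_by(vs)

# .add = FALSE (the default): the new grouping replaces the old one
replaced <- by_vs %>% group_by(am)
group_vars(replaced)   # "am"

# .add = TRUE: am is appended to the existing grouping
added <- by_vs %>% group_by(am, .add = TRUE)
group_vars(added)      # "vs" "am"

# Illustrative data: the factor level "c" never occurs in the rows
df <- data.frame(
  f = factor(c("a", "b"), levels = c("a", "b", "c")),
  x = 1:2
)

# Default .drop: the empty level "c" forms no group
dropped <- df %>% group_by(f)
n_groups(dropped)      # 2

# .drop = FALSE: the empty level "c" is kept as a group with zero rows
kept <- df %>% group_by(f, .drop = FALSE)
n_groups(kept)         # 3
```

Keeping empty groups with .drop = FALSE is mainly useful when a summary should report a zero count for a category instead of omitting it.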
Return value
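This function returns a grouped data frame (a tibble of class grouped_df). It contains the same rows and columns as the input; the grouping is stored as metadata that later operations use.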
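The returned object can be inspected with a few dplyr helpers. The following is a small sketch; grouping by cyl and the name grouped are chosen only for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

# Group mtcars by number of cylinders (illustrative choice of column)
grouped <- group_by(mtcars, cyl)

# The result carries the grouped_df class on top of the tibble classes
inherits(grouped, "grouped_df")   # TRUE
is_grouped_df(grouped)            # TRUE

# The grouping is metadata: the rows and columns themselves are unchanged
nrow(grouped) == nrow(mtcars)     # TRUE
group_vars(grouped)               # "cyl"
n_groups(grouped)                 # 3

# ungroup() removes the grouping metadata again
plain <- ungroup(grouped)
is_grouped_df(plain)              # FALSE
```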
Code example
In the code snippet below, we group the mtcars dataset by two of its columns, vs and am, to see how the group_by() function works:
# load the dplyr library
library(dplyr, warn.conflicts = FALSE)
# pipe mtcars into group_by() to group it by vs and am
by_vs_am <- mtcars %>% group_by(vs, am)
# summarise() collapses each group to one row and drops the last grouping level (am)
by_vs <- by_vs_am %>% summarise(total = n())
# print the summarised counts
print(by_vs)
Code explanation
- Line 2: We load the dplyr library, where warn.conflicts = FALSE suppresses the warnings about function names that conflict with other loaded packages.
- Line 4: We use group_by(vs, am) to group the mtcars dataset by vs (engine shape, either V-shaped or straight) and am (transmission, either automatic or manual). The %>% forward pipe operator passes mtcars to group_by() as its first argument.
- Line 6: We use summarise(total = n()) to collapse each group into a single row. It returns a tibble with an additional column, total, that counts the rows in each combination of vs and am. By default, summarise() also drops the last grouping level, so the result is still grouped by vs.
- Line 8: We print a 4x3 tibble with the vs, am, and total columns to the console.
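As a hedged sketch of how summarise() treats the grouping, the .groups argument can be set explicitly. This assumes dplyr 1.0.0 or later, where .groups is available; the names counts and flat are illustrative.

```r
library(dplyr, warn.conflicts = FALSE)

by_vs_am <- mtcars %>% group_by(vs, am)

# "drop_last" drops only the last grouping variable (am)
counts <- by_vs_am %>% summarise(total = n(), .groups = "drop_last")
group_vars(counts)   # "vs"

# "drop" removes all grouping from the result
flat <- by_vs_am %>% summarise(total = n(), .groups = "drop")
group_vars(flat)     # character(0)

# Both results hold the same counts, one row per vs/am combination
flat$total           # 12 6 7 7
```

Setting .groups explicitly also avoids the informational message that summarise() prints about the grouping of its output.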