Trusted answers to developer questions

Related Tags

r programming

What is the group_by() function in R Programming?

AKASH BAJWA

Overview

In R, the group_by() function from the dplyr package takes a data frame or tibble and groups its rows by one or more columns, so that later operations such as summarise() or mutate() are performed separately for each group. It works much like a pivot table in Excel or the GROUP BY clause in SQL.

How group_by() works

Syntax


group_by(.data, ..., .add = FALSE, .drop = group_by_drop_default(.data))
Syntax of 'group_by()' function

Parameters

It takes the following arguments:

  • .data: A data frame, tibble, or other table that dplyr supports.
  • ...: The columns (or expressions computed from them) to group by.
  • .add: The default value is FALSE, in which case group_by() replaces any existing grouping. If it is TRUE, the new grouping variables are added to the existing ones.
  • .drop: Whether to drop groups formed by factor levels that don't appear in the data. The default, group_by_drop_default(.data), is TRUE unless .data was previously grouped with .drop = FALSE.
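
The effect of .add and .drop is easiest to see on small inputs. The sketch below assumes dplyr 1.0.0 or later; the data frame df and the variable names replaced, added, n_dropped, and n_kept are made up for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

# .add = FALSE (the default): a second group_by() replaces the existing grouping
replaced <- mtcars %>% group_by(vs) %>% group_by(am)
group_vars(replaced)   # "am"

# .add = TRUE: the new variable is appended to the existing grouping
added <- mtcars %>% group_by(vs) %>% group_by(am, .add = TRUE)
group_vars(added)      # "vs" "am"

# .drop only matters for factor columns that have unused levels
# (df is a hypothetical data frame; level "c" never occurs in f)
df <- data.frame(
  f = factor(c("a", "a", "b"), levels = c("a", "b", "c")),
  x = c(1, 2, 3)
)
n_dropped <- df %>% group_by(f) %>% n_groups()                 # unused level dropped
n_kept    <- df %>% group_by(f, .drop = FALSE) %>% n_groups()  # empty group kept
```

With .drop = FALSE, the unused level "c" is kept as an empty group, which can be useful when a summary should report zero counts.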

Return value

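This function returns a grouped data frame (a tibble with the class grouped_df). When grouping by existing columns, the data itself is unchanged; only the grouping information is added.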
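
To check what group_by() returned, the result can be inspected with a few helper functions. This is a minimal sketch; grouping mtcars by cyl is just an illustrative choice, and the variable name grouped is hypothetical.

```r
library(dplyr, warn.conflicts = FALSE)

# group the built-in mtcars dataset by number of cylinders
grouped <- mtcars %>% group_by(cyl)

# the result carries the grouped_df class on top of the tibble classes
is_grouped <- inherits(grouped, "grouped_df")

# grouping columns and number of groups (cyl takes the values 4, 6, and 8)
vars   <- group_vars(grouped)
groups <- n_groups(grouped)

# grouping does not add or remove rows
same_rows <- nrow(grouped) == nrow(mtcars)
```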

Code example

In the code snippet below, we group the mtcars dataset by two of its columns, vs and am, to see how the group_by() function works:

# load the dplyr library
library(dplyr, warn.conflicts = FALSE)
# pipe mtcars into group_by() to group its rows by vs and am
by_vs_am <- mtcars %>% group_by(vs, am)
# summarise() counts the rows in each group and drops the last grouping level (am)
by_vs <- by_vs_am %>% summarise(total = n())
# print the summary, which is still grouped by vs
print(by_vs)
Using the group_by() function in R

Code explanation

  • Line 2: We load the dplyr library. Setting warn.conflicts = FALSE suppresses the messages about dplyr functions that mask functions of the same name in other packages.
  • Line 4: We use the forward pipe operator %>% to pass mtcars to group_by(vs, am), which groups the rows by vs (engine shape: 0 = V-shaped, 1 = straight) and am (transmission: 0 = automatic, 1 = manual).
  • Line 6: We use summarise(total = n()) to collapse each group into a single row. It returns a tibble with an additional total column that holds the number of rows for each combination of vs and am. It also drops the last grouping level, so the result is grouped by vs only.
  • Line 8: We print a 4x3 tibble with the vs, am, and total columns to the console.
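
Because summarise() removes only the last grouping level, the result of the example is still grouped by vs. The sketch below shows two ways to get a fully ungrouped result, assuming dplyr 1.0.0 or later (which introduced the .groups argument); the names counts, flat, and also_flat are illustrative.

```r
library(dplyr, warn.conflicts = FALSE)

# default behaviour made explicit: drop only the last grouping level (am)
counts <- mtcars %>%
  group_by(vs, am) %>%
  summarise(total = n(), .groups = "drop_last")
group_vars(counts)      # "vs"

# drop all grouping directly in summarise()
flat <- mtcars %>%
  group_by(vs, am) %>%
  summarise(total = n(), .groups = "drop")
group_vars(flat)        # character(0)

# or remove the remaining grouping afterwards with ungroup()
also_flat <- ungroup(counts)
```

Setting .groups explicitly also silences the informational message that summarise() prints when it has to choose the grouping of the output itself.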
