Trusted answers to developer questions

What is tidyr gather() in R?

Free System Design Interview Course

Many candidates are rejected or down-leveled due to poor performance in their System Design Interview. Stand out in System Design Interviews and get hired in 2024 with this popular free course.

Overview

The tidyr package in R helps create tidy data, providing different build-in functions used for data cleaning.

gather() is used to gather multiple columns and collapse them into key-value pairs. It is invoked from the tidyr package with different argument values, as shown below.

How does the gather() function work?

In the graphical illustration above, we've gathered the cinnamon_1, cinnamon_2, and nutmeg_3 columns into key and values pairs. We created a new column, spiceKey, as a merged key, and observations for Emma, Harry, Ruby, and Zainab as correctValue.

Syntax


gather(data, key, value, ….)

Parameters

It takes the following argument values.

  • data: The name of the DataFrame.
  • key: The name of the key column that is to be created.
  • value: The name of the value column that is to be created.
  • : This specifies the column from which the key-value pair will be gathered.

Return value

It returns merged information of the same type as data (an argument value).

Explanation

Let's look at an example of the gather() function below:

# demo program to show the working of gather()
# importing tidyr library
library("tidyr")
# creating a DataFrame
df <- data.frame(players=c('Amber', 'Paisley', 'Roxanne', 'Scarlett'),
Year_2012=c(12, 5, 7, 19),
Year_2013=c(15, 25, NA, 29))
cat("Before Gather() Called\n")
#viewing data frame
print(df)
cat("\nAfter Gather() Called\n")
# gathering the data from columns 2 and 3
gather(df, key="Year", value="Matches Played", 2:3)
  • Line 5: We create a DataFrame that contains three columns and five observations.
  • Line 10: We print DataFrame before the gather() function is called.
  • Line 13: This line of code shows how the gather() function will combine the players as the key and Year_2020, Year_2022 as values. Finally, we print the new DataFrame on the console.

RELATED TAGS

gather
tidyr
r programming
r
Did you find this helpful?