Trusted answers to developer questions

Related Tags

tidyr
spread
r programming
communitycreator

What is the tidyr spread() function in R?

Salman Yousaf


Overview

The spread() function from the tidyr library spreads a key-value pair across multiple columns. Each unique value in the key column becomes a new column, and the corresponding entries of the value column fill it. This reshapes the data from long format to wide format.

spread() is the inverse of gather(). gather() collapses multiple columns into key-value pairs, while spread() lays out each key-value pair across multiple columns.
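The inverse relationship can be sketched with a small round trip. The table below and its column names (player, points, assists, statistic, amount) are made up for illustration and are not part of the original answer.

```r
library(tidyr)

# Hypothetical wide table: one row per player, one column per statistic
wide <- data.frame(
  player  = c("A", "B"),
  points  = c(13, 21),
  assists = c(7, 8)
)

# gather() collapses the points and assists columns into key-value pairs
long <- gather(wide, key = "statistic", value = "amount", points, assists)

# spread() lays the key-value pairs back out as separate columns
wide_again <- spread(long, key = statistic, value = amount)

print(long)
print(wide_again)
```

The round trip should recover the same values as the starting table, although the order of the spread columns may differ from the original.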

Working of the spread() function

In the illustration, spread() takes key=genus and value=mean_weight and creates one column per genus (Baiomys, Chaetodipus, and Dipodomys), each filled with the matching mean_weight values. This converts the long-format data into wide format.
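As a rough sketch of the reshaping described here, the snippet below builds a small long-format table and spreads it. The plot_id column and all weight values are invented for illustration. Only the genus and mean_weight column names and the three genus names come from the text.

```r
library(tidyr)

# Made-up long-format data: one row per (plot_id, genus) pair
surveys_long <- data.frame(
  plot_id     = c(1, 1, 1, 2, 2, 2),
  genus       = rep(c("Baiomys", "Chaetodipus", "Dipodomys"), times = 2),
  mean_weight = c(7.0, 22.2, 60.2, 6.0, 25.1, 55.7)
)

# Each unique genus becomes a column filled with its mean_weight values
surveys_wide <- spread(surveys_long, key = genus, value = mean_weight)

print(surveys_wide)
```

The six long rows collapse into two wide rows, one per plot_id, with one column for each genus.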

Syntax


spread(data, key, value)

Parameters

  • data: The DataFrame to reshape.
  • key: The column whose unique values become the names of the newly created columns.
  • value: The column whose values fill the new columns created from key.
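Beyond the three parameters listed above, spread() also accepts optional arguments. One of them is fill, which sets the value used when a key-value combination is missing (NA by default). The following is a minimal sketch with made-up data in which player B has no assists row.

```r
library(tidyr)

# Hypothetical data: player B has no "assists" row
scores <- data.frame(
  player    = c("A", "A", "B"),
  statistic = c("points", "assists", "points"),
  amount    = c(13, 7, 21)
)

# Default behavior: the missing combination is filled with NA
with_na <- spread(scores, key = statistic, value = amount)

# fill = 0 substitutes 0 for the missing combination instead
with_zero <- spread(scores, key = statistic, value = amount, fill = 0)

print(with_na)
print(with_zero)
```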

Return value

It returns a DataFrame of the same type as data (for example, a tibble stays a tibble), reshaped into wide format.

Code

Let's look at a coding example of the spread() function. We'll create a DataFrame in long format and then invoke spread() to lay out its key-value pairs across separate columns.

# Demo program showing how spread() works
# loading the tidyr library
library("tidyr")
# creating a DataFrame
df <- data.frame(player_category=rep(c('A', 'B'), each=4),
                 experience_in_years=rep(c(1, 1, 2, 2), times=2),
                 statistics=rep(c('points', 'assists'), times=4),
                 amount=c(13, 7, 10, 3, 21, 8, 58, 2))
# show DataFrame on console
print(df)
# Invoking spread() function
# spreading statistics column across multiple columns
spread(df, key=statistics, value=amount)

Explanation

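  • Lines 5 to 8: We create a DataFrame that contains players' records in long format, with one row per player category, experience level, and statistic.
  • Line 10: We print the DataFrame created above on the console.
  • Line 13: We use the spread() function to turn each unique value of the statistics column (points and assists) into its own column, filled with the matching values from amount.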
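Recent tidyr releases (1.0.0 and later) mark spread() as superseded in favor of pivot_wider(). As a sketch of how the same reshaping might look with the newer function, assuming such a tidyr version is installed, the example rebuilds the DataFrame from above and applies both calls.

```r
library(tidyr)

# Same DataFrame as in the example above
df <- data.frame(player_category = rep(c("A", "B"), each = 4),
                 experience_in_years = rep(c(1, 1, 2, 2), times = 2),
                 statistics = rep(c("points", "assists"), times = 4),
                 amount = c(13, 7, 10, 3, 21, 8, 58, 2))

# spread(): key and value name the two columns to reshape
wide_spread <- spread(df, key = statistics, value = amount)

# pivot_wider(): names_from plays the role of key, values_from the role of value
wide_pivot <- pivot_wider(df, names_from = statistics, values_from = amount)

print(wide_spread)
print(wide_pivot)
```

Both results hold one row per combination of player_category and experience_in_years. The order of the new columns may differ between the two functions, and pivot_wider() returns a tibble.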
