Recap: Data Wrangling

Explore essential data wrangling verbs and learn how to combine and manipulate datasets using dplyr and Tidyverse. Understand how to calculate available seat miles by joining airline data, preparing you for effective data transformation and analysis in R.

Summary table

Let’s recap our data wrangling verbs in the table below. Using these verbs and the pipe operator %>%, we can write code that is easy to read. Together, they cover almost all of the data wrangling and data transformation needed for the rest of this course.

Summary of Data Wrangling Verbs

| Verb | Data Wrangling Operation |
|---|---|
| filter() | Picks out a subset of rows |
| summarize() | Summarizes many values to one using a summary statistic function like mean(), median(), etc. |
| group_by() | Adds grouping structure to rows in a data frame; note that this doesn’t change values in the data frame, but rather only the metadata |
| mutate() | Creates new variables by mutating existing ones |
| arrange() | Arranges rows by the values of a variable in ascending (default) or descending order |
| inner_join() | Joins/merges two data frames, matching rows by a key variable |
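
To see how the verbs in the table fit together, here is a minimal sketch that chains all six with the pipe operator. The data frames flights_toy and airlines_toy, their columns, and their values are hypothetical and made up for illustration only; they are not the course's dataset.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration only
flights_toy <- data.frame(
  carrier  = c("AA", "AA", "UA", "UA", "DL"),
  distance = c(500, 1500, 800, 1000, 300),
  delay    = c(10, 20, 5, 15, 0)
)
airlines_toy <- data.frame(
  carrier = c("AA", "UA", "DL"),
  name    = c("Airline A", "Airline U", "Airline D")
)

carrier_summary <- flights_toy %>%
  filter(distance >= 500) %>%                     # keep only rows with distance of at least 500
  group_by(carrier) %>%                           # add grouping metadata; values are unchanged
  summarize(
    mean_delay     = mean(delay),                 # one summary value per carrier
    total_distance = sum(distance)
  ) %>%
  mutate(mean_delay_hours = mean_delay / 60) %>%  # new variable derived from an existing one
  inner_join(airlines_toy, by = "carrier") %>%    # match rows on the key variable carrier
  arrange(desc(total_distance))                   # sort rows in descending order

carrier_summary
```

Each step takes the data frame produced by the previous one, so the chain reads top to bottom in the order the operations are applied. Because the DL row is filtered out before the join, inner_join() returns only the carriers present in both data frames.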

An airline industry measure of a passenger airline’s capacity is the available seat miles, which is equal to ...