Introduction to Data and Needed Packages

Learn what tidy data is and how it’s imported.

Previously, we introduced the concept of a data frame in R: a rectangular, spreadsheet-like representation of data in which the rows correspond to observations and the columns correspond to variables describing each observation. We started by exploring our first data frame, the flights data frame included in the nycflights13 package. We created visualizations based on the data in flights and in other data frames, such as weather. We also learned how to take existing data frames and transform or modify them to suit our needs.
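As a quick refresher on the rows-as-observations, columns-as-variables idea, here is a minimal sketch using base R only. The toy_flights data frame and its values are hypothetical, invented for illustration; they are not taken from the flights data.

```r
# Hypothetical toy data, invented for illustration only.
# Each row is one observation (a flight); each column is one variable.
toy_flights <- data.frame(
  carrier   = c("AA", "UA", "DL"),   # airline code
  dep_delay = c(5, -2, 17),          # departure delay in minutes
  distance  = c(1089, 719, 762)      # distance flown in miles
)

nrow(toy_flights)    # number of observations (rows)
ncol(toy_flights)    # number of variables (columns)
names(toy_flights)   # the variable names
str(toy_flights)     # compact overview of each column and its type
```

The same functions can be applied to larger data frames such as flights once the package that provides them is loaded.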

Now, we’ll extend some of these ideas by discussing a type of data formatting called tidy data. We’ll see that storing data in tidy format means more than the everyday sense of the word tidy suggests, namely having our data “neatly organized.” Instead, we define tidy as the term is used by data scientists who work in R: a set of rules that govern how data is stored.

Needed packages

Let’s load all the packages needed for the code that follows.
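A minimal sketch of the loading step, under an assumed package list: only nycflights13 is named in this lesson, while dplyr, ggplot2, readr, and tidyr are assumptions, chosen because they are commonly used for tidy-data work in R. Adjust the list to match the packages your own code relies on.

```r
# Assumed package list: only nycflights13 is named in the lesson text;
# the remaining names are illustrative assumptions.
needed <- c("nycflights13", "dplyr", "ggplot2", "readr", "tidyr")

# Check which of the packages are not installed yet.
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
not_installed <- needed[!is_installed]

# Report anything that still has to be installed with install.packages().
if (length(not_installed) > 0) {
  message("Not installed: ", paste(not_installed, collapse = ", "))
}

# Attach every package that is available.
for (pkg in needed[is_installed]) {
  library(pkg, character.only = TRUE)
}
```

If every package is already installed, this is equivalent to calling library() once per package; the check only makes a missing package easier to spot.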
