Introduction to the Data and Needed Packages
Explore how to import spreadsheet data such as CSV and Excel files into R using Tidyverse functions. Understand the principles of tidy data and how to organize your data frames effectively for easier manipulation and analysis. This lesson equips you with the skills to load and prepare external datasets for data-centric statistical inference.
Previously, we introduced the concept of a data frame in R: a rectangular, spreadsheet-like representation of data in which the rows correspond to observations and the columns correspond to variables describing each observation. We began by exploring our first data frame, flights, which is included in the nycflights13 package. We created visualizations based on the data in flights and in other data frames, such as weather. We also learned how to transform or modify existing data frames to suit our needs.
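As a quick refresher on the rows-as-observations, columns-as-variables idea, here is a minimal sketch in base R. The data frame `toy_flights` and its values are hypothetical, made up for illustration. Only the column names are borrowed from flights, and no package needs to be loaded.

```r
# Hypothetical toy data: three made-up flights, not rows from nycflights13.
toy_flights <- data.frame(
  carrier   = c("UA", "AA", "B6"),     # variable: airline code
  origin    = c("EWR", "JFK", "LGA"),  # variable: departure airport
  dep_delay = c(2, -4, 15)             # variable: departure delay in minutes
)

nrow(toy_flights)   # number of observations (rows)
ncol(toy_flights)   # number of variables (columns)
names(toy_flights)  # the variable names
str(toy_flights)    # compact overview of each column's type and values
```

Each row describes one flight and each column records one characteristic of that flight. The same inspection functions also work on flights once nycflights13 is loaded.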
Now, we’ll extend some of these ideas by discussing a type of data formatting called tidy data. We’ll see that storing data in tidy format means more than the everyday sense of the word tidy, that is, having our data “neatly organized.” Instead, we define tidy the way data scientists who use R do: as a set of rules that govern how data is stored. ...