Data Preparation

Learn data preparation steps for statistical inference.

We need to complete the following tasks for data preparation:

  1. Remove all objects to clean the workspace in R.
  2. Create a Project folder to hold the original Penn World Table data, program, and output files.
  3. Create a well-documented R program to read the original dataset into R.
  4. Inspect the imported data to confirm that the original dataset was read into R correctly.
  5. Fix any data problems found during inspection.
  6. Create a new dataset using a subset of the original dataset.
  7. Create new variables for later use.
  8. Install the required add-on packages.
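
The steps above can be sketched in base R as follows. This is a minimal illustration, not the course's actual program: the temporary folder path, the file name pwt_toy.csv, the toy values, the column names (country, year, rgdpl), the cleaning rule, the subset condition, and the .RData file name are all assumptions made so that the example runs on its own without the real Penn World Table file.

```r
# Step 1: remove all objects to start from a clean workspace.
rm(list = ls())

# Step 2: create a Project folder. A temporary location is used here;
# in practice this would be a folder of your choosing (assumption).
project_dir <- file.path(tempdir(), "Project")
dir.create(project_dir, showWarnings = FALSE, recursive = TRUE)

# Stand-in for the original data file: a tiny made-up CSV.
# These are NOT real Penn World Table values.
raw_file <- file.path(project_dir, "pwt_toy.csv")
toy <- data.frame(
  country = c("A", "A", "A", "B", "B", "B"),
  year    = c(2000, 2001, 2002, 2000, 2001, 2002),
  rgdpl   = c(100, 110, 121, 200, NA, 220)
)
write.csv(toy, raw_file, row.names = FALSE)

# Step 3: read the original dataset into R.
pwt <- read.csv(raw_file, stringsAsFactors = FALSE)

# Step 4: inspect the imported data (structure and summary statistics).
str(pwt)
print(summary(pwt))

# Step 5: fix data problems. Here the only rule is to drop rows
# with a missing rgdpl (a hypothetical cleaning rule).
pwt_clean <- pwt[!is.na(pwt$rgdpl), ]

# Step 6: create a new dataset from a subset of the cleaned data
# (the year cutoff is a hypothetical condition).
pwt_sub <- subset(pwt_clean, year >= 2001, select = c(country, year, rgdpl))

# Step 7: create a new variable for later use (hypothetical: log of rgdpl).
pwt_sub$log_rgdpl <- log(pwt_sub$rgdpl)

# Save the clean dataset as pwt7g in the Project folder.
pwt7g <- pwt_sub
save(pwt7g, file = file.path(project_dir, "pwt7g.RData"))
```

Keeping each step under its own comment makes the program easier to audit later, which is the point of a well-documented script.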

We use the following add-on packages in this section:

  • DataCombine
  • ggplot2
  • Rmisc
  • stargazer

Each package needs to be installed only once, for example with install.packages("DataCombine"). At the end of data preparation, the clean dataset is saved as a new R dataset, pwt7g, in the Project folder.
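
Because installation is a one-time task, one possible pattern is to install only the packages that are not yet present. The sketch below is an illustration rather than the course's own code: the variable names pkgs and missing_pkgs are hypothetical, and the interactive() guard is an added assumption so that non-interactive runs do not attempt a network install.

```r
# The four add-on packages listed above.
pkgs <- c("DataCombine", "ggplot2", "Rmisc", "stargazer")

# Names of the listed packages that are not installed yet.
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))

# Install the missing ones. The interactive() guard (an assumption of this
# sketch) skips installation when the script runs in batch mode.
if (interactive() && length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```

Once installed, a package still has to be loaded in each new R session, for example with library(ggplot2).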
