Course Overview

Get a basic understanding of the learning outcomes of the course.

Why use R?

With rapid progress in statistical computing, proficiency in using statistical software has become almost a universal requirement in statistical methods courses, albeit to varying degrees. Popular software choices include SAS, SPSS, Stata, and R.

R has three advantages over commercial packages such as SAS, SPSS, and Stata.

  • R is a well-thought-out, coherent system that comes with a suite of software facilities for data management, visualization, and analysis.
  • A large community of R users constantly develops new open-source add-on packages. There are already over 10,000 of these packages.
  • Finally, perhaps the greatest perk of the software is that it’s free.

There are many reasons why R is preferred to other statistical software packages in higher education. But the greatest obstacle to R’s widespread use in the social sciences is its steep learning curve.

What’s this course about?

This course is designed for learners in the social sciences. It covers how to use R to manage, visualize, and analyze data to answer substantive research questions and to reproduce the statistical analyses in published journal articles.

What’s different in this course?

This course distinguishes itself from other introductory R or statistics courses in three important ways.

  • First, it serves as an introduction to using R for data analysis projects, targeting an audience with little or no exposure to statistical programming.

  • A second unique feature of this course is its emphasis on meeting the practical needs of students who use R to conduct statistical analysis for research projects driven by substantive questions in the social sciences.

  • A third unique feature of this course is its emphasis on teaching students how to replicate statistical analyses in published journal articles.

This course focuses primarily on analyzing a single continuous outcome variable and the relevant statistical techniques, such as the mean, the difference of means, covariance, correlation, and cross-sectional regression. Comprehensiveness in both programming and statistics is therefore purposefully sacrificed for greater accessibility, clarity, and depth. The goal is to make this course accessible and useful for novices in both programming and data analysis.

Learning outcomes

In sum, this course integrates R programming, the logic and steps of statistical inference, and the process of empirical social science research in a highly accessible and structured fashion.

The course shows how to do the following:

  • Use R to import data, inspect data, identify dataset attributes, and manage observations, variables, and datasets.

  • Use R to draw simple histograms, box plots, and scatter plots, and to graph research findings.

  • Use R for summarizing data, conducting a one-sample t-test, testing the difference of means between groups, computing covariance and correlation, estimating and interpreting ordinary least squares (OLS) regression, and diagnosing and correcting regression assumption violations.

  • Replicate research findings in published journal articles.
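
To give a rough sense of what these outcomes look like in practice, here is a minimal sketch in base R. It is an illustration only, not material taken from the course: it assumes R's built-in mtcars dataset as a stand-in for a research dataset, and the variable choices (mpg, wt, am) and the test value of 20 are arbitrary.

```r
# Illustrative sketch only: mtcars is R's built-in example data,
# used here as an assumed stand-in for a real research dataset.
data(mtcars)

# Inspect the data and identify dataset attributes
str(mtcars)           # variable names and types
dim(mtcars)           # number of observations and variables
summary(mtcars$mpg)   # summary statistics for one continuous outcome

# Simple graphs: histogram, box plot, scatter plot
hist(mtcars$mpg, main = "Miles per gallon", xlab = "mpg")
boxplot(mpg ~ am, data = mtcars, xlab = "am (0 = automatic, 1 = manual)", ylab = "mpg")
plot(mtcars$wt, mtcars$mpg, xlab = "Weight (1000 lbs)", ylab = "mpg")

# One-sample t-test: is mean mpg different from 20? (20 is an arbitrary value)
t_one <- t.test(mtcars$mpg, mu = 20)

# Difference of means between two groups (Welch t-test by default)
t_diff <- t.test(mpg ~ am, data = mtcars)

# Covariance and correlation between two continuous variables
cov_wt_mpg <- cov(mtcars$wt, mtcars$mpg)
cor_wt_mpg <- cor(mtcars$wt, mtcars$mpg)

# OLS regression of mpg on weight, with a basic residual plot as a first diagnostic
fit <- lm(mpg ~ wt, data = mtcars)
summary(fit)
plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
```

Each call above is a standard base R function; a real project would replace mtcars with imported data (for example via read.csv) and choose variables according to the research question.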