Conceptual Preparation

Starting with this chapter, we will learn how to use R to answer substantive questions through statistical inference. This chapter focuses on questions about a single continuous outcome variable. A continuous variable, such as income, can take any of infinitely many values within an interval. Economic growth serves as the running example throughout the chapter.

Specifically, we will use statistical inference to answer two substantive questions: What is the average economic growth rate in the world economy? Did the world economy grow faster in 1990 than in 1960? We will discuss in detail how to use statistical inference, sample data, and R to answer each question. The same process applies to similar questions about other continuous random variables of interest.

Logic of statistical inference

The main use of statistical modeling is to help us answer substantive research questions by making statistical inferences about a population of subjects from sample data. The population is the universe of all subjects of interest to an analyst, and a sample is a subset of that population, ideally one selected at random. When it is not feasible to collect data on the whole population of interest, statistical inference based on sample information becomes necessary. In short, we use the sample data to compute sample statistics. We then use these statistics to estimate the attributes, or parameters, of the population and draw inferences about them in a probabilistic manner.
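The relationship between a population parameter and a sample statistic can be sketched in R. This is only an illustration under stated assumptions: the population is simulated with `rnorm()`, and the names `population` and `growth_sample`, as well as the mean of 2 and standard deviation of 3, are hypothetical choices rather than real growth data.

```r
# Hypothetical population: 10,000 simulated growth rates (percent), not real data
set.seed(42)
population <- rnorm(10000, mean = 2, sd = 3)

# Population parameter: in practice, this is unknown to the analyst
mu <- mean(population)

# Random sample of 100 subjects, drawn without replacement
growth_sample <- sample(population, size = 100)

# Sample statistic used to estimate the population parameter
x_bar <- mean(growth_sample)

# Standard error: how much the sample mean is expected to vary across samples
se <- sd(growth_sample) / sqrt(length(growth_sample))

cat("Population mean:", round(mu, 2), "\n")
cat("Sample mean:    ", round(x_bar, 2), "\n")
cat("Standard error: ", round(se, 2), "\n")
```

The sample mean will usually differ somewhat from the population mean, and the standard error gives a rough sense of how large that difference tends to be. This uncertainty is what makes the inference probabilistic.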

Population parameters of interest

The table below illustrates some population parameters we can estimate and use to draw inferences via corresponding sample statistics. This chapter studies how to make inferences about a population mean and the difference between the population means of two groups.

Statistical inference informs us about population attributes based on the sample data. That is, it tells us, in probabilistic terms, how well sample statistics capture population parameters. In other words, we use the available sample information in the right column of the table to estimate, in a probabilistic manner, the unknown population attributes (parameters) shown in the left column.
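As a preview, the two kinds of inference this chapter covers, one population mean and the difference between two population means, can be sketched with base R's `t.test()`. The data here are simulated and the vector names `growth_1960` and `growth_1990` are hypothetical, so the numbers say nothing about the real world economy. The chapter's own analysis may proceed differently.

```r
# Simulated growth rates (percent) for two hypothetical years; not real data
set.seed(1)
growth_1960 <- rnorm(80, mean = 2.0, sd = 3)
growth_1990 <- rnorm(80, mean = 1.5, sd = 3)

# Inference about one population mean:
# the sample mean is the point estimate, and the 95% confidence
# interval expresses the uncertainty around it
one_mean <- t.test(growth_1960, conf.level = 0.95)
print(one_mean$estimate)
print(one_mean$conf.int)

# Inference about the difference between two population means.
# By default, t.test() runs Welch's two-sample test, which does not
# assume equal variances in the two groups
two_means <- t.test(growth_1990, growth_1960)
print(two_means$estimate)
print(two_means$conf.int)
print(two_means$p.value)
```

In both cases, the sample statistic is only an estimate. The confidence interval and p-value describe how much uncertainty surrounds it.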
