Lesson 4: Confidence Intervals

Objectives

Upon completion of this lesson, you will be able to:

  • Construct and interpret sampling distributions using StatKey
  • Explain the general form of a confidence interval
  • Interpret a confidence interval
  • Explain the process of bootstrapping
  • Construct bootstrap confidence intervals using the standard error method
  • Construct bootstrap confidence intervals using the percentile method in StatKey
  • Construct bootstrap confidence intervals using Minitab Express
  • Describe how sample size impacts a confidence interval

This lesson corresponds to Chapter 3 in the Lock5 textbook. In Lessons 2 and 3 you learned about descriptive statistics. Lesson 4 begins our coverage of inferential statistics, which uses data from a sample to make an inference about a population. Confidence intervals use data collected from a sample to estimate a population parameter.

Confidence Interval
A range computed using sample statistics to estimate an unknown population parameter with a stated level of confidence

In this lesson we will be working with the following statistics and parameters:

                                   Population Parameter    Sample Statistic
Mean                               \(\mu\)                 \(\overline x\)
Difference in two means            \(\mu_1 - \mu_2\)       \(\overline x_1 - \overline x_2\)
Proportion                         \(p\)                   \(\widehat p\)
Difference in two proportions      \(p_1 - p_2\)           \(\widehat p_1 - \widehat p_2\)
Correlation                        \(\rho\)                \(r\)
Slope (simple linear regression)   \(\beta\)               \(b\)

Before we begin, let's review population parameters and sample statistics.

Population parameters are fixed values. We rarely know the parameter values because it is often difficult to obtain measures from the entire population.

Sample statistics are known values, but they are random variables because they vary from sample to sample.
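The contrast between a fixed parameter and a varying statistic can be seen in a small simulation. The following is a minimal sketch, not part of the lesson's StatKey or Minitab Express workflow: the population is made-up data, and the function name `draw_sample_means`, the population size, the sample size, and the seeds are all assumptions chosen for illustration.

```python
import random
import statistics


def draw_sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples simple random samples and return each sample's mean."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population of 10,000 made-up values (not real data).
pop_rng = random.Random(42)
population = [pop_rng.gauss(50, 10) for _ in range(10_000)]

# The parameter is a single fixed number. Here we can compute it only
# because we invented the whole population; in practice it is unknown.
mu = statistics.mean(population)

# Each sample gives a known statistic, but its value differs by sample.
sample_means = draw_sample_means(population, sample_size=100, n_samples=5)

print(f"population mean (parameter): {mu:.2f}")
for i, x_bar in enumerate(sample_means, start=1):
    print(f"sample {i} mean (statistic): {x_bar:.2f}")
```

Running the sketch prints one population mean and five different sample means. The sample means cluster near the population mean but none of them is guaranteed to equal it.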

Example: Campus Commuters

A survey is carried out at a university to estimate the proportion of undergraduate students who drive to campus to attend classes. One thousand students are randomly selected and asked whether or not they drive to campus to attend classes. The population is all of the undergraduates at that university. The sample is the group of 1000 undergraduate students surveyed. The parameter is the true proportion of all undergraduate students at that university who drive to campus to attend classes. The statistic is the proportion of the 1000 sampled undergraduates who drive to campus to attend classes.
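To make the four terms concrete, the survey can be mimicked in code. This is a rough sketch under stated assumptions: the enrollment of 20,000 and the 7,000 drivers are invented numbers, not figures from the example, and `sample_proportion` is a hypothetical helper. Only the sample size of 1000 comes from the example itself.

```python
import random


def sample_proportion(population, sample_size, seed=0):
    """Return the proportion of True values in one simple random sample."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size


# Assumed values for illustration only (not from the example).
N_STUDENTS = 20_000  # hypothetical undergraduate enrollment
N_DRIVERS = 7_000    # hypothetical number who drive to campus

# Population: one entry per undergraduate, True if the student drives.
population = [True] * N_DRIVERS + [False] * (N_STUDENTS - N_DRIVERS)

# Parameter: the true proportion p, fixed but unknown in a real survey.
p = N_DRIVERS / N_STUDENTS

# Statistic: the proportion p-hat among the 1000 sampled students.
p_hat = sample_proportion(population, sample_size=1000)

print(f"parameter p     = {p:.3f}")
print(f"statistic p-hat = {p_hat:.3f}")
```

Changing the `seed` argument stands in for surveying a different random group of 1000 students: the statistic changes while the parameter stays the same.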

Example: Annual Income in California

A study is conducted to estimate the true mean annual income of all adult residents of California. The study randomly selects 2000 adult residents of California. The population consists of all adult residents of California. The sample is the 2000 residents in the study. The parameter is the true mean annual income of all adult residents of California. The statistic is the mean annual income of the 2000 residents in this sample.

Ultimately, we measure sample statistics and use them to draw conclusions about unknown population parameters. This is statistical inference.