8.1.1 - Confidence Intervals

On the following pages you will see how a confidence interval for a population proportion can be constructed by hand using the normal approximation method. Using Minitab Express, you will learn how to construct a confidence interval for a proportion using the normal approximation method or the exact method. When given the option, it is recommended that you use Minitab Express as opposed to performing calculations by hand.


8.1.1.1 - Normal Approximation Formulas

For the following procedures, the assumption is that both \(np \geq 10\) and \(n(1-p) \geq 10\). When we're constructing confidence intervals, \(p\) is typically unknown, in which case we use \(\widehat{p}\) as an estimate of \(p\).

Note that \(n \widehat p\) is the number of successes in the sample and \(n(1- \widehat p)\) is the number of failures in the sample. 

This means that our sample needs to have at least 10 "successes" and at least 10 "failures" in order to construct a confidence interval using the normal approximation method. 

Below is the general form of a confidence interval.

General Form of Confidence Interval
\(sample\ statistic \pm(multiplier)\ (standard\ error)\)

The sample statistic here is the sample proportion, \(\widehat p\). When using the normal approximation method the multiplier is taken from the standard normal distribution (i.e., z distribution).  And, the standard error is computed using \(\widehat p\) as an estimate of \(p\): \(\sqrt{\frac{\hat{p} (1-\hat{p})}{n}}\). This leaves us with the following formula to construct a confidence interval for a population proportion:

Confidence Interval of \(p\): Normal Approximation Method
\(\widehat{p} \pm z^{*} \left ( \sqrt{\frac{\hat{p} (1-\hat{p})}{n}} \right)\)

Finding the z* Multiplier

The value of the \(z^*\) multiplier depends on the level of confidence. The multiplier for the confidence interval for a population proportion can be found using the standard normal distribution [i.e., z distribution, N(0,1)]. The most commonly used level of confidence is 95%. As shown on the probability distribution plot below, the multiplier associated with a 95% confidence interval is 1.960, often rounded to 2 (recall the Empirical Rule and 95% Rule).

Standard normal distribution showing the z multipliers for a 95% confidence interval

Below is a table of frequently used \(z^*\) multipliers.

Confidence level and corresponding multiplier.
Confidence Level \(z^*\) Multiplier
90% 1.645
95% 1.960, often rounded to 2
98% 2.326
99% 2.576

The value of the multiplier increases as the confidence level increases. This leads to wider intervals for higher confidence levels. We are more confident of catching the population value when we use a wider interval.
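
The multipliers in the table can also be computed rather than looked up. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function name `z_multiplier` is made up for this illustration and is not part of these notes or of Minitab Express.

```python
from statistics import NormalDist

def z_multiplier(confidence):
    """Return z* so that the central `confidence` area of N(0, 1) lies in [-z*, z*]."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)

# Print the multiplier for each confidence level listed in the table.
for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: z* = {z_multiplier(level):.3f}")
```

Each printed value should agree with the table to three decimal places, and the values increase with the confidence level, which is why higher confidence produces wider intervals.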


8.1.1.1.1 - Video Example: PA Residency

8.1.1.1.2 - Video Example: Dog Ownership

In Spring 2016, a sample of 522 World Campus students were surveyed and asked if they own a dog. Of the 522 students in the sample, 273 said that they did have a dog. Construct a 95% confidence interval for the proportion of all World Campus students who have a dog.


8.1.1.1.3 - Video Example: Books

8.1.1.1.4 - Example: Seatbelt Usage

In the 2001 Youth Risk Behavior Survey conducted by the U.S. Centers for Disease Control, 747 out of 1168 female 12th graders said they always use a seatbelt when driving. Let’s construct a 95% confidence interval for the proportion of 12th grade females in the population who always use a seatbelt when driving.

\(\widehat{p}=\frac{747}{1168}=0.640\)

First we need to check our assumptions that both \(np \geq 10\) and \(n(1-p) \geq 10\). Because \(p\) is unknown, we use \(\widehat{p}\) in its place.

\(n\widehat{p}=747\) (the number of successes) and \(n(1-\widehat{p})=1168-747=421\) (the number of failures). Both are greater than 10, so this assumption has been met and we can use the standard normal approximation with these data.

Now we can compute the standard error.

\(SE=\sqrt{\frac{\hat{p} (1-\hat{p})}{n}}=\sqrt{\frac{0.640 (1-0.640)}{1168}}=0.014\)

The \(z^*\) multiplier for a 95% confidence interval is 1.960.

Our 95% confidence interval for \(p\) is \(0.640\pm 1.960(0.014)=0.640\pm0.028=[0.612, \;0.668]\)

We are 95% confident that between 61.2% and 66.8% of all 12th grade females say that they always use a seatbelt when driving.

What if we wanted a 99% confidence interval?

Let’s think about how our interval will change. The 99% confidence interval will be wider than the 95% confidence interval. In order to increase our level of confidence, we will need to expand the interval.

In terms of computing the 99% confidence interval, we will use the same point estimate \(\widehat{p}\) and the same standard error. The multiplier will change, though. From the plot below, we see that the \(z^*\) multiplier for a 99% confidence interval is 2.576. The standard error is still 0.014; it has not changed because neither \(n\) nor \(\widehat{p}\) has changed.

Standard normal distribution showing the z multipliers for a 99% confidence interval

\(99\%\;C.I.:\;0.640\pm 2.576 (0.014)=0.640\pm 0.036=[0.604, \; 0.676]\)

We are 99% confident that between 60.4% and 67.6% of all 12th grade females say that they always use a seatbelt when driving.
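
The hand calculations in this example can be checked with a few lines of Python. This is only a sketch: the function name `ci_from_multiplier` is invented for the illustration, and the multipliers 1.960 and 2.576 are taken from the table of frequently used \(z^*\) values rather than computed.

```python
from math import sqrt

def ci_from_multiplier(successes, n, z_star):
    """Return (lower, upper) for p_hat +/- z_star * SE, with no intermediate rounding."""
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)   # standard error using p_hat in place of p
    return p_hat - z_star * se, p_hat + z_star * se

# Seatbelt data: 747 of 1168 respondents, at two confidence levels.
for label, z_star in (("95%", 1.960), ("99%", 2.576)):
    lower, upper = ci_from_multiplier(747, 1168, z_star)
    print(f"{label} CI: ({lower:.3f}, {upper:.3f}), width {upper - lower:.3f}")
```

Because the code keeps \(\widehat{p}\) unrounded (about 0.6396 rather than 0.640), an endpoint may differ from the hand calculation by 0.001. The 99% interval comes out wider than the 95% interval, as expected.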


8.1.1.2 - Minitab Express: Confidence Interval for a Proportion

Before we can construct a confidence interval for a proportion we must first determine if we should use the exact method or the normal approximation method. Recall that if \(np \geq 10\) and \(n(1-p) \geq 10\) then the sampling distribution can be approximated by a normal distribution. If this assumption has not been met then the sampling distribution is constructed using a binomial distribution which Minitab Express refers to as the "exact method." 

To check this assumption we can construct a frequency table. You first learned how to construct a frequency table in Lesson 2.1.1.2.1 of these online notes. Here is another example:

Minitab Express – Frequency Tables

To create a frequency table of dog ownership in Minitab Express:

  1. Open the data set:
  2. On a PC: In the menu bar select STATISTICS > Describe > Tally
  3. On a Mac: In the menu bar select Statistics > Summary Statistics > Tally
  4. Double click the variable Dog in the box on the left to insert the variable into the Variable box
  5. Under Statistics, check Counts
  6. Click OK

This should result in the following frequency table:

Tally
Dog Count
No 252
Yes 272
N= 524
*= 1

From the frequency table above we can see that there were at least 10 "successes" and at least 10 "failures." In this example a success is defined as answering "yes" to the question "do you own a dog?" A failure is defined as answering "no." Because both \(np \geq 10\) and \(n(1-p) \geq 10\), the normal approximation method may be used. In Minitab Express, the exact method is the default method. If there are at least 10 successes and at least 10 failures, then you need to change the method to the normal approximation method.
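
As a rough cross-check outside Minitab Express, the tally and the check for at least 10 successes and 10 failures can be sketched in Python. The column below is rebuilt from the counts in the frequency table (272 Yes, 252 No, 1 missing) and is not read from the actual data set; the function name `choose_method` is made up for this illustration.

```python
from collections import Counter

# Stand-in for the Dog column, rebuilt from the tally above (assumption:
# the real worksheet is not available here). None marks the missing response.
dog = ["Yes"] * 272 + ["No"] * 252 + [None]

# Count only the non-missing responses, as the Tally output does.
counts = Counter(answer for answer in dog if answer is not None)

def choose_method(successes, failures, minimum=10):
    """Return the interval method suggested by the rule of thumb in these notes."""
    if successes >= minimum and failures >= minimum:
        return "normal approximation"
    return "exact"

print(counts["Yes"], counts["No"], sum(counts.values()))
print(choose_method(counts["Yes"], counts["No"]))
```

With 272 successes and 252 failures the function returns "normal approximation", matching the conclusion drawn from the frequency table.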

Minitab Express – Confidence Interval for a Proportion (Normal Approximation Method)

To create a 95% confidence interval of dog ownership using the normal approximation method in Minitab Express:

  1. Open the data set:
  2. On a PC: In the menu bar select STATISTICS > One Sample > Proportion
  3. On a Mac: In the menu bar select Statistics > 1-Sample Inference > Proportion
  4. In this case we have our data in the Minitab Express worksheet so we will use the default Sample data in a column 
  5. Double click the variable Dog in the box on the left to insert the variable into the Sample box
  6. Click on the Options tab
  7. The default Confidence level is 95
  8. Change the Method to Normal approximation because the assumption of \(np \geq 10\) and \(n(1-p) \geq 10\) has been met
  9. Click OK

This should result in the following output:

Method
Event: Dog = Yes
p: proportion where Dog = Yes
Normal approximation is used for this analysis.
Descriptive Statistics
N Event Sample p 95% CI for p
524 272 0.519084 (0.476304, 0.561863)
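
The interval in this output can be reproduced from the two counts alone. The sketch below applies the normal approximation formula from earlier in this lesson, with an unrounded multiplier from Python's `statistics.NormalDist`; the function name `normal_approx_ci` is an assumption of this example and says nothing about how Minitab Express is implemented internally.

```python
from math import sqrt
from statistics import NormalDist

def normal_approx_ci(events, n, confidence=0.95):
    """Return (p_hat, lower, upper) using p_hat +/- z* * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = events / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # unrounded multiplier
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin

# N = 524 and Event = 272 are taken from the output above.
p_hat, lower, upper = normal_approx_ci(272, 524)
print(f"Sample p = {p_hat:.6f}, 95% CI = ({lower:.6f}, {upper:.6f})")
```

The printed values should agree with the Sample p and 95% CI columns up to rounding in the last digit.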

95% CI for the Population

What if the assumption of \(np \geq 10\) and \(n(1-p) \geq 10\) is not met?

If this assumption is not met then the exact method should be used instead of the normal approximation method. In Minitab Express, this means that in step 8 above the default setting of Exact method should not be changed.
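
To give a sense of what a binomial-based interval involves, here is a minimal sketch of the Clopper-Pearson interval, a commonly used "exact" interval for a proportion. Treating it as equivalent to the Minitab Express exact method is an assumption of this example, not something stated in these notes. The function names and the data (3 successes in 20 trials) are made up for illustration.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def solve_decreasing(f, target):
    """Bisection on [0, 1] for f(p) = target, where f is decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid      # f is still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, confidence=0.95):
    """Return (lower, upper) so that each tail probability equals alpha / 2."""
    alpha = 1 - confidence
    # Lower bound: P(X >= x | p) = alpha/2, i.e. P(X <= x - 1 | p) = 1 - alpha/2.
    lower = 0.0 if x == 0 else solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= x | p) = alpha/2.
    upper = 1.0 if x == n else solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper

# Hypothetical data: only 3 successes, so the normal approximation is not appropriate.
lower, upper = clopper_pearson(3, 20)
print(f"95% exact CI: ({lower:.4f}, {upper:.4f})")
```

Unlike the normal approximation interval, this interval is generally not symmetric around \(\widehat{p}\).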

What if we have summarized data and not data in a Minitab Express worksheet?

If you do not have a Minitab Express worksheet filled with data concerning individuals, but instead have summarized data (e.g., the values of \(\widehat{p}\) and \(n\)), you would skip step 1 above and in step 4 you would select Summarized data instead of Sample data in a column. For Number of events enter the number of successes (i.e., \(n\widehat{p}\)) and for Number of trials enter the total sample size (i.e., \(n\)).


8.1.1.2.1 - Video Example: Dieting (Summarized Data, Normal Approximation)

8.1.1.3 - Computing Necessary Sample Size

When we begin a study to estimate a population parameter, we typically have an idea of how confident we want to be in our results and within what degree of accuracy. This means we start with a set level of confidence and margin of error. We can use these pieces to determine the minimum sample size needed to produce these results by using algebra to solve for \(n\):

Finding Sample Size for Estimating a Population Proportion
\(n=\left ( \frac{z^*}{M} \right )^2 \tilde{p}(1-\tilde{p})\)

\(M\) is the margin of error
\(\tilde p\) is an estimated value of the proportion
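
The algebra behind this formula can be sketched as follows, assuming the margin of error is the multiplier times the standard error from the normal approximation method, with \(\tilde{p}\) standing in for the unknown proportion:

\(M = z^{*} \sqrt{\frac{\tilde{p}(1-\tilde{p})}{n}}\)

\(M^{2} = (z^{*})^{2} \, \frac{\tilde{p}(1-\tilde{p})}{n}\)

\(n = \frac{(z^{*})^{2} \, \tilde{p}(1-\tilde{p})}{M^{2}} = \left ( \frac{z^*}{M} \right )^2 \tilde{p}(1-\tilde{p})\)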

If we have no preconceived idea of the value of the population proportion, then we use \(\tilde{p}=0.50\) because it is the most conservative choice: it gives us the largest sample size calculation.
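
One way to see why 0.50 yields the largest sample size is to rewrite the product that appears in the formula:

\(\tilde{p}(1-\tilde{p}) = \frac{1}{4} - \left ( \tilde{p} - \frac{1}{2} \right )^2 \leq \frac{1}{4}\)

The product reaches its maximum of 0.25 only when \(\tilde{p}=0.50\), so any other value of \(\tilde{p}\) gives a smaller \(n\).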

Example: No Estimate

We want to construct a 95% confidence interval for \(p\) with a margin of error equal to 4%.

Because there is no estimate of the proportion given, we use \(\tilde{p}=0.50\) for a conservative estimate.

For a 95% confidence interval, \(z^*=1.960\)

\(n=\left ( \frac{1.960}{0.04} \right )^2 (0.5)(1-0.5)=600.25\)

This is the minimum sample size, therefore we should round up to 601. In order to construct a 95% confidence interval with a margin of error of 4%, we should obtain a sample of at least \(n=601\).

Example: Estimate Known

We want to construct a 95% confidence interval for \(p\) with a margin of error equal to 4%. What if we knew that the population proportion was around 0.25?

The \(z^*\) multiplier for a 95% confidence interval is 1.960. Now, we have an estimate to include in the formula:

\(n=\left ( \frac{1.960}{0.04} \right )^2 (0.25)(1-0.25)=450.188\)

Again, we should round up to 451. In order to construct a 95% confidence interval with a margin of error of 4%, given \(\tilde{p}=.25\), we should obtain a sample of at least \(n=451\).

Note that when we changed \(\tilde{p}\) in the formula from .50 to .25, the necessary sample size decreased from \(n=601\) to \(n=451\).
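
Both examples can be reproduced with a short Python function. This is a sketch of the sample size formula above; the function name `required_sample_size` and its default arguments are choices made for this illustration.

```python
from math import ceil

def required_sample_size(margin, z_star=1.960, p_tilde=0.50):
    """Smallest whole n satisfying n >= (z*/M)^2 * p_tilde * (1 - p_tilde)."""
    n = (z_star / margin) ** 2 * p_tilde * (1 - p_tilde)
    return ceil(n)   # always round up, since this is a minimum sample size

print(required_sample_size(0.04))                 # no estimate, so p_tilde = 0.50
print(required_sample_size(0.04, p_tilde=0.25))   # estimate of about 0.25
```

The two calls correspond to the "No Estimate" and "Estimate Known" examples, giving 601 and 451 respectively. Rounding with `ceil` rather than to the nearest integer matters here: 600.25 must go up to 601, because 600 would fall just short of the requested margin of error.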


8.1.1.3.1 - Video Example: Female Customers
