# 6.2 - Sample Size Computation for Population Proportion Confidence Interval


Reading Assignment
An Introduction to Statistical Methods and Data Analysis (see Course Schedule).

### Margin of Error

Note: The margin of error E is half of the width of the confidence interval.

$E=z_{\alpha/2}\sqrt{\frac{\hat{p}\cdot (1-\hat{p})}{n}}$

Confidence and precision: a wider interval means poorer precision. The higher the confidence level, the wider the interval (equivalently, the larger its half-width) and thus the poorer the precision.

One television poll stated that the recent approval rating of the president is 72%; the margin of error of the poll is plus or minus 3%. [For most newspaper and magazine polls, the margin of error is understood to be calculated for a 95% confidence interval unless stated otherwise. A 3% margin of error is a popular choice.]

If we want a smaller margin of error (i.e., a narrower interval), we can increase the sample size. Alternatively, calculating a 90% confidence interval instead of a 95% confidence interval also gives a smaller margin of error. However, when reporting it, remember to state that the confidence level is only 90%, because otherwise people will assume 95% confidence.
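The effect of sample size and confidence level on the margin of error can be sketched numerically. The snippet below is a minimal illustration of the formula for $E$ above, not part of the original lesson; the function name `margin_of_error` and the sample sizes are hypothetical (the poll's actual sample size is not stated), and $z_{\alpha/2}$ is taken from Python's standard-library normal distribution.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat, n, conf=0.95):
    """E = z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. about 1.96 for 95%
    return z * sqrt(p_hat * (1 - p_hat) / n)


p_hat = 0.72  # approval rating from the poll example
for conf in (0.90, 0.95):
    for n in (500, 1000, 2000):  # hypothetical sample sizes
        e = margin_of_error(p_hat, n, conf)
        print(f"conf={conf:.0%}  n={n:5d}  E={e:.4f}")
```

For a fixed confidence level, the printed margin of error shrinks as n grows; for a fixed n, the 90% margin is smaller than the 95% margin.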

### Determining the Required Sample Size

If the desired margin of error E is specified and the desired confidence level is specified, the required sample size to meet the requirement can be calculated by two methods:

a. Educated Guess

$n=\frac {(z_{\alpha/2})^2 \cdot \hat{p}_g \cdot (1-\hat{p}_g)}{E^2}$

where $\hat{p}_g$ is an educated guess for the population proportion $\pi$.

b. Conservative Method

$n=\frac {(z_{\alpha/2})^2 \cdot \frac{1}{2} \cdot \frac{1}{2}}{E^2}$

This formula can be obtained from part (a) using the fact that:

For $0 \leq p \leq 1$, $p(1-p)$ achieves its largest value at $p=\frac{1}{2}$.
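One way to see this fact is to complete the square:

$p(1-p)=\frac{1}{4}-\left(p-\frac{1}{2}\right)^2\leq \frac{1}{4}$

with equality exactly when $p=\frac{1}{2}$. The product $\frac{1}{2}\cdot\frac{1}{2}$ in the conservative formula is therefore the largest value the numerator can take.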

The sample size obtained from the educated guess is usually smaller than the one obtained from the conservative method. This smaller sample size carries some risk: if the sample proportion turns out to be closer to $\frac{1}{2}$ than the guess, the resulting confidence interval may be wider than desired. The sample size from the conservative method has no such risk.

For the next poll of the president's approval rating, we want a margin of error of 1% with 95% confidence. How many individuals should we sample? (In the last poll, his approval rating was 72%.)

a. Educated Guess (use if it is relatively inexpensive to sample more elements when needed.)

$z_{0.025}=1.96$, $E=0.01$

Therefore, $n=\frac{(1.96)^2 \cdot 0.72\cdot 0.28}{(0.01)^2}=7744.66$.

The sample size needed is 7745 people. We always round up to the next integer when the result is not a whole number. Why? Because we are estimating the smallest sample size needed to achieve the desired margin of error. Since we cannot sample a portion of a subject (e.g., we cannot take 0.66 of a subject), we need to round up to guarantee a large enough sample.

b. Conservative Method (use if the start-up cost of sampling is expensive and thus it is not economical to sample more elements later).

$n=\frac{(1.96)^2 \cdot 0.5\cdot 0.5}{(0.01)^2}=9604$

The sample size is 9604 people.
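Both calculations can be reproduced with a short script. This is a minimal sketch, not the course's own tooling: the function name `required_sample_size` is hypothetical, and it uses the rounded value z = 1.96 as in the text. Exact rational arithmetic (`Fraction`) is an implementation choice made here so that floating-point round-off cannot push a whole-number result such as 9604 up to 9605 when rounding up.

```python
from fractions import Fraction
from math import ceil


def required_sample_size(margin, z=1.96, p_guess=0.5):
    """Smallest n with z * sqrt(p(1-p)/n) <= margin.

    p_guess=0.5 gives the conservative method; any other value
    is the educated-guess method.
    """
    # Fraction(str(x)) reads the decimal literally, e.g. 0.01 -> 1/100.
    z, p, e = (Fraction(str(v)) for v in (z, p_guess, margin))
    n = z**2 * p * (1 - p) / e**2
    return ceil(n)  # always round up to a whole subject


print(required_sample_size(0.01, p_guess=0.72))  # educated guess: 7745
print(required_sample_size(0.01))                # conservative: 9604
```

The conservative result is never smaller than the educated-guess result for the same margin and confidence level.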

### Cautions About Sample Size Calculations

1. Why do we need to round up? Because we are estimating the smallest sample size needed to achieve the desired margin of error. Since we cannot sample a portion of a subject (e.g., we cannot take 0.66 of a subject), we need to round up to guarantee a large enough sample.

2. Remember that this is the minimum sample size needed for the study. If the response rate is less than 100% and we contact only the calculated number of people, we will end up with fewer responses than required. To counter this, we can divide the calculated sample size by the anticipated response rate. For instance, in the example above, if we expect about 40% of those contacted to participate in the survey (i.e., a 40% response rate), then we would need to contact 7745/0.4 = 19,362.5, or 19,363, people. In other words, the actual number of people sampled would need to be 19,363 given the 40% response rate.
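The response-rate adjustment in caution 2 can be sketched as follows. The function name `contacts_needed` is hypothetical, and the sketch assumes the anticipated response rate is known in advance and given as a fraction between 0 and 1.

```python
from math import ceil


def contacts_needed(required_n, response_rate):
    """People to contact so that about required_n of them respond."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    # Round up again: we cannot contact a fraction of a person.
    return ceil(required_n / response_rate)


print(contacts_needed(7745, 0.40))  # 19363, as in the example above
```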

### Minitab Commands to Find the Confidence Interval for a Population Proportion

1. Stat > Basic Statistics > 1 proportion.
2. Select the Summarized data option button.
3. Enter the Number of trials and Number of successes (events).
4. Click the Options button and type the confidence level.
5. To use the normal approximation, check the corresponding box. The exact interval is always appropriate. When $n \hat{\pi}\geq 5$ and $n (1-\hat{\pi})\geq 5$, the z-interval can also be used to approximate the answer. The exact interval and the z-interval should be very similar when these conditions are satisfied.


At the Centre Community Hospital in State College, Pennsylvania, it is observed that 185 out of 360 babies born last year were girls. If we assume that this situation is representative of birth gender in the United States, give a 95% confidence interval for the true proportion of baby girls in the United States.


a. Hand Computation.
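One way to carry out the hand computation, assuming the z-interval $\hat{p}\pm E$ with $z_{0.025}=1.96$ and the margin of error formula from earlier in this section (the normal approximation conditions hold here, since 185 and 175 are both at least 5):

$\hat{p}=\frac{185}{360}\approx 0.5139$

$E=1.96\sqrt{\frac{0.5139\cdot 0.4861}{360}}\approx 0.0516$

$\hat{p}\pm E\approx (0.4623,\ 0.5655)$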

b. Use Minitab to obtain the exact interval:

The exact interval is (0.4609, 0.5666).

We can see that the two intervals found in (a) and (b) are quite close to each other.

If 3 out of 5 randomly sampled premature babies survived, obtain a 95% confidence interval for the survival rate of premature babies at the Centre Community Hospital.
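In this last exercise $n\hat{\pi}=3$ and $n(1-\hat{\pi})=2$, so the conditions for the z-interval are not met and the exact interval is the appropriate choice. The sketch below is one way to compute an exact interval without Minitab. It assumes the "exact" interval is the Clopper-Pearson interval, defined through binomial tail probabilities, and it finds the limits by bisection; the function names are hypothetical, and the direct summation is only intended for small to moderate n.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target, lo=0.0, hi=1.0, iters=100):
    """Bisection for f(p) = target, where f is decreasing in p."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


def exact_interval(x, n, conf=0.95):
    """Clopper-Pearson interval for x successes in n trials."""
    alpha = 1 - conf
    # Lower limit: P(X >= x | p) = alpha/2, i.e. P(X <= x-1 | p) = 1 - alpha/2.
    lower = 0.0 if x == 0 else solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper limit: P(X <= x | p) = alpha/2.
    upper = 1.0 if x == n else solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


for x, n in ((185, 360), (3, 5)):
    lo, hi = exact_interval(x, n)
    print(f"{x}/{n}: ({lo:.4f}, {hi:.4f})")
```

For 185 out of 360 this should agree closely with the exact interval reported above; for 3 out of 5 the interval is very wide, which reflects how little information five observations carry.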