6.4 - Practical Significance

In the last lesson you learned how to identify statistically significant differences. If the p-value is less than or equal to the \(\alpha\) level (typically 0.05), then we say that the results are statistically significant. In other words, results are said to be statistically significant when the observed difference is large enough that it would be unlikely to occur by chance alone if the null hypothesis were true.

Practical significance refers to the magnitude of the difference. This is also known as the effect size. Results are practically significant when the difference is large enough to be meaningful in real life. 

Example: SAT-Math Scores

Research question:  Are SAT-Math scores at one college greater than the known population mean of 500?

  • \(H_0: \mu = 500\)
  • \(H_a: \mu >500\)

Data are collected from a random sample of 1,200 students at that college. In that sample, \(\overline{x}=506\). The population standard deviation is known to be 100. A one-sample mean test was performed and the resulting p-value was 0.0188. Because \(p \leq \alpha\), the null hypothesis should be rejected and these results are statistically significant. There is evidence that the population mean is greater than 500. However, the difference is not practically significant because the difference between an SAT-Math score of 500 and an SAT-Math score of 506 is very small. With a standard deviation of 100, this difference is only \(\frac{506-500}{100}=0.06\) standard deviations.
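The numbers in this example can be checked with a short calculation. The sketch below assumes the one-sample mean test is a z test (the population standard deviation is known) and uses only Python's standard library; the variable and function names are illustrative and do not come from the lesson.

```python
import math

def upper_tail_p(z):
    """P(Z > z) for a standard normal Z, using the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

mu_0 = 500     # hypothesized population mean
x_bar = 506    # sample mean
sigma = 100    # known population standard deviation
n = 1200       # sample size

standard_error = sigma / math.sqrt(n)
z = (x_bar - mu_0) / standard_error   # test statistic
p_value = upper_tail_p(z)             # right-tailed test, H_a: mu > 500
effect_size = (x_bar - mu_0) / sigma  # difference in standard deviation units

print(f"z = {z:.3f}, p-value = {p_value:.4f}, effect size = {effect_size:.2f}")
```

The p-value depends on the standard error, which shrinks as \(n\) grows, while the effect size divides by the standard deviation itself and does not involve \(n\).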

Example: Weight-Loss Program

Researchers are studying a new weight-loss program. Using a large sample they construct a 95% confidence interval for the mean amount of weight loss after six months on the program to be [0.12, 0.20]. All measurements were taken in pounds. Note that this confidence interval does not contain 0, so we know that their results were statistically significant at a 0.05 alpha level. However, most people would say that the results are not practically significant because after six months on a weight-loss program we would want to lose more than 0.12 to 0.20 pounds. 
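One way to make the two judgments in this example explicit is sketched below. The function name and the 5-pound threshold for a "meaningful" loss are hypothetical choices made only for illustration; the lesson does not specify a threshold, and in practice it would come from subject-matter knowledge.

```python
def summarize_interval(lower, upper, null_value=0.0, min_meaningful=5.0):
    """Return (statistically_significant, practically_significant).

    statistically_significant: the interval does not contain null_value.
    practically_significant: the whole interval is at or above min_meaningful,
    a hypothetical threshold chosen by the analyst (assumes a positive effect).
    """
    statistically_significant = not (lower <= null_value <= upper)
    practically_significant = lower >= min_meaningful
    return statistically_significant, practically_significant

# 95% confidence interval for mean weight loss, in pounds
stat_sig, pract_sig = summarize_interval(0.12, 0.20)
print("statistically significant:", stat_sig)
print("practically significant:", pract_sig)
```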

Example: Change in Self-Efficacy

Research question: Do children who are given positive reinforcement in the form of verbal praise experience an increase in self-efficacy?

  • \(H_0: \mu_d = 0\)
  • \(H_a: \mu _d >0\)

Where \(\mu_d\) is the change in self-efficacy measured as the self-efficacy after the intervention minus the initial self-efficacy.

Data were collected from a sample of 30 children at one school. In that sample the mean increase in self-efficacy ratings was 10 points with a standard deviation of 3 points. A one-sample mean test was conducted on the differences and resulted in a p-value < 0.0001. The null hypothesis was rejected so the results were said to be statistically significant. To examine practical significance we would need to evaluate the magnitude of that increase. We know that the mean increase was 10 points, but without more information about the survey that was administered we don't know what that really means. We do, however, know that the standard deviation of the increase was 3. This means that the increase was \(\frac{10}{3}=3.333\) standard deviations. That is an increase of more than 3 standard deviations! Based on the mean increase and standard deviation of that increase, this appears to be a large increase in self-efficacy.
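The test statistic behind this result can be reproduced from the summary values. The sketch below assumes the one-sample mean test on the differences is a t test with \(n-1\) degrees of freedom; it computes only the statistic and the standardized increase, since an exact p-value would need a t distribution routine that the standard library does not provide. The variable names are illustrative.

```python
import math

mean_diff = 10.0  # mean increase in self-efficacy rating
sd_diff = 3.0     # standard deviation of the increases
n = 30            # number of children

standard_error = sd_diff / math.sqrt(n)
t_stat = mean_diff / standard_error          # tests H_0: mu_d = 0
df = n - 1                                   # degrees of freedom
standardized_increase = mean_diff / sd_diff  # increase in standard deviation units

print(f"t = {t_stat:.3f} with {df} degrees of freedom")
print(f"standardized increase = {standardized_increase:.3f}")
```

A t statistic this far from 0 is consistent with the reported p-value below 0.0001.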

Note that statistical significance is directly impacted by sample size. Recall that there is an inverse relationship between sample size and the standard error (i.e., standard deviation of the sampling distribution). With a very large sample, even very small differences can be statistically significant. Thus, when results are statistically significant it is important to also examine practical significance. Practical significance is not directly influenced by sample size.
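To illustrate, the sketch below reuses the SAT-Math values (sample mean 506, hypothesized mean 500, standard deviation 100) and changes only the sample size. The sample sizes 100 and 10,000 are hypothetical additions for comparison, and the calculation assumes a right-tailed z test.

```python
import math

def one_sided_p(x_bar, mu_0, sigma, n):
    """Right-tailed p-value for a one-sample z test."""
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    return 0.5 * math.erfc(z / math.sqrt(2))

x_bar, mu_0, sigma = 506, 500, 100
effect_size = (x_bar - mu_0) / sigma  # does not depend on n

results = []
for n in (100, 1200, 10000):  # 100 and 10000 are hypothetical sample sizes
    p = one_sided_p(x_bar, mu_0, sigma, n)
    results.append((n, effect_size, p))
    print(f"n = {n:>5}: effect size = {effect_size:.2f}, p-value = {p:.2e}")
```

The same 6-point difference is not statistically significant at the 0.05 level with 100 students, is significant with 1,200, and is overwhelmingly significant with 10,000, while the effect size stays at 0.06 throughout.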

Effect Size

For some tests there are commonly used measures of effect size. For example, when comparing two means we often compute Cohen's \(d\), which is the difference between the two group means in standard deviation units:

\[d=\frac{\overline x_1 - \overline x_2}{s_p}\]

Where \(s_p\) is the pooled standard deviation:

\[s_p= \sqrt{\frac{(n_1-1)s_1^2 + (n_2 -1)s_2^2}{n_1+n_2-2}}\]
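As a rough sketch, Cohen's \(d\) can be computed from summary statistics as follows. The pooled variance here divides by \(n_1+n_2-2\), the combined degrees of freedom of the two samples. The function names and the example values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation of two independent samples."""
    pooled_variance = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_variance)

def cohens_d(mean1, s1, n1, mean2, s2, n2):
    """Difference between two sample means in pooled standard deviation units."""
    return (mean1 - mean2) / pooled_sd(s1, n1, s2, n2)

# Hypothetical summary statistics for two groups of 50
d = cohens_d(mean1=105.0, s1=10.0, n1=50, mean2=100.0, s2=10.0, n2=50)
print(f"Cohen's d = {d:.2f}")
```

With equal standard deviations of 10 in both groups, the pooled standard deviation is also 10, so a 5-point difference in means gives \(d = 0.5\).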

Below are commonly used standards when interpreting Cohen's \(d\):

Cohen's \(d\)    Interpretation
0 - 0.2          Little or no effect
0.2 - 0.5        Small effect size
0.5 - 0.8        Medium effect size
0.8 or more      Large effect size
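These standards can be written as a small lookup. The ranges above share their endpoints, so the sketch below assumes a value that falls exactly on a boundary (0.2, 0.5, or 0.8) belongs to the higher category, and it uses the absolute value of \(d\) so that the direction of the difference does not matter. Both choices, and the function name, are assumptions made for illustration.

```python
def interpret_cohens_d(d):
    """Map Cohen's d to the conventional verbal label.

    Assumption: boundary values (0.2, 0.5, 0.8) go to the higher category.
    """
    size = abs(d)
    if size < 0.2:
        return "Little or no effect"
    if size < 0.5:
        return "Small effect size"
    if size < 0.8:
        return "Medium effect size"
    return "Large effect size"

for value in (0.1, 0.3, 0.6, 1.2):  # hypothetical values of d
    print(f"d = {value}: {interpret_cohens_d(value)}")
```

These labels are rules of thumb; whether an effect matters in real life still depends on the context of the study.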

For correlation and regression we can compute \(r^2\) which is known as the coefficient of determination. This is the proportion of shared variation. We will learn more about \(r^2\) when we study simple linear regression and correlation at the end of this course.