Lesson 4: Multiple Testing
Key Learning Goals for this Lesson: 

An event that is rare when we have only one opportunity to observe it can become quite common when we observe thousands of events. For example, when you roll 2 fair dice, the chance of getting double sixes is only 1 in 36. But if you roll the pair 3600 times, you expect to get about 100 rolls with double sixes (3600 × 1/36 = 100).
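The dice arithmetic can be checked with a short simulation. This is only an illustrative sketch: the function names and the fixed random seed are assumptions made for this example and are not part of the lesson.

```python
import random

def expected_double_sixes(n_rolls):
    """Expected number of double sixes in n_rolls rolls of two fair dice."""
    # Each roll gives double sixes with probability (1/6) * (1/6) = 1/36.
    return n_rolls / 36

def simulate_double_sixes(n_rolls, seed=0):
    """Roll two fair dice n_rolls times and count the double sixes."""
    rng = random.Random(seed)  # arbitrary fixed seed, for reproducibility only
    count = 0
    for _ in range(n_rolls):
        die1 = rng.randint(1, 6)  # randint includes both endpoints
        die2 = rng.randint(1, 6)
        if die1 == 6 and die2 == 6:
            count += 1
    return count

expected = expected_double_sixes(3600)
observed = simulate_double_sixes(3600)
print("expected double sixes:", expected)
print("simulated double sixes:", observed)
```

The simulated count varies with the seed, but it should land near the expected value of 100.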
The p-value is the probability of obtaining a result at least as extreme as the observed result if the null hypothesis is true. Suppose we accept p < 0.05 as "extreme". If we do 10,000 independent tests, and all the null hypotheses are true, we expect about 5% of the tests (i.e. about 500) to have p < 0.05, and every one of these is a false positive.
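The count of about 500 can be reproduced with a small sketch. It relies on an assumption the text does not state: when the null hypothesis is true and the test statistic is continuous, the p-value is uniformly distributed on [0, 1], so null p-values can be simulated as uniform random numbers. The function names and the seed are hypothetical, chosen for illustration.

```python
import random

def expected_false_positives(n_tests, alpha):
    """Expected number of tests with p < alpha when every null is true."""
    return n_tests * alpha

def prob_at_least_one(n_tests, alpha):
    """Chance that at least one of n_tests independent null tests has p < alpha."""
    return 1 - (1 - alpha) ** n_tests

def simulate_null_rejections(n_tests, alpha, seed=0):
    """Draw n_tests p-values under the null and count those below alpha.

    Assumption: null p-values are uniform on [0, 1].
    """
    rng = random.Random(seed)  # arbitrary fixed seed, for reproducibility only
    pvalues = [rng.random() for _ in range(n_tests)]
    return sum(p < alpha for p in pvalues)

print("expected false positives:", expected_false_positives(10_000, 0.05))
print("simulated false positives:", simulate_null_rejections(10_000, 0.05))
print("P(at least one) for 10 tests:", prob_at_least_one(10, 0.05))
```

Even with only 10 independent tests, the chance of at least one p < 0.05 is already far above 5%, and with 10,000 tests it is essentially certain.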
This is a huge problem in high-throughput analysis, because we are usually doing thousands of tests. We do not want to waste our time following up false positives. But if we use conventional p-value cutoffs, a large number of false positives is inevitable.
This lesson discusses some approaches to correcting our inference methods when we are doing multiple tests.