He wrote an explanation of multiple comparisons in the comments section of the last post, and I thought I'd repost it here because I know y'all want more statistics on the blog. Actually, I should have just made him write the explanation for me rather than trying to do it myself. It's the least he could do for a free book*, right?
Anyways, here's multiple comparisons, take two. If you want to see the joke that started it all, click back to the last post.
* Apparently Todd already owns a version of The Graveyard Book
------------------------
Todd's explanation:
I feel compelled to write my own short explanation of multiple comparisons for a lay audience, because I think I'm going to need it again some day...
Imagine you have a quarter and you want to know if it always comes up heads. You flip it 5 times, and it comes up heads every time.
Because you're an expert in stats, you know that this will happen only 1 out of every 32 times with a normal quarter. In other words, the probability of getting that result with a normal quarter is around 3%. So, as Livia pointed out very well, we're going to say that we think this is a trick quarter, while acknowledging that a normal quarter would still produce this strange set of coin flips about 3% of the time.
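If you'd like to check the 1-in-32 figure yourself, here's a minimal Python sketch. The function name `prob_all_heads` is made up for illustration, and it assumes a fair coin with independent flips.

```python
from fractions import Fraction


def prob_all_heads(flips: int, p_heads: Fraction = Fraction(1, 2)) -> Fraction:
    """Probability that `flips` independent flips all come up heads."""
    return p_heads ** flips


p = prob_all_heads(5)
print(p, float(p))  # 1/32 0.03125
```

That 0.03125 is the "around 3%" above.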
In scientific terms, the "null hypothesis" is that the quarter is normal. We tentatively "reject the null hypothesis", because a normal quarter would give us five heads in a row only about 3% of the time. This is a key point about science: EVERYTHING is tentative. We're never, ever sure about anything. We can never directly prove our hypotheses are correct, we can *only* disprove other hypotheses. And we always do this while acknowledging there's a certain chance that we're wrong. Hopefully, that chance is vanishingly small, but not always...
Now, on with the story. Say you go to the bank teller and tell them to open up the vault, because you heard a rumor they might have some counterfeit coins in there. You insist that they flip each of their 20,000 quarters five times, and if any of them comes up heads all five times, you're going to call the cops.
See the problem with this? While there's only a 1/32 chance that any one normal quarter will come up all-heads, when you do this 20,000 times, you expect about 625 quarters (20,000 divided by 32) to have all-heads results, through pure chance alone.
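The arithmetic behind that is short enough to sketch in Python. The function names here are hypothetical, and the sketch assumes every coin in the vault is perfectly normal and every flip is independent.

```python
def expected_false_alarms(n_coins: int, flips: int) -> float:
    """Average number of normal coins that come up all heads by luck."""
    return n_coins * 0.5 ** flips


def prob_at_least_one_false_alarm(n_coins: int, flips: int) -> float:
    """Chance that at least one normal coin comes up all heads."""
    return 1 - (1 - 0.5 ** flips) ** n_coins


print(expected_false_alarms(20_000, 5))  # 625.0
print(prob_at_least_one_false_alarm(20_000, 5))
```

The second number is so close to 1 that you're essentially guaranteed to be calling the cops on an honest bank.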
You need to be much more careful with your threshold for a counterfeit coin because you're testing so many, and so you "correct for multiple comparisons". The simplest way of doing this is just to raise the bar for when you're sure a coin is counterfeit. If you're satisfied suspecting a single fake coin after 5 throws, you'd require, say, 20 throws, all heads, to satisfy yourself that the bank really had a bad quarter.
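One common way to put a number on this is the Bonferroni correction: divide your acceptable false-alarm rate by the number of coins you test. Todd doesn't name a method, so treat Bonferroni as an assumption here, and `flips_needed` as a hypothetical helper. The exact count depends on how strict you want to be, so read the result as a ballpark.

```python
def flips_needed(n_coins: int, alpha: float) -> int:
    """Smallest number of all-heads flips whose chance under a normal coin
    is at or below alpha / n_coins (the Bonferroni-corrected threshold)."""
    per_coin_threshold = alpha / n_coins
    flips = 1
    while 0.5 ** flips > per_coin_threshold:
        flips += 1
    return flips


# Keep the overall false-alarm rate near the original 1-in-32.
print(flips_needed(1, 1 / 32))       # one coin: 5 flips
print(flips_needed(20_000, 1 / 32))  # the whole vault: 20 flips
```

With 20,000 coins, the per-coin threshold drops from about 3% to roughly 1.6 in a million, which is why a handful of flips is nowhere near enough.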
Livia's explanation was great, but if you didn't get it the first time around, maybe that helped?