Multiple Posts About Multiple Comparisons

So the Anonymous winner of the Neuropublishing Joke Contest has revealed himself now as Todd, brilliant MIT neuroscience graduate student and my officemate at lab.

He wrote an explanation of multiple comparisons in the comments section of the last post, and I thought I'd repost it here because I know y'all want more statistics on the blog. Actually, I should have just made him write the explanation for me rather than trying to do it myself. It's the least he could do for a free book*, right?

Anyways, here's multiple comparisons, take two.  If you want to see the joke that started it all, click back to the last post.

* Apparently Todd already owns a version of The Graveyard Book, but requested it as a prize to throw me off his trail.  However, I didn't know he owned it, so that strategy failed.  After everything was sorted out, I ended up giving him Pillars of the Earth instead.


Todd's explanation:

I feel compelled to write my own short explanation of multiple comparisons for a lay audience, because I think I'm going to need it again some day...

Imagine you have a quarter and you want to know if it always comes up heads. You flip it 5 times, and it comes up heads every time.

Because you're an expert in stats, you know that this will only happen 1 out of every 32 times with a normal quarter. In other words, the probability of getting that result with a normal quarter is around 3%. So, as Livia pointed out very well, we're going to say that we think this is a trick quarter, but we acknowledge there's a 3% chance that a normal quarter would give us a set of coin flips this strange.
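
For anyone who wants to check that arithmetic, here is a minimal Python sketch. It assumes a fair coin and independent flips, and the function name `prob_all_heads` is made up for illustration.

```python
from fractions import Fraction


def prob_all_heads(n_flips: int) -> Fraction:
    """Chance that a fair coin lands heads on every one of n_flips flips."""
    # Each flip is heads with probability 1/2, and the flips are independent,
    # so the probabilities multiply.
    return Fraction(1, 2) ** n_flips


p = prob_all_heads(5)
print(p, float(p))  # 1/32 0.03125
```

So "around 3%" is 3.125%, rounded.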

In scientific terms, the "null hypothesis" is that the quarter is normal. We tentatively "reject the null hypothesis", because a normal quarter would give us this result only about 3% of the time. This is a key point about science -- EVERYTHING is tentative. We're never, ever sure about anything. We can never directly prove our hypotheses are correct; we can *only* disprove other hypotheses. And we always do this while acknowledging there's a certain chance that we're wrong. Hopefully, that chance is vanishingly small, but not always...

Now, on with the story. Say you go to the bank teller and tell them to open up the vault, because you heard a rumor they might have some counterfeit coins in there. You insist that they flip each of their 20,000 coins five times each, and if any of them come up with heads all five times, you're going to call the cops.

See the problem with this? While there's only a 1/32 chance that any one quarter will come up all-heads, when you do this 20,000 times, you expect about 625 quarters (20,000 divided by 32) to have all-heads results, through pure chance alone.
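
The size of the problem can be sketched in a few lines of Python. This assumes every coin in the vault is fair and all flips are independent; the function names are hypothetical, chosen just for this example.

```python
def expected_false_alarms(n_coins: int, n_flips: int) -> float:
    """Expected number of fair coins that come up all heads by chance."""
    return n_coins * 0.5 ** n_flips


def prob_at_least_one_false_alarm(n_coins: int, n_flips: int) -> float:
    """Chance that at least one fair coin comes up all heads."""
    p_single = 0.5 ** n_flips
    # Complement of "no coin at all comes up all heads".
    return 1 - (1 - p_single) ** n_coins


print(expected_false_alarms(20_000, 5))  # 625.0
print(prob_at_least_one_false_alarm(20_000, 5))
```

The second number is so close to 1 that calling the cops is essentially guaranteed, even if every coin in the vault is perfectly normal.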

You need to be much more careful with your threshold for a counterfeit coin because you're testing so many, and so you "correct for multiple comparisons". The simplest way of doing this is just to change your mind about when you're sure a coin is counterfeit. If you're satisfied suspecting a single fake coin after 5 throws, you'd require, say, 18 throws to satisfy yourself that the bank really had a bad quarter.
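
One common way to make that adjustment concrete is the Bonferroni correction, which the explanation above doesn't name, so treat this as one possible reading rather than the only one. The idea is to divide the error rate you'd accept for a single coin by the number of coins tested. The sketch below assumes fair, independent flips, and `flips_needed` is a made-up helper name.

```python
def flips_needed(n_coins: int, alpha: float) -> int:
    """Smallest number of all-heads flips whose chance for one fair coin
    is at most alpha / n_coins (the Bonferroni-corrected threshold)."""
    per_coin_threshold = alpha / n_coins
    n_flips = 0
    # Keep adding flips until an all-heads run is rare enough.
    while 0.5 ** n_flips > per_coin_threshold:
        n_flips += 1
    return n_flips


print(flips_needed(1, 1 / 32))       # 5
print(flips_needed(20_000, 1 / 32))  # 20
```

Under these assumptions the answer comes out at 20 throws, in the same ballpark as the "say, 18" above.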

Livia's explanation was great, but if you didn't get it the first time around, maybe that helped?


  1. Go ahead and send me 20,000 coins and I'll check to see if there are any bad ones. I'll make sure it's a thorough check.

  2. Got it :) Thanks for the explanations!

  3. So glad to hear a scientist admit that everything is tentative - what's the deal with those global warming scientists?
    Anyway, Livia, I think I saw your name on Elana Johnson's post as one of the winners of her contest. Congratulations - very awesome.

  4. Oh, I think scientists are the most skeptical people in the world, both by nature and by profession. If you ask a good scientist whether he's *sure* the sun will rise tomorrow morning, he'll tell you that it's impossible to be *sure* about anything.

    If you flip a coin 100 times and it's heads every time, it's still possible that it's a fair coin, and that has to be acknowledged.

    At some point, though, when you want to actually apply science to real-life, you have to set a threshold for "Even though I'll never be sure, this is close enough." So even though we're not *sure* of the laws of physics, we can put a man on the moon. And even though we're not *sure* of how bacteria reproduce and spread, we can make some pretty good antibiotics.

    Speaking scientifically, I'm not *sure* that evolution through natural selection explains the origins of our species, or that there is an anthropogenic effect on the climate, or that the sun will rise tomorrow. But if I were a betting man, I'd bet all the money I have on any of those.


  5. Todd -- I think some people in the other sciences would say those of us in the statistical sciences are even less sure than they. There's this running joke -- if you need statistics to interpret your data, you're not doing science. Hehe. And then there's that PhD comic where the student says "I did a K-S test on my data" and the professor goes "Oh, that bad huh?"

    Ah, statistics :-)

  6. A colleague, who is also a statistics mentor and inveterate mischief maker, once pointed out that the curse of multiple comparison error isn't confined to the bounds of a single analysis; rather, it afflicts one's whole career. Every time you do another hypothesis test there's a higher probability that some test result you've reported is wrong. His suggested work-around for this was to grab passers-by and have them do the analysis instead.

    I think it was at about this time that I decided to renounce hypothesis testing, throw my lot in with the Dark Side, and come out as a Bayesian.

  7. Michael: Your colleague is a very wise man :-P

  8. Sigh, memories from my Stats class... and most of my higher level classes. Anyway, I've been away from the internet so I just read your Love at First Sight post and loved it!

  9. What I like about the study of science is that it forces us to acknowledge change and uncertainty. Often, what was once believed to be truth is disproved...then what?

  10. Perhaps Todd should run an analysis about how Lottery numbers should be picked. Isn't this the theory that works in reverse to accurately predict numbers? Urban math?