*drumroll*
The Grand Prize goes to Anonymous, for the following joke:
A brain scientist, an agent, and an editor walk into a bar...
The brain scientist stubs his toe on the bar and yells, "Ouch! I really felt that in my free nerve endings! My somatosensory cortex is going nuts!" The agent says, "Your screaming has far too much jargon. I can't sell it." And the ScienceDaily editor rewrites it to, "Leading scientists prove toes cause pain; suggest removal."
This one got the most votes from random people I pulled to my blog to help me judge. In an unexpected twist, though, this joke is actually not eligible for a prize because Anonymous chose his/her other two jokes as the official entry. But this just received so many compliments that I wanted to award the prize anyways, if just for bragging rights.
The book prize will go to the runners up (yes, there was a tie).
From Liana Brooks:
Q: How many brain scientists does it take to write a bestseller?
A: None. They taught the lab rat to do it.
Again from Anonymous:
Q: How many brain scientists does it take to write a best-seller?
A: Thousands! Of course, after you correct for multiple comparisons, only a handful are doing any real work.
Congratulations! Both runners up requested The Graveyard Book.
Okay, and here's the reason I've been procrastinating on the results. I guess, *sigh*, I'm going to have to explain Anonymous's second joke. I know I'm going to explain it slightly wrong, and some statistician will come out and tell me I'm dumb, and it'll be embarrassing for all involved (where by "all involved" I mean me). But I'll give it a try...
*rolls up sleeves*
In an ideal world, we wouldn't have to do statistics on experimental data. If we were doing an experiment on whether morning or evening testing would result in better scores, one ideal data set would be if all morning tests were better than all evening tests:
Morning: 99, 97, 92, 95, 98
Evening: 85, 90, 82, 70, 88
However, that's never true in the real world. In reality, our data is noisy because of factors like individual variation, testing conditions, phase of the moon, etc. Therefore, rather than a clean difference between the datasets, we usually end up with two overlapping datasets:
Morning: 99, 97, 92, 82, 55
Evening: 85, 92, 70, 95, 88
So see how Morning tests are mostly better, but there's a lot of overlap? With datasets like this, then, there are two possible interpretations.
1. Morning testing is better on average than Evening testing (i.e., the experimental conditions are Actually different)
or
2. The two testing conditions are the same, and the difference you get is just a fluke of the specific samples you took (i.e., the experimental conditions are Actually the same, aka the Null hypothesis).
To get an answer, we perform a statistical test that calculates the probability of getting a difference at least as big as the one in our data set if the conditions are Actually the Same. This is called the p value. In other words, if the p value is less than 5%, a difference this big would show up less than 5% of the time if the conditions were Actually the Same.
It's standard in the sciences now that if the p value is less than 5%, we conclude that our experimental conditions are probably different.
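If you'd like to see a p value actually get computed, here is a minimal sketch using the two data sets above. It uses an exact permutation test, which is just one of several tests you could pick and not necessarily the one a given lab would use. The idea: if the conditions are Actually the Same, the Morning/Evening labels are meaningless, so we try every possible way of relabeling the ten scores and count how often the relabeled difference in means is at least as big as the real one. The function name `permutation_p_value` is made up for this example.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means.

    Returns the fraction of all possible relabelings of the pooled scores
    whose mean difference is at least as large (in absolute value) as the
    one actually observed.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(group_b)
    total = sum(pooled)

    def mean_diff(sum_a):
        # Mean of the "A" group minus mean of everything left over.
        return sum_a / n_a - (total - sum_a) / n_b

    observed = abs(mean_diff(sum(group_a)))
    extreme = 0
    n_splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        n_splits += 1
        # Small tolerance so floating-point rounding cannot drop exact ties.
        if abs(mean_diff(sum_a)) >= observed - 1e-12:
            extreme += 1
    return extreme / n_splits


# The two data sets from the post.
clean_morning = [99, 97, 92, 95, 98]
clean_evening = [85, 90, 82, 70, 88]
noisy_morning = [99, 97, 92, 82, 55]
noisy_evening = [85, 92, 70, 95, 88]

p_clean = permutation_p_value(clean_morning, clean_evening)
p_noisy = permutation_p_value(noisy_morning, noisy_evening)

print(f"clean data: p = {p_clean:.4f}")  # 2 of 252 relabelings, about 0.0079
print(f"noisy data: p = {p_noisy:.4f}")
```

The clean data set lands under the 5% cutoff. The noisy one comes out far above 5%, so with only five scores per group we can't rule out a fluke.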
With me so far?
Okay, so the whole p value and statistics thing works fine if you just do one experiment with one statistical test at a time. However, when you're analyzing brain imaging data, you're interested in a whole bunch of different areas. Usually, we divide the brain into tiny cubes a few millimeters wide (called voxels), and perform a statistical test on every single one. Now we have a problem, because even if every single one of the cubes is Actually the same for the two experimental conditions, about 5% of them are going to pass our test, just because of random chance. Say we're testing 100,000 voxels -- that's about 5000 voxels that will light up in our brain image due to random chance!
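Here is a quick simulation of that problem, as a sketch rather than anything resembling a real imaging pipeline. The one assumption doing all the work: when nothing is really going on in a voxel, its p value behaves like a uniform random number between 0 and 1. So we draw 100,000 of those and count how many sneak under the 5% cutoff.

```python
import random

ALPHA = 0.05          # the usual 5% cutoff
N_VOXELS = 100_000    # number of cubes tested, as in the post

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumption: nothing is really going on in any voxel, so each voxel's
# p value is just a uniform random number between 0 and 1.
p_values = [rng.random() for _ in range(N_VOXELS)]

false_positives = sum(p < ALPHA for p in p_values)
expected = ALPHA * N_VOXELS

print(f"voxels passing p < {ALPHA}: {false_positives}")
print(f"expected by chance alone: {expected:.0f}")
```

The count should land close to 5000 even though, by construction, not a single voxel is truly active.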
Therefore, for neuroimaging, we have to do a more stringent statistical test, and this is called Correcting for Multiple Comparisons (cuz, we're testing multiple cubes, see?). So if you're doing an experiment, you might get activations in a whole bunch of voxels, but once you correct for multiple comparisons, only a handful are actually activated.
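To make "more stringent" concrete, here is a sketch using the Bonferroni correction, which is the simplest correction out there: divide the 5% cutoff by the number of tests. Real neuroimaging software often uses other, less conservative methods, so treat this as an illustration of the idea only. The function name `bonferroni_survivors`, the five "truly active" voxels, and their p value of 1e-9 are all invented for the example.

```python
import random


def bonferroni_survivors(p_values, alpha=0.05):
    """Return the indices of tests that still pass after Bonferroni correction."""
    if not p_values:
        return []
    # Each test must beat alpha divided by the total number of tests.
    threshold = alpha / len(p_values)
    return [i for i, p in enumerate(p_values) if p < threshold]


rng = random.Random(1)  # fixed seed so the run is repeatable
n_null = 100_000        # voxels where nothing is going on
n_active = 5            # hypothetical voxels with a real effect

# Null voxels get uniform random p values (an assumption, as above);
# the made-up active voxels get a very small p value.
p_values = [rng.random() for _ in range(n_null)] + [1e-9] * n_active

uncorrected = sum(p < 0.05 for p in p_values)
corrected = bonferroni_survivors(p_values)

print(f"voxels passing before correction: {uncorrected}")
print(f"voxels passing after correction:  {len(corrected)}")
```

Thousands of voxels pass before the correction, and only a handful are left afterwards. Which is the joke.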
Get it? Funny huh?
Um, get it?
Eh, well, it's really funny to neuroscientists. Just take my word for it.
Thanks to all the good folks who entered the contest. Do go over to the contest and check out all the entries. It was fun :-)