Can someone please, please explain this to me? I've never understood why the oft-stated line "extraordinary claims require extraordinary evidence" is anything other than a clever saying. Why should things that follow your intuition require any less rigor to prove than those that don't, and vice versa? Presumably there should be no subjectivity in cold, hard science: evidence is evidence, and a given quantity of evidence should establish a fact equally well regardless of how unusual that fact is.
edit: just to note, nowhere in constructing a statistical test is it required that the creator decide how "extraordinary" the null hypothesis is.
It's simply the way Bayesian statistics work: if the prior probability of something happening is very low, then for me to flip from thinking "didn't happen" to "did" it will take some new information that is very powerful.
If you think that's illogical, I'd ask you to consider why a teacher is more likely to accept the excuse "my dog ate my homework" than "aliens kidnapped me and stole it". You seem to be arguing that since the evidence is identical (a mere statement from a kid), the teacher should properly consider both occurrences to be equally likely.
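The teacher's reasoning can be sketched with Bayes' rule. All the numbers below are made up purely for illustration; the point is that the evidence term is identical in both cases and only the prior differs:

```python
# Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Posterior probability of hypothesis H after observing evidence E."""
    p_evidence = p_evidence_given_h * prior + p_evidence_given_not_h * (1 - prior)
    return p_evidence_given_h * prior / p_evidence

# Same evidence (the kid's claim), very different priors. Illustrative numbers:
likelihood_if_true = 0.9   # a kid whose homework really was eaten says so
likelihood_if_false = 0.3  # a kid making excuses might say it anyway

dog = posterior(prior=0.05, p_evidence_given_h=likelihood_if_true,
                p_evidence_given_not_h=likelihood_if_false)
aliens = posterior(prior=1e-9, p_evidence_given_h=likelihood_if_true,
                   p_evidence_given_not_h=likelihood_if_false)

print(f"P(dog ate it | claim)      = {dog:.3f}")      # ~0.136
print(f"P(aliens stole it | claim) = {aliens:.2e}")   # ~3e-09
```

The identical statement moves both hypotheses by the same likelihood ratio, but the alien hypothesis starts so far behind that it stays vanishingly improbable.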
This is true, but I'd add a caution: just because something seems outlandish or improbable doesn't mean it actually has a low prior probability. Human intuition on what's weird and what's not is not a reliable oracle of prior probability. If you're going to give the prior an actual number, you better base it on actual facts.
In your example the adjustment is indeed fair, based on existing data: some dogs do occasionally eat homework, whereas there are no verified accounts of aliens stealing it. So that's a legitimate adjustment of priors, particularly if you actually have data on the incidence of paper-hungry dogs.
But in science and philosophy, there are many important questions for which we can't legitimately calculate priors, and "it would be too weird" is not at all relevant when determining their values.
But we do have reasonable priors on parapsychology from its wasteland of unreplicated, flawed studies, with no convincing results despite decades of effort.
I believe what the saying means is this: If a theory X is accepted as true, then that implies that some amount of reasonable evidence exists in support of X. If we are to prove ~X, then we must show not only evidence in favor of ~X but also explain why the evidence in favor of ~X is stronger than the evidence in favor of X (e.g. using better instruments results in more precise measurements). The new evidence must be more "extraordinary" in the sense that it must be strong enough to overturn the evidence in support of the "ordinary claim".
The line "extraordinary claims require extraordinary evidence" is just more poetic than the paragraph above.
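In the odds form of Bayes' rule, that paragraph becomes quantitative: posterior odds = likelihood ratio × prior odds, so the strength of evidence needed to overturn a belief scales directly with how well-supported the belief was. A small sketch (the 50% credibility target is an arbitrary choice for illustration):

```python
# How strong must new evidence be (as a likelihood ratio) to lift a
# hypothesis from a given prior up to a target posterior probability?
def required_likelihood_ratio(prior, target_posterior):
    prior_odds = prior / (1 - prior)
    target_odds = target_posterior / (1 - target_posterior)
    return target_odds / prior_odds

print(required_likelihood_ratio(0.50, 0.50))  # 1.0 -- already believed, no evidence needed
print(required_likelihood_ratio(0.01, 0.50))  # ~99
print(required_likelihood_ratio(1e-6, 0.50))  # ~1,000,000
```

An "extraordinary" claim is one sitting at a tiny prior, so overturning the accepted view demands evidence with an extraordinarily large likelihood ratio.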
Why, you ask? Because it won't be believed otherwise. Causal proofs of a statistical nature aren't mathematical proofs. They aren't, generally at least, proving fundamental relations that resist all dispute. Rather, they are simply persuasive.
To be persuaded of something you already strongly believe is far easier than of something you don't believe. And really, the key word is persuade. It's not that people can prove the sun will come up tomorrow, but they can persuade you that it will.
There are some good replies to this already, but I'd like to add another. As a professor friend of mine once explained, for some reason people seem to want to accept extraordinary explanations over ordinary ones (just look at the many commonly held unfounded beliefs out there). This might be some innate primordial mechanism of the human brain. I don't know, but if there is any truth to it, then we must carefully guard against our own bias to make sure we are not unwittingly seeing the results we want to see; hence the need for extraordinary evidence to back up extraordinary claims.
Since we're talking about extraordinary claims, let's examine a claim that is definitely not true, but very interesting. If we run a study with threshold p=0.05, there is a 1 in 20 chance that we will erroneously report the claim to be true.
Now, let's say ten different scientists are interested in this claim, and each runs their own experiment. The chance that all ten correctly report "false" is just under 60%.[0] More than 40% of the time, at least one scientist will falsely conclude the existence of the phenomenon that definitely does not exist. This is an effect of running multiple independently-considered experiments without aggregating the results.
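A quick sanity check of that arithmetic, using the ten experiments and p = 0.05 threshold from the example above:

```python
# Probability that at least one of n independent experiments, each with
# false-positive rate alpha, falsely "finds" a nonexistent effect.
def p_any_false_positive(n, alpha=0.05):
    return 1 - (1 - alpha) ** n

print(f"{p_any_false_positive(10):.3f}")  # 0.401
```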
That's the Bayesian problem that people mention. Another problem entirely comes from which results will tend to get published.
Now let's consider the effect of publication bias. Let's assume that only 20% of the scientists will attempt to publish a negative result, but they will always try to publish if the (false) phenomenon appears to exist. This effect alone results in 21% of submissions being incorrect,[1] even though any single experiment has only a 5% chance of producing an incorrect result.
Let's additionally assume that a journal will publish a false-but-interesting result 50% of the time, and a true-but-ho-hum result only 10% of the time. The final effect is that roughly 57% of published results for this extraordinary-but-false phenomenon incorrectly report the phenomenon to be true.
Tweak the numbers all you want; the combined effect of running multiple independently-considered trials and biased publishing means that we are surprisingly likely to publish false conclusions.
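For anyone who does want to tweak the numbers, here is the arithmetic above as a short script; every parameter is an assumed value taken from the example, not measured data:

```python
alpha = 0.05            # false-positive rate per experiment
submit_boring = 0.20    # chance a "no effect" result is submitted
submit_exciting = 1.0   # a positive (false) result is always submitted
accept_boring = 0.10    # journal accepts a ho-hum null result
accept_exciting = 0.50  # journal accepts an interesting (false) result

# Share of submissions that are incorrect.
p_submitted_false = alpha * submit_exciting
p_submitted_true = (1 - alpha) * submit_boring
share_false_submissions = p_submitted_false / (p_submitted_false + p_submitted_true)

# Share of published papers that are incorrect.
p_published_false = p_submitted_false * accept_exciting
p_published_true = p_submitted_true * accept_boring
share_false_published = p_published_false / (p_published_false + p_published_true)

print(f"incorrect submissions:  {share_false_submissions:.1%}")  # 20.8%
print(f"incorrect publications: {share_false_published:.1%}")    # 56.8%
```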