Hacker News

Unfortunately there are too many useful rhetorical devices in the list, devices that serve a purpose. Ironically, the list starts off with "correlation is not causation," an expression that in most cases is raised for very good reasons -- for example, in response to any popular science story that uses the word "linked" without adequate qualifiers.

If more people were science-literate, these platitudes wouldn't be necessary. But they aren't, so they are.



I agree. The author says "If there's some specific reason you think a study is wrong, describe it". Unfortunately, this is very often precisely the problem: many times not in the study itself but in the way it's reported.

"Correlation is not causation" mistakes are not always obvious. Especially when there's ideological bias. E.g: the claim that owning a gun increases the chances that you will be violently killed by X%.


> The author says "If there's some specific reason you think a study is wrong, describe it". Unfortunately, this is very often precisely the problem: many times not in the study itself but in the way it's reported.

Too true. My recent favorite was a popular account of a marijuana study. The popular article was titled "Marijuana causes psychosis" or words to that effect. The popular account went on about how teenagers went crazy after smoking killer weed. The study itself said, "We don't know whether marijuana use sometimes causes psychosis, or psychosis sometimes causes marijuana use. More study is needed."


"Correlation is not causation" started useful but has become less useful the more cliched it became. And OP is right that it's particularly useless among people who've already heard it.

The whole reason people felt the need to mention "correlation is not causation" is that correlation is evidence of causation. Some people seem to think this catchphrase means the two are unrelated which is also false.


> The whole reason people felt the need to mention "correlation is not causation" is that correlation is evidence of causation.

No, without evidence that assumption is false. Correlation can only be evidence for an unexplained link, and even that is often undermined by desperate researchers' predisposition to offer any detected correlation as though it couldn't result from chance.

Given a correlation between A and B, and absent a plausible causative mechanism, possible explanations include:

* Chance -- quick, publish!

* B caused A.

* A caused B.

* An unevaluated cause C connects A and B.
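The "chance" explanation in the list above can be illustrated with a small simulation. This is only a sketch on synthetic data: the helper names (`pearson`, `strongest_chance_correlation`) and the sizes (20 points, 200 candidate series) are assumptions chosen for illustration, not anything from the thread. It compares one series of pure noise against many other unrelated noise series and reports the strongest correlation found.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_chance_correlation(n_points=20, n_candidates=200, seed=0):
    """Largest |r| between one noise series and many unrelated noise series."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_candidates):
        # Each candidate is generated independently of the target.
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson(target, candidate)))
    return best


best_r = strongest_chance_correlation()
print(f"strongest |r| among 200 unrelated series: {best_r:.2f}")
```

With small samples and enough candidates to pick from, a sizeable correlation tends to turn up even though every series is independent noise, which is the "quick, publish!" trap.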

If this seems to go to extremes in skepticism, well, remember that skepticism of new results is -- or should be -- the scientist's job.

Example: http://www.nature.com/nature/journal/v483/n7391/full/483531a...

Title: "Drug development: Raise standards for preclinical cancer research"

Quote: "Fifty-three papers were deemed 'landmark' studies (see 'Reproducibility of research findings'). It was acknowledged from the outset that some of the data might not hold up, because papers were deliberately selected that described something completely new, such as fresh approaches to targeting cancers or alternative clinical uses for existing therapeutics. Nevertheless, scientific findings were confirmed in only 6 (11%) cases. Even knowing the limitations of preclinical research, this was a shocking result."

> Some people seem to think this catchphrase means the two are unrelated which is also false.

But without evidence, without a rigorous scientific evaluation, that's a scientist's default assumption, an assumption that relies on the null hypothesis. Under the null hypothesis, one assumes there's nothing there, that the association between A and B results from chance, then looks for reliable evidence that might lead to a different conclusion.


> No, without evidence that assumption is false. Correlation can only be evidence for an unexplained link

I did not say it was conclusive evidence; I said it was evidence. I'm well aware that "A is correlated to B" does not prove "A causes B" or even "A causes B or B causes A", but it is a data point in favor.

Saying "We should evaluate other evidence before we decide if A causes B" is reasonable skepticism. Acting as though "A is correlated to B" has no bearing whatsoever on the question of whether A causes B is another matter.

(Not that I actually disagree with most of your post, mind you! The real message of "correlation is not causation" is "don't overrate this specific data point; it's a common mistake". But the realist shouldn't underrate it either.)


> I did not say it was conclusive evidence; I said it was evidence.

But it isn't. The null hypothesis requires us to assume that there's nothing but chance at work, and let evidence force a different conclusion. The fact that A and B appear correlated is not by itself evidence of anything other than chance.

> I'm well aware that "A is correlated to B" does not prove "A causes B" or even "A causes B or B causes A", but it is a data point in favor.

No, this is false. Without testing a hypothesis, and without a careful examination of a mechanism, the correlation has precisely no meaning apart from chance.

Here's an example selected at random from a vast literature that tries to make this point:

http://boingboing.net/2010/12/20/creating-a-phony-hea.html

Title: "Creating a phony health scare with the power of statistical correlation"

Quote: "In the United Kingdom, the more mobile phone towers a county has, the more babies are born there every year. In fact, for every extra cell phone tower beyond the average number, a county will see 17.6 more babies. Is this evidence that cell phone signals have some nefarious baby-making effect on the human body? Nope. Instead, it's a simple example of why correlation and causation should never be mistaken for the same thing."
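A hidden third factor can produce this kind of pattern. The sketch below uses entirely synthetic data and assumes, purely for illustration, that county population drives both the number of towers and the number of births; the rates (one tower per 5,000 people, 0.012 births per person) and the helper names are made up and are not taken from the linked article.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares line fitted on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


rng = random.Random(1)
# Hypothetical counties: population is the hidden cause C.
population = [rng.uniform(10_000, 1_000_000) for _ in range(100)]
# Towers and births each depend on population plus independent noise;
# neither depends on the other.
towers = [p / 5_000 + rng.gauss(0, 10) for p in population]
births = [0.012 * p + rng.gauss(0, 500) for p in population]

raw_r = pearson(towers, births)
adjusted_r = pearson(residuals(towers, population),
                     residuals(births, population))
print(f"towers vs births:               r = {raw_r:.2f}")
print(f"after accounting for population: r = {adjusted_r:.2f}")
```

In this toy setup the raw correlation is strong, yet it largely disappears once the shared dependence on population is subtracted out, even though towers never influence births in the generated data.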

I could link to a thousand similar stories, many being mistaken for actual scientific results.

> But the realist shouldn't underrate it either.

A realist -- a scientist -- always begins by assuming the association is the result of chance (the null hypothesis), and then examines evidence that might argue for another explanation. This is why all self-respecting scientific papers include a p-value. The p-value describes the probability of obtaining a result at least as extreme as the one observed if chance alone were at work; it is not the probability that the hypothesis under test is true.

http://en.wikipedia.org/wiki/P-value

Quote: "In statistical significance testing the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true."

Translated into layman's language, the p-value is the probability of seeing an observation -- a "correlation" -- at least this strong if nothing but chance were at work.

A properly educated scientist always assumes the null hypothesis is true, i.e. that the observation arose from chance factors. She then tests this assumption with evidence.
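One way to make the quoted definition concrete is a permutation test, sketched below on synthetic data. This is an illustrative assumption, not a method anyone in the thread proposed: the function `permutation_p_value`, the sample size of 30, and the 2,000 shuffles are all hypothetical choices. Shuffling one series destroys any real pairing, so the shuffled correlations show what chance alone produces; the p-value is the share of shuffles that match or beat the observed correlation.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Share of random re-pairings with |r| at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


rng = random.Random(2)
xs = [rng.gauss(0, 1) for _ in range(30)]
related = [x + rng.gauss(0, 0.5) for x in xs]      # built from xs
unrelated = [rng.gauss(0, 1) for _ in range(30)]   # independent of xs

p_related = permutation_p_value(xs, related)
p_unrelated = permutation_p_value(xs, unrelated)
print(f"p-value, related series:   {p_related:.4f}")
print(f"p-value, unrelated series: {p_unrelated:.4f}")
```

Note what the number does and does not say: it measures how often chance alone would produce a correlation this strong, and it says nothing about which of A or B, if either, is the cause.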



