Correlation doesn’t imply causation. Every science writer, especially those of us who cover health or medicine studies, encounters the phrase. It’s long been a staple of science and statistics alike, written and muttered so frequently that it’s nearly achieved mantra status. And it’s catchy: I find that the axiom pops into my brain when I read headlines like this one.
The mantra’s message is simple: A measured statistical correlation between two variables, calculated in any one of a number of ways, doesn’t prove that one variable causes the other. In terms of practical advice, it means that we science writers have to find words that convey what a study actually found. So we can’t report a statistical correlation between X and Y as “X causes Y.” (For a good case study, revisit this example about hormone replacement therapy and coronary heart disease. Or there’s always the Flying Spaghetti Monster.)
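For readers who like to see the arithmetic, here is a minimal sketch of how two unrelated quantities can still produce a near-perfect correlation. Everything in it is invented for illustration: the series names, the numbers, and the helper functions `pearson` and `changes` are assumptions, not data from any study. The two made-up series share nothing except that both drift upward over time.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def changes(series):
    """Step-to-step differences, which strip out a steady trend."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical data: each series is a steady rise plus its own wiggle.
# Neither one is computed from the other.
years = range(50)
ice_cream_sales = [2 * t + 3 * math.sin(t) for t in years]
broadband_subscribers = [5 * t + 4 * math.cos(1.7 * t) for t in years]

# Correlation of the raw values is dominated by the shared upward trend.
r_levels = pearson(ice_cream_sales, broadband_subscribers)

# Correlation of the year-to-year changes ignores that trend.
r_changes = pearson(changes(ice_cream_sales), changes(broadband_subscribers))

print(f"correlation of raw values:           {r_levels:.3f}")
print(f"correlation of year-to-year changes: {r_changes:.3f}")
```

The raw values correlate almost perfectly because both simply rise with time. Once the trend is removed by looking at year-to-year changes, the correlation falls close to zero. Time is the lurking third variable here, and nothing in either number says that ice cream drives broadband.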
There’s an important caveat: The words aren’t tantamount to an overall dismissal of correlation. Correlation is a powerful tool that, when used appropriately, points researchers towards better questions. Causation is almost impossible to prove in many cases, and strong correlations may be as close as scientists can get to establishing that critical mechanisms underlie a particular condition or disease.
The phrase “correlation does not imply causation” irks some people because, when administered glibly and possibly out of context, it has the potential to cauterize an interesting discussion before it starts to flow. In this excellently cranky post from October 2012, for example, Slate’s Daniel Engber calls the maxim a “statistical cliché that closes threads and ends debates, the freshman platitude turned final shutdown.” Engber smartly tracks the history and usage of the phrase, and it’s worth a read.
As an entertaining thought experiment, I think it’s fun to imagine: What if correlation did imply causation? That is, what if every correlation did describe a cause-and-effect relationship? To really see why this is entertaining (in some sense), consider the following “spurious correlation” found by Tyler Vigen, a Harvard law student whose graphs do a better job than any blog post of reminding us that just because two sets of data follow similar patterns, they’re not necessarily related.