March 7, 2011

Questioning correlations is good statistical hygiene

John Paulos, A Mathematician Reads the Newspaper:
"A more elementary widespread confusion is that between correlation  and causation. Studies have shown repeatedly, for example, that children with longer arms reason better than those with shorter  arms, but there is no causal connection here. Children with longer arms reason better because they're older! Consider a headline that invites us to infer a causal connection: BOTTLED WATER LINKED TO HEALTHIER BABIES. Without further evidence, this invitation should be refused, since affluent parents are more likely both to drink bottled water and to have healthy children; they have the stability and wherewithal to offer good food, clothing, shelter, and amenities. Families that own cappuccino makers are more likely to have healthy babies for the same reason. Making a practice of questioning correlations  when reading about "links" between this practice and that condition is good statistical hygiene."
And the final recommendations:
"If statistics are presented, how were they obtained? How confident can we be of them? Were they derived from a random sample or from a collection of anecdotes? Does the correlation suggest a causal relationship, or is it merely a coincidence? And do we understand how the people and various pieces of an organization reported upon are connected? What is known about the dynamics of the whole system? Are they stable or do they seem sensitive to tiny perturbations? Are there other ways to tally any figures presented? Do such figures measure what they purport to measure? Is the precision  recounted meaningful?"