Are these things related?

Dangers in Correlating Trends (June 1, 2023)



I saw a man walking around his yard the other day.  He was holding an open umbrella high.  But it was a sunny day.  So I asked.  He told me that statistics clearly show that the greenness of lawns directly correlates with people’s use of umbrellas.


The next day, I heard of someone who went to an appointment with his cardiologist.  The doctor told him that he had a heart arrhythmia and offered a prescription for medication that would help get it under control.  The patient declined.  He told the doctor that people who use such medications have a higher incidence of Alzheimer’s disease.


These are completely fictitious stories, but I hope you can see the fallacy in each of them.  In the first case, both sets of data were driven by a third: rainfall.  Neither of the two data sets mentioned (umbrella use and green lawns) causes the other.  In the second example, an increase in Alzheimer’s disease cases is a consequence of living longer, which is a hoped-for outcome of cardiac disease treatments.  (Disclaimer: I have never heard one way or the other that any such medications have any relation to Alzheimer’s disease.)
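For anyone who likes to see this in numbers, here is a small Python sketch of the umbrella story.  Everything in it is made up for illustration: the rainfall values, the noise levels, and the variable names are my own assumptions, not real measurements.  It simply builds two quantities that each depend on rainfall and on nothing else, then measures how strongly they move together.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical daily rainfall (arbitrary units) for 1,000 days.
rainfall = [rng.uniform(0, 10) for _ in range(1000)]

# Each quantity is rainfall plus its own independent noise.
# Neither one appears in the other's formula.
umbrella_use = [r + rng.gauss(0, 1) for r in rainfall]
lawn_greenness = [r + rng.gauss(0, 1) for r in rainfall]

r_umbrella_lawn = pearson(umbrella_use, lawn_greenness)
print(f"umbrellas vs. lawns: r = {r_umbrella_lawn:.2f}")
```

With these assumed numbers the printed correlation comes out strongly positive, even though the code never lets umbrellas influence lawns or the reverse.  The only link between them is the shared rainfall.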


The time-tested rule to always remember is “correlation does not equal causation.”  This means that just because two sets of data move in lock-step, it doesn’t necessarily mean one caused the other.  The only way of truly getting meaning out of a correlation is to look for cause-and-effect relationships.  Usually it would be difficult for you and me to do this when confronted by statistics.  Only the statisticians who produced the data and other subject-matter experts can do this.


Why does any of this matter?  It is because the media sometimes reports on correlations that seem to make sense until you think more deeply about them.  This happens when a data-based study produces juicy results that will tantalize media consumers.  In my mind, the most dangerous use of the correlation-is-everything approach is in media reporting on less-than-scientific health studies.  Some are obviously problematic, as obvious as my two examples above.  Some are not obviously right or wrong.  However, until you see reporting that specifically states that the study evaluated the cause-and-effect relationships in its data, you just don’t know.  Look for statistical terms such as ‘multivariate analysis’ or ‘controlling for other variables.’
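To give a rough feel for what ‘controlling for other variables’ can mean, here is a Python sketch using the umbrella-and-lawn story.  The data is simulated and every number and name in it is my own assumption.  It uses a partial correlation, which is just one simple way to ask how two things relate once a third variable is accounted for; real studies may use quite different methods.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5

rng = random.Random(1)  # fixed seed so the run is repeatable

# Simulated (made-up) data: rainfall drives both of the other quantities.
rainfall = [rng.uniform(0, 10) for _ in range(1000)]
umbrella_use = [r + rng.gauss(0, 1) for r in rainfall]
lawn_greenness = [r + rng.gauss(0, 1) for r in rainfall]

raw = pearson(umbrella_use, lawn_greenness)
controlled = partial_corr(umbrella_use, lawn_greenness, rainfall)

print(f"ignoring rainfall:        r = {raw:.2f}")
print(f"controlling for rainfall: r = {controlled:.2f}")
```

In this simulation the first figure is large and the second sits close to zero, which is the pattern you would expect when a third variable explains the whole relationship.  A study that never looked at rainfall would report only the first figure.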


Want more examples of this?  Go to the following website: http://www.tylervigen.com/spurious-correlations

The main page contains a number of spurious (false) correlations and shows comparative data in chart format.  The day I visited, the following close yet ridiculous correlations were charted.







Get the picture?  Keep your guard up the next time someone says one thing causes another.  Always look for the cause-and-effect relationship.  Failing that, remain skeptical.