“Does everything that correlates lead to causation?”

Kalpa Vrikshika
May 17, 2018


A correlation happening here.

My journey so far in the Udacity Bertelsmann data-science course has kept pushing my thinking toward wider horizons.

Apart from all the quizzes, “keeping up with the Slack”, the assignments and the pressure to finish lessons, I took some time out to appreciate the main philosophy behind “Correlation does not necessarily imply causation”.

As a start, I’d like to outline examples and theories related to this idea, both those I learnt in the course and those I picked up during my wild reading sprees. Beware, I’m no statistician (yet), nor an expert in causal theory. This is just a step to announce my newly learnt knowledge, compel myself to write more, convince myself that I might just know something about this, and hope that someone might pick up something from it.

Let’s get started:

The Golden Arches Theory of Conflict Prevention: no two countries that both have a McDonald’s have gone to war with each other since each got its McDonald’s. REALLY?

This might have been a poorly theorized and later regretted doctrine, and it’s always fun to prove it wrong over and over. However, I’m not going to prove it right or wrong with facts in this article. Check out below if you want to know more about that.

Our question is: is this REALLY the reason that the said countries didn’t go to war?

Let’s look at the plausible reasons why they might not have gone to war, in order from silly to not-so-silly.

  1. The burgers simply made them very happy, resulting in no war.
  2. The burger indulgence made them too lazy for war.
  3. The nations opened up to globalization, with McDonald’s being the symbol of belonging to the global system.
  4. You don’t want to be the bad guy. In modern times, countries would not want to trade with other countries that are warmongers.
  5. The countries cannot afford to go to war because of economic conditions.

“No two countries that are part of the same global supply chain will fight each other: the economic penalty would be simply too high.”

So what might the reason be? It could be any of the above, which clearly indicates that the spread of McDonald’s might or might not be the reason these nations stayed out of war, so it is not NECESSARILY the cause of no war. The theory probably arose from the obvious upward slope in McDonald’s openings versus the downward slope in wars. This is correlation at its best, and a negative one at that!
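To make reason 3 concrete, here is a minimal Python sketch with entirely made-up numbers (not real McDonald’s or war data). It assumes a hidden “openness” score that pushes a hypothetical burger-outlet count up and a hypothetical conflict score down, while the two never influence each other. The names `openness`, `burger_outlets` and `conflict_score` are my own inventions for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumption: a hidden "openness to globalization" score between 0 and 1.
openness = [rng.uniform(0, 1) for _ in range(20000)]

# Both observed variables depend on openness plus their own noise,
# and never on each other.
burger_outlets = [10 * g + rng.gauss(0, 1) for g in openness]
conflict_score = [-10 * g + rng.gauss(0, 1) for g in openness]

# Across all the simulated data, the two look strongly negatively related.
r_all = pearson(burger_outlets, conflict_score)

# Hold the lurking variable almost fixed by keeping a narrow slice of openness.
kept = [i for i, g in enumerate(openness) if 0.49 <= g <= 0.51]
r_slice = pearson([burger_outlets[i] for i in kept],
                  [conflict_score[i] for i in kept])

print(f"correlation over all data:       {r_all:.2f}")
print(f"correlation at similar openness: {r_slice:.2f}")
```

In this toy setup the strong negative correlation mostly vanishes once openness is held roughly constant, which is what you would expect if the burgers were only a symptom of the hidden factor and not a cause of peace.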

When you see graphs with perfectly mirrored slopes, the mind rushes to find an immediate pattern, and one variable inherently seems like the ultimate cause of the other. In reality, the pattern might be a spurious one between two independent variables that happen to behave similarly, or the seeming “cause” might be only one contributing factor to the result. The easier and quicker you spot a correlation, the more likely you are to deem it the ultimate cause. Intuitively, the mind does not take the probabilities, variables, effects, analysis and risks into account before deeming something a cause. It just feels too intimidating, which is where part of the problem lies.
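Here is a small Python sketch of the “two independent variables with similar behavior” case, again with made-up data. It assumes two series that each follow their own straight-line trend over time plus their own noise. One common check, which I’m using here as an illustration and not as the definitive test, is to correlate the step-to-step changes instead of the raw levels, since that removes the shared time trend.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 500

# Hypothetical series: one drifts up over time, the other drifts down.
# Each has its own independent noise; neither influences the other.
rising = [t + rng.gauss(0, 5) for t in range(n)]
falling = [n - t + rng.gauss(0, 5) for t in range(n)]

# Raw levels: the opposite trends alone produce a near-perfect negative correlation.
r_levels = pearson(rising, falling)

# Step-to-step changes: the constant trend drops out of the comparison.
rising_steps = [b - a for a, b in zip(rising, rising[1:])]
falling_steps = [b - a for a, b in zip(falling, falling[1:])]
r_steps = pearson(rising_steps, falling_steps)

print(f"correlation of levels:  {r_levels:.2f}")
print(f"correlation of changes: {r_steps:.2f}")
```

The levels look tightly linked only because both series move steadily with time. Once you compare their changes, there is little left to see.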

Take, for example, one of the most scoffed-at theories: smoking causes lung cancer. Great! This is statistically correlated, of course! However, what if we add in one more factor: genetic inheritance could also be a cause of lung cancer? That does change the game mildly, doesn’t it? But here’s where the brain stalls and decides to retreat to the most primitive and immediate thought, whatever seems most easily correlated.

I must confess, I did some immediate finger-pointing at the graph and quickly murmured the cause, just like you did. Now, after some reading and with hopes of changing my mental model, I’m starting to consciously slow down my heuristics when coming to a conclusion.

Here are some random statistical fallacies you might come across, the kind that later turn into a superstition or a “lucky charm”.

  1. Correlation: One day I wore a red top to an exam and got an A. Fallacy: I now wear a red top to all my exams so as to get an A.
  2. Correlation: As I was whistling, the rain started to pour. Fallacy: The rain pours every time I whistle.

(Can you relate to this as much as I can?) It’s so easy to fall prey to such fallacies that they ultimately distort our perception of the actual cause. Perhaps I got an ‘A’ because I genuinely studied for the exam, and the rain pouring while I was whistling was merely a coincidence.

IT’S TRUE BECAUSE IT STATISTICALLY CORRELATES

“It’s true because it statistically correlates” is an excuse often used to justify almost anything and to prove your hunch right, a habit also known as confirmation bias. Well then, there is a statistical correlation between the divorce rate in Maine and the consumption of margarine, and per capita cheese consumption correlates with the number of people who died by becoming tangled in their bed sheets. Check out below to have a good laugh at these spurious correlations.
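One plausible way such absurd pairs turn up is simply by searching through enough unrelated series. The Python sketch below is a toy illustration under my own assumptions: every series is a made-up ten-point random walk, and the hypothetical `random_walk` helper and the count of 1000 candidates are chosen only for the demo. It is not how the margarine or cheese figures were actually found.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(2)  # fixed seed so the run is repeatable

def random_walk(steps):
    """Made-up 'yearly statistic': a running sum of independent random steps."""
    level, path = 0.0, []
    for _ in range(steps):
        level += rng.gauss(0, 1)
        path.append(level)
    return path

# One target series and 1000 candidates, all generated independently of it.
target = random_walk(10)
candidates = [random_walk(10) for _ in range(1000)]

# Keep the candidate with the largest correlation in absolute value.
correlations = [pearson(target, c) for c in candidates]
best = max(correlations, key=abs)

print(f"strongest correlation among {len(candidates)} unrelated series: {best:.2f}")
```

With only ten points per series and a thousand tries, a very strong match is almost guaranteed to appear by chance, even though nothing here is connected to anything else.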

Be wary when shouting conclusions from just data.

I’m sure it clearly stands out that the correlations above are NOT causation, because these examples are far too absurd and silly. They are given only as an extreme of what is not so, in order to show what is, an approach also known as negative proof or proof of impossibility.

In upcoming articles, I’ll try to explore how to tell causal theories apart from fallacies or mere correlations, as I’m currently reading (excessively) on that.

Until then, let’s appreciate that correlations do exist and that we do make immediate assumptions from them. Let’s also appreciate that this is not always the right method, and that we need to consider the lurking factors before blurting out “It’s proven statistically!”.

