Demystifying Causality — How to Move from Guesswork to Knowing
If researchers agree that cause and effect is important, why is correlation the standard and causality is not? Is the blocker philosophical or logistical?
The blocker to causality (the reason researchers do not use it) is a form of stupidity, basic ignorance. Even when somebody is considered “brilliant” in, for example, marketing, it often just means they are skilled at getting ahead in the industry by focusing obsessively on their career. Such a person is very unlikely to have trained themselves in seeing clearly. By seeing clearly I mean being able to see the interconnectedness of things in a non-fallacious way.
There is a way to demonstrate this clearly. The things needed for this demonstration:
- any random object that can fit in your hand
- one person who is generally considered intelligent
- one of your hands
You start by showing the object to the person, and ask them to pay careful attention. Then you take your hands behind your back, close your hand around the object so that only a tiny fraction of it is visible, bring your hands back into the vicinity of the person and ask “is this the same object?”. As with any question, the options are:
- yes
- no
- don’t know
I’ve never met anyone who answers no, which is good, because that is not the right answer in this case. “Right” in this case means a valid statement. Do you know what the valid answer is?
From the causality vs. correlation standpoint it gets pretty interesting now. Let’s look at three possible answers:
- those that invalidly state that the object in your hand is the same as the object they had seen previously
- those that invalidly state that the object in your hand is not the same as the object they had seen previously
- those that invalidly state that they do not know if the object in your hand is the same object they had seen previously
Clearly, because all of these are invalid, we are still missing the right answer, and causality dictates that there is always necessarily a valid answer to any question that investigates phenomena. Do you know by now what the correct answer is?
“those that validly state that they do not know if the object in your hand is the same”
The important point is to show enough of the object that a visual correlation is possible, but not nearly enough of it that the subject (the person) in this test could say “yes, there is an undeniable causal relationship between the object I saw a moment ago and the one in your hand”. There is an important note to be made here regarding the object. If the object is something that could easily be replaced with something highly similar while it is behind your back, there is no validity in stating it is the same. Validity could be achieved with “can’t be sure”. This is because one of the causes of the initial object of perception is that it has not been hidden from the view of the person. Once it has been hidden (behind your back), the causes are no longer the same, and therefore the valid object of perception cannot be the same either. Simply put, the causes and conditions that lead to the observation (perception) of the object are different in the two cases. When you can no longer see the object clearly after previously having seen it clearly, certainty about it in a causal sense depends on having retained that initial state of seeing it clearly. The exception is where the object can be otherwise authenticated. An example would be a piece of paper on which the person first wrote a “secret code” that would take much longer to copy, by any means, than the time you keep the object hidden from sight behind your back.
So OK, what is the difference between invalidly and validly stating the same thing, in this case “I don’t know”? This is the key difference between correlation and causality. Whereas through correlation we may answer “I don’t know” because we know that it’s impossible to say with the information we have available, with causality we can go one step further. As opposed to making educated guesses using correlation, with causality we can say things with certainty. While that sounds great, it actually means we end up answering “I don’t know” more often (much more!) than with correlation. Also, as opposed to correlation, where we can have varying degrees of “knowing” with an infinite number of possible gradations between no and yes, with causality we only have yes, no and I don’t know. It is the guesswork nature of correlation that also makes it impossible to ever say “definitely yes” or “definitely no” using such a method.
In short, correlation rarely results in “I don’t know” and never in “definite yes” or “definite no”, whereas causality always results in one of the three.
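The contrast in the kind of answer each approach gives can be sketched in a few lines of Python. This is only an illustration, not a method from any library: `Verdict` and `causal_verdict` are hypothetical names invented here, and `pearson_r` is a plain hand-written Pearson coefficient standing in for “correlation” in general.

```python
from enum import Enum
from math import sqrt


class Verdict(Enum):
    """The only three answers a causal determination can give."""
    YES = "yes"
    NO = "no"
    UNKNOWN = "I don't know"


def pearson_r(xs, ys):
    """Correlation answers with a degree, anywhere between -1 and 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def causal_verdict(validly_established, validly_refuted):
    """Hypothetical helper: turn the outcome of a valid inference into a verdict."""
    if validly_established and validly_refuted:
        raise ValueError("a claim cannot be both established and refuted")
    if validly_established:
        return Verdict.YES
    if validly_refuted:
        return Verdict.NO
    # Neither side could be validly inferred, so the valid answer is "I don't know".
    return Verdict.UNKNOWN


# A correlation is a point on a continuum, never a definite yes or no.
r = pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(r)

# A causal determination is one of exactly three values.
print(causal_verdict(False, False).value)
```

The return types carry the point: a float can land anywhere on a continuum, while the enum has exactly three members and nothing in between.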
what makes a statement valid
The aspect of validity in causality has to do with the way the argument is structured. First and foremost, there needs to be a clear understanding of what logic is, not in the dictionary sense of the word, but in the formal, philosophical sense. Most importantly, it has to be clearly understood what a syllogism is.
an instance of a form of reasoning in which a conclusion is drawn (whether validly or not) from two given or assumed propositions (premises), each of which shares a term with the conclusion, and shares a common or middle term not present in the conclusion (e.g., all dogs are animals; all animals have four legs; therefore all dogs have four legs).
This is a simplistic view of valid inference, a technique we will investigate below. Inference is like a syllogism, and valid means that it is necessarily valid. Validity refers to the process that was used to reach a certain output. The fact that an output is the same as the one valid inference would necessarily produce (and with valid inference there can never be more than one possible output) does not constitute validity in the causal sense. The validity is in the process which was used to arrive at the output.
Even if the output is the correct answer, if it was not inferred using a valid syllogism / valid inference process, then as inference (causal determination) it is invalid, because it is based on luck and not on a repeatable process.
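To make “the validity is in the process” concrete, here is a minimal sketch that tests the form of a syllogism rather than its conclusion. It reads “all X are Y” as “X is a subset of Y” and searches every assignment of sets over a tiny universe for a counterexample. The helper names (`all_are`, `subsets`, `is_valid_form`) are made up for this illustration. A search over three elements is a counterexample hunt, not a general proof.

```python
from itertools import product


def all_are(xs, ys):
    """'All X are Y' holds when X is a subset of Y."""
    return xs <= ys


def subsets(universe):
    """Yield every subset of the universe as a frozenset."""
    items = list(universe)
    for mask in product([False, True], repeat=len(items)):
        yield frozenset(i for i, keep in zip(items, mask) if keep)


def is_valid_form(premises, conclusion, universe=(0, 1, 2)):
    """True if no assignment makes all premises true and the conclusion false."""
    candidates = list(subsets(universe))
    for a, b, c in product(candidates, repeat=3):
        if all(p(a, b, c) for p in premises) and not conclusion(a, b, c):
            return False  # counterexample found: the form is invalid
    return True


# All A are B; all B are C; therefore all A are C.
chained = is_valid_form(
    premises=[lambda a, b, c: all_are(a, b), lambda a, b, c: all_are(b, c)],
    conclusion=lambda a, b, c: all_are(a, c),
)

# All A are B; all C are B; therefore all A are C.
shared_middle_only = is_valid_form(
    premises=[lambda a, b, c: all_are(a, b), lambda a, b, c: all_are(c, b)],
    conclusion=lambda a, b, c: all_are(a, c),
)

print(chained, shared_middle_only)
```

The second form can still land on a true conclusion for some particular choice of A, B and C, but the search finds assignments where the premises hold and the conclusion fails, which is exactly the “based on luck” situation described above.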
valid inference — the process
The way I’ve learned valid inference is that there need to be “three modes”.
- presence of the reason in the subject
- forward pervasion
- reverse pervasion
The first mode is presence of the reason in the subject. For example, if we see smoke coming out of a building and argue that therefore there must be combustion present, the subject is the building, the reason is the smoke coming out of it, and what we are establishing is combustion. So in the case of the statement “because there is smoke coming out of the building, there must be combustion also”, we can find that the reason is present in the subject when we investigate it.
The second mode is forward pervasion, i.e. a logically undeniable relationship, in this case “when there is smoke, there is combustion.”
Then the third mode is the opposite, reverse pervasion, “if there is no combustion, there can be no smoke”.
The first makes sure the sentence itself is logical, the second and third make sure that it’s true without exception. Once all three are true, then the inference is valid. Otherwise there is necessarily doubt.
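The three modes can be sketched as three checks over a list of known cases. This is a rough illustration under stated assumptions, not the traditional formulation: `Case`, `three_modes` and `infer_combustion` are hypothetical names, the example cases are invented, and mode one is modelled simply as “the building in question really does show smoke”, which is one reading of the first mode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """A previously examined situation (hypothetical example data)."""
    name: str
    smoke: bool
    combustion: bool


def three_modes(subject_shows_smoke, known_cases):
    """Return the three checks as a tuple of booleans."""
    # Mode 1: the smoke is actually present at the building in question.
    mode_1 = subject_shows_smoke
    # Mode 2 (forward pervasion): wherever there is smoke, there is combustion.
    mode_2 = all(c.combustion for c in known_cases if c.smoke)
    # Mode 3 (reverse pervasion): wherever there is no combustion, there is no smoke.
    # Over one finite list this is the contrapositive of mode 2, so the two
    # pass or fail together; they are kept separate to mirror the text.
    mode_3 = all(not c.smoke for c in known_cases if not c.combustion)
    return mode_1, mode_2, mode_3


def infer_combustion(subject_shows_smoke, known_cases):
    """'valid' only when all three modes hold, otherwise 'doubt'."""
    return "valid" if all(three_modes(subject_shows_smoke, known_cases)) else "doubt"


clean_cases = [
    Case("campfire", smoke=True, combustion=True),
    Case("red-hot iron", smoke=False, combustion=True),
    Case("cold lake", smoke=False, combustion=False),
]
# One exception is enough to break the pervasion.
with_exception = clean_cases + [
    Case("dry-ice fog logged as smoke", smoke=True, combustion=False),
]

print(infer_combustion(True, clean_cases))
print(infer_combustion(False, clean_cases))
print(infer_combustion(True, with_exception))
```

The function never returns a degree of confidence: either all three modes hold and the inference is valid, or there is doubt.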
This particular technique was devised almost 2,000 years ago. The Greeks had similar methods even before that, perhaps explained less clearly, but leading to the same outcome, i.e. validity of inference.
It’s pretty tragic when you think about it: such a process is the only way we can be sure of things and make definite statements, yet it is more or less completely unknown to all but a handful of practitioners.
non-causality is a fool’s errand.
The blocker, which the title of this commentary set out to find, is lack of training (ability), lack of motivation (too busy, too greedy, etc.) and lack of triggers (very few expect or ask for validity). Since those three are the three aspects of behavior, such behavior does not manifest. Because valid inference manifests so rarely, virtually all decisions are made based on correlation and not causality. With correlation there is always necessarily doubt.
I’m not suggesting that it’s necessarily a bad thing to have doubt, for most things that is fine. But those making decisions or influencing the decisions of others should be clear about this.
What I would like to do is conduct some form of survey or test that proves that 99.99% of “data scientists” are not able to apply causal analysis (a valid inferential process) consistently on even simple challenges. Then, on that basis, I would like to prove that the same 99.99% are not even able to explain the process when given an opportunity to do so using free text.
In the meantime, I will keep sharing this Wikipedia page with everyone I have the chance to share it with:
I added one of my own, the “scientific bias fallacy”, which applies when a person believes that something can’t be true because there is no scientific evidence about the topic. It was removed pretty fast.