Knowledge and Logic— [Part 2] of Morality
Previous essay in this series: Prologue
I think, therefore I am — René Descartes.
This was Descartes’ way of saying that you cannot doubt your own existence. The very act of doubting your existence proves that you exist, because if you did not exist, you wouldn’t be able to doubt anything. This is the general explanation given for Descartes’ proposition. You will find that the essence of this quote has a lot to do with the content of this essay. The quote even summarizes the title of this essay: I think (a fact: Knowledge), therefore (a deduction: Logic) I am (a fact: Knowledge).
Let us rephrase and unwind the original quote in a more technically accurate way:
The very act of subjectively experiencing doubt in your own existence proves that it is true that you exist because you couldn’t have otherwise experienced the doubt.
We can go even further:
The very act of you subjectively experiencing doubt in your own existence puts you in a position where you have no other option but to logically deduce that it is true that you exist and thus subjectively experience the belief that you do exist because it is impossible for you to doubt without existing in the first place.
Presumably, many will agree that the subsequent rephrasing of the original quote does not add any substance to the original statement made by Descartes. The subsequent expansions are, arguably, needlessly long; and so we choose to use words as shorthand to replace long phrases and make the message crisp and easy to communicate and remember. This is the reason unravelling the original quote is sometimes so important — because it can be possibly enlightening. This unravelling shows us what substitutions have been used and what the words we used as shorthand mean in the first place. For example, ‘I think’ is a shorthand for ‘the act of subjectively experiencing doubt or any other such experience’. ‘therefore’ means ‘it proves that’ which in turn means ‘we have no other option but to logically deduce that’. ‘I am’ means ‘It is true that I exist’ which in turn means ‘I know that it is true that I exist’.
The aim of this essay is to build the underlying meta-logical, logical and axiomatic foundations on which we can hope to construct a moral framework. The hope is that these same foundations will help us answer not only moral but many other philosophical dilemmas.
What is knowledge? Can a universe without life have knowledge? When the earth was teeming with single-celled organisms, was there any knowledge on earth? At what point in the history of the universe did the first quantum of knowledge come into existence? These are very interesting questions. But they can be answered only if we unambiguously define knowledge. And to do that, you and I need to go on a pseudo-chronological journey.
At the beginning, when the universe arguably came into existence as a primordial atom, there was nothing to describe in it. There were presumably just infinities everywhere: infinite mass, infinite energy, infinite density, infinite temperature, infinite space-time curvature etc. It is even debatable whether talking about concepts like mass, energy and temperature makes sense in such conditions. But once the universe began expanding, tiny quantum fluctuations presumably led to various asymmetries and inequalities. As a result, things started to look somewhat like the universe we see today: matter, energy, force-fields, and space-time started interacting with each other and arranging themselves in relation to each other in specific ways. For example, a particular class of wave-particles carrying electromagnetic energy across space-time, which we today call photons, began moving around at 299,792,458 m/s in vacuum. Atoms were formed with specific configurations which had direct consequences on how those atoms interacted with other atoms. Later, when life was formed, the A, C, G and T molecules arranged themselves in different ways along the double helix called DNA; this arrangement is the signature of that living being. All these configurations, arrangements, behaviours, constants etc. are what we call information. The cumulative information of everything that exists is the signature of this universe. We also call this information Objective Reality.
From Information to Knowledge
Now let us say you and I, two cognitive beings, somehow come into existence in this unknown environment we now call earth. Imagine that this is the beginning of our existence. We know nothing about our surrounding environment except for, let’s say, some basic survival instincts. This is when things become much more complex. Cognitive beings have the faculty of memory. In those memories, we have the ability to store and retrieve entities called statements. We call the cumulation of all the statements that we store in our memories knowledge. And this ability to possess knowledge (in conjunction with the ability to act, volition) is a tool we cognitive beings use to navigate through our existence. For example, if your senses show you a tiger near the horizon approaching you, you have the ability to store this as a statement and act on it: run! (This assumes you also know that tigers are dangerous, a very useful piece of knowledge.) Likewise, when we see a cup at the edge of a table, we realize that we already know that cups at the edge of tables are likely to fall and break. From this knowledge, we decide to put the cup in a more secure place. The cup, on the other hand, couldn’t have avoided its fate by itself. Only cognitive beings like humans can perform such pre-emptive operations, thanks to our ability to store and retrieve knowledge.
It’s both important and interesting to note that the story we told ourselves about objective reality is just that: a story. We look around and our senses tell us that there is something outside of us which fits the description of something like objective reality, so we accept it as a plausible story. But we can’t be sure. There is one thing, though, that you can be sure of: at this point in time, you exist. You think, therefore you are. This is our first piece of knowledge, the one statement which is just true by definition: undoubtable, and contingent on nothing else but the fact that we are contemplating it. At this point in time, this is the only thing you and I can be sure of. You that you exist, and I that I exist.
The above crude definition of knowledge and our ability to possess it leads to something really interesting: the need for truth. For example, in the above story, thanks to your knowledge, you escape the approaching tiger and survive to tell the story. Now, let us say it so happens that you sometimes see ghosts attacking you at night and, as a result, get panic attacks. You know that a ghost has never harmed you physically. But the sight still causes you fear and suffering. You realize that this is possibly because the statements ‘ghosts exist’ and ‘ghosts are dangerous’ reside in your memory. While the statement that tigers are dangerous saved your life, the statement that ghosts are dangerous is causing you unnecessary suffering. Not all statements seem to be equal. We need a way to deal with these two classes of statements differently. We need an effective way to reject the knowledge which is detrimental to our existence and to retain the knowledge that helps us. Without this ability, knowledge can cease to be a tool and become a burden instead. One way to do this is to label each statement.
There are many, possibly infinite, ways to do this. The simplest way to do this is to mirror our requirements: We need knowledge to be in one of two states — acceptable or rejectable. So let us create two labels — True and False. These words will inevitably come with a huge mental baggage because we are so much aware and accustomed to their usage in our cultures. But we need to curb the temptation to interpret these labels in this context. We need to remember that at this point, all we know is this: We are cognitive beings, we can store and retrieve knowledge, but we do not want to treat every statement equally— so we decide to label statements so that we can reject some and accept some. It so happens that we chose the label True for accepting and False for rejecting. We could have chosen any two labels; it wouldn’t have made any difference as long as we treat them just as labels because that’s what they are. But because we, as a human civilization, have historically chosen True and False to be those labels, let us use them and see if the way we use them today is consistent with the foundations on which their existence depends in the first place. At this point we have built the first foundation of the system which we generally call classical logic.
Now that we have chosen the two labels called ‘true’ and ‘false’, let us say we begin sifting through all the statements that we know and assign them a label (their truth value — either true or false), so that we are in a position to reject the ones which are false and retain the ones which are true. Also, we decide that every time we encounter a new statement, we will try to assign it its truth value — a way to keep our knowledge organised. But there are cases where this might not be feasible. We may not have any way to assign a label to some statements because it might not be obvious at first glance if a statement is acceptable or rejectable. In such cases, we don’t assign the statement any label and wait for a time when we would be able to do so — until then, they are ‘unknown’. We have thus augmented our initial bivalent system to transform it into a trivalent system - ternary logic.
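The labelling scheme described so far can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary `knowledge`, the helper names `accepted`, `unknown` and `assign`, and the choice of `None` as the placeholder for ‘unknown’ are all hypothetical, and the example statements are borrowed from the tiger, ghost and cup stories above.

```python
from typing import Dict, List, Optional

# Hypothetical store: statement text -> label.
# True = acceptable, False = rejectable, None = not yet labelled ("unknown").
knowledge: Dict[str, Optional[bool]] = {
    "tigers are dangerous": True,
    "ghosts are dangerous": False,
    "the cup at the edge of the table will fall": None,
}

def accepted(kb: Dict[str, Optional[bool]]) -> List[str]:
    """Statements we currently retain because they are labelled true."""
    return [s for s, label in kb.items() if label is True]

def unknown(kb: Dict[str, Optional[bool]]) -> List[str]:
    """Statements still waiting for a truth value."""
    return [s for s, label in kb.items() if label is None]

def assign(kb: Dict[str, Optional[bool]], statement: str, label: bool) -> None:
    """Replace the placeholder once we are able to decide the statement."""
    kb[statement] = label

print(accepted(knowledge))
print(unknown(knowledge))
```

Note that `None` here is a placeholder rather than a third truth value: calling `assign` later moves the statement into one of the two original classes.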
We have now covered three kinds of statements — statements which are labelled true, statements which are labelled false, and statements which are not yet labelled (having a truth value of ‘unknown’). It follows that we might now wonder about a fourth case — what about statements which are both true and false? Firstly, what do we do with such a statement? We cannot do anything with it in the current scheme of things because we cannot both accept and reject a statement. It seems that labelling a statement as both true and false is of no use to us. It undermines the very purpose of choosing the two labels in the first place — to be able to treat the two classes of statements differently. Here, we have a choice: we either choose to allow for statements which are both true and false and just hope that we don’t end up in a situation where most of the statements we know end up being both true and false and hence useless, or we decide to be stricter— we adopt a rule that we will not allow for our system to have statements which are both true and false.
Among the above two options, the latter is one of the first laws of classical logic: the law of non-contradiction, which states that a statement cannot be both true and false. As intuitive, obvious and commonsensical as it might seem that something cannot be both true and false at the same time, it is important to realize that such intuition and obviousness might be coming from our prejudices about truth and falsehood (as a great person probably once said, common sense is the collection of prejudices acquired by age eighteen). At this point, the decision to not allow a statement to have both the labels true and false is fundamentally arbitrary. And it is important that we acknowledge this; so important, in fact, that we should give it a name.
The principle of arbitrary choices
When building a logical system, we might have to make arbitrary choices. The arbitrary choices take the form of ascertaining that certain actions are allowed or that certain statements are true in that system. These arbitrary choices are called laws and axioms of the logical system. Every time we make such a choice, we create a new logical system on top of the existing one (This leads to the hierarchy of logical systems). Let us review what choices we have made till now.
The choice of truth: This is the first choice we made. We were born cognitive beings, with memory and volition and the ability to ‘know’ statements. We had no choice in the matter. But the moment we decided that we needed to differentiate statements into two classes, we made an extremely consequential choice. If we had not made this choice, the world would not have had logic, because without this choice there is no logic. This is the reason you will never find a logical system which doesn’t have a concept similar to that of truth built into it, and this is why this choice is the ‘primordial choice’.
Just imagine a world where cognitive beings didn’t believe in the concept of truth. It would be very different from the world we humans live in. Such a world is possible only if it has no lying, no deception (self or otherwise), no illusions or biases, no mirages, no cons, no pretence, no imagination, no hypotheticals, and no abstract thinking. In such a world, when cognitive beings encounter a statement, there is no reason for that statement to be rejectable. In such a world, there is no need to label statements; all statements are true by default.
The choice of the number of labels: We needed at least two labels by requirement. We chose two (‘true’ and ‘false’) even though we could have chosen more. This choice is the first step towards building classical logic: the principle of bivalence.
The choice of allowing ‘unknown’ statements: After some consideration, and some dissatisfaction with being forced to use only two labels, we ‘felt’ that two labels was one too few. So we weakened the principle of bivalence to add a new label called ‘unknown’, and embraced trivalence. (Note that the principle of bivalence still holds in this trivalent system: a statement can only be either true or false, nothing else. Trivalence just allows us to use a placeholder label until we can ascertain the truth value of a statement.)
The law of non-contradiction: After further consideration, we chose to adopt a rule that prevents a statement from having both true and false labels. Making this choice is a momentous point in the pseudo-chronology of building logic. This single choice, as some of you might know, leads to huge consequences. This choice might seem like an obvious one to make, but there might be reasons not to make it. In mathematics, which is a branch of logic, this law looms large. Amazing human intellects have been striving hard over the millennia to make sure this law is upheld. Mathematicians have written treatises, some a lifetime’s worth, only to see them rejected and thrown into the trash because they broke this law. Mathematicians had been trying for ages to use their logic (mathematics) to prove that the whole of the current mathematical knowledge conforms to this law, until Gödel, in 1931, published two mind-bending results. The first result proved that there are truths in some logical systems which cannot be shown to be true by deduction using the same logical system, and that the cumulation of all mathematics was one such system. The second result proved that, if mathematics is consistent, the statement ‘the whole of mathematics does not break the law of non-contradiction’, in other words ‘mathematics is self-consistent’, is one such statement which cannot be proven to be true using mathematical deduction; it is potentially eternally ‘unknown’: it is unknowable. Though it might not be possible to use logic to deduce that maths is consistent, it is possible to deduce that a mathematical statement is false when it had already been proven to be true, which in turn would show that maths is inconsistent. So there is a way to know if ‘maths is consistent’ is false. Until then, it is ‘unknown’, and it can never be proven to be true from within. If this doesn’t blow your mind, you are either too familiar with this result, or you have not completely understood its implications.
The non-triviality of the effort in making sure the law of non-contradiction is not broken and the implications of Gödel’s incompleteness theorems together prompt some to invest in backtracking this choice and making the alternative choice of allowing some contradictions. Note that you cannot allow all statements to be contradictions, only some, because otherwise you will be destroying the very foundations on which logic itself is built. Systems which go down this less trodden path are called paraconsistent logics. In these systems, the law of non-contradiction states that there are at least some statements which cannot be both true and false. A statement which is both true and false is called a dialetheia, and the view that such statements exist is called dialetheism.
This section is called Logic, and we have been using this word freely up to this point. But now it is time to ask: what is logic? All the choices made before the law of non-contradiction just built the chassis of a system. But the engine was still missing. The law of non-contradiction is the first choice which gives a dynamic quality to the system: the ability to generate new knowledge. And a system which has the power to generate new knowledge is called logic. Once we adopted the law of non-contradiction, we were able to generate a new statement along with its truth value: ‘I exist’ is true. This power of using logic to generate knowledge is one of the greatest powers any cognitive being with an intellect can possess.
Before the choice of truth, we are just cognitive beings going through the motions of existence. But once we make this choice, a great burden is upon us: the burden of assigning truth values to statements. We can go about assigning truth values individually to every statement we encounter. But this is not optimal. In a world which is as massive, complex, and confusing as ours, it seems pragmatic that, if possible, we build a system of rules which can automate, expedite and greatly simplify the process of truth value assignment —this is the process of knowledge generation. We call such a system logic. And to build such a logic, we adopt rules, called laws of logic. We already adopted the law of non-contradiction and used this law to deduce that ‘I exist’ is true. We can make this system of logic more robust, powerful and capable of ascertaining the truth values of many more such statements. All we need to do is adopt more laws which can power the system. These laws form the foundation of the system we call classical logic:
- Law of identity: Everything is the same as itself.
- Law of non-contradiction: A statement cannot be both true and false (mutual exclusivity).
- Law of excluded middle: If a statement is not true, it has to be false. If a statement is not false, it has to be true (joint exhaustivity).
- Logical conjunction: ‘A ∧ B’ is true if and only if both A and B are true. ∧ is called logical conjunction. Colloquially it is called the logical AND.
- Logical disjunction: ‘A ∨ B’ is false if and only if both A and B are false. ∨ is called logical disjunction. Colloquially it is called the logical OR.
- Material Conditional: ‘A → B’ is true if and only if ~A ∨ B is true. As a corollary, if ‘A → B’ is true, and A is true, it can be deduced that B is true. Because this law gives us the power to deduce truths, we can call it the law of deduction.
- Monotonicity of entailment: If A implies B, then A and C together also imply B
- Commutativity of conjunction: ‘A ∧ B’ is the same as ‘B ∧ A’.
As the principle of arbitrary choices tells us, these laws are fundamentally arbitrary from the perspective of the logical system we are building, and it is very important to keep in mind that every one of these choices (or non-choices, as the case may be) will fundamentally alter the nature of the system we operate in. If we wish to better understand the consequential importance and sheer power that a law can have, we can reconsider some of them to see how not choosing them changes things.
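The laws listed above can be checked mechanically over every combination of truth values. The following is a minimal sketch, not a proof: Python's `bool` type already behaves classically, so the check only confirms that the laws agree with each other on a two-valued domain. The names `implies` and `check_classical_laws` are hypothetical helpers introduced for illustration.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material conditional: 'A -> B' is true exactly when (not A) or B is true."""
    return (not a) or b

def check_classical_laws() -> bool:
    """Return True if every listed law holds for all two-valued assignments."""
    values = (True, False)
    for a in values:
        if a and not a:            # non-contradiction: never both true and false
            return False
        if not (a or not a):       # excluded middle: always one of the two
            return False
    for a, b in product(values, repeat=2):
        if (a and b) != (b and a):                  # commutativity of conjunction
            return False
        if (a or b) != (not (not a and not b)):     # disjunction is false only if both are false
            return False
        if not implies(implies(a, b) and a, b):     # deduction: from A -> B and A, infer B
            return False
    for a, b, c in product(values, repeat=3):
        # monotonicity: if A implies B, then A and C together also imply B
        if not implies(implies(a, b), implies(a and c, b)):
            return False
    return True

print(check_classical_laws())
```

Because there are only two labels, eight rows are enough to cover every case; a system with more labels would need its own tables.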
Law of identity: This is the first of the three laws of classical thought. It is hard to even start to imagine how we can do away with this law. Everything is the same as itself. If some things are not same as themselves, what are they? What can they be? Can we even talk about such a thing? This might be the reason this is called the fundamental law of knowledge.
Law of non-contradiction: This is the second of the three laws of classical thought. It states that no statement can be both true and false. We already saw how allowing all statements to be both true and false trivializes the whole concept of truth. But why restrict every statement from being both true and false? Why not allow this for some statements, and prevent it for others? Surely this will not trivialize the whole concept of truth, will it? Why adopt a law which restricts us so much? There is a reason for this. Let us assume we want to build a logical system which adopts the law of logical disjunction.
- Let a be both true and false.
- since a is true, a ∨ q is true. (by logical disjunction)
- since a is also false, q has to be true; otherwise, a ∨ q wouldn’t have been true. (by logical disjunction)
Did you see what just happened? Just by assuming that some statement is both true and false, we were able to prove that a random unspecified statement is true. This same proof can be used to show that the opposite of q is also true. This is called the principle of explosion. If there is even one contradiction in a logical system, every statement in that system ends up becoming a contradiction. This is the reason we adopt the law of non-contradiction. It is a fundamental necessity for preserving the essence of truth.
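The same effect can be seen semantically with a small brute-force check. This is a sketch under stated assumptions: `entails` is a hypothetical helper that treats each statement as a Python function of a ‘world’ (an assignment of truth values to the atoms a and q), and it reports that a conclusion follows when no world satisfies all the premises while falsifying the conclusion.

```python
from itertools import product

def entails(premises, conclusion, atoms):
    """True if the conclusion holds in every valuation that satisfies all premises."""
    for vals in product((True, False), repeat=len(atoms)):
        world = dict(zip(atoms, vals))
        if all(p(world) for p in premises) and not conclusion(world):
            return False  # found a counterexample world
    return True

atoms = ["a", "q"]
# 'a is both true and false' written as two premises.
contradiction = [lambda w: w["a"], lambda w: not w["a"]]

q_follows = entails(contradiction, lambda w: w["q"], atoms)
not_q_follows = entails(contradiction, lambda w: not w["q"], atoms)
q_from_a_alone = entails([lambda w: w["a"]], lambda w: w["q"], atoms)

print(q_follows, not_q_follows, q_from_a_alone)  # True True False
```

With contradictory premises no world can ever serve as a counterexample, so both q and its opposite come out as consequences. With the single consistent premise a, the arbitrary statement q no longer follows.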
When Russell’s paradox was discovered at the beginning of the 20th century, it was described by some as ‘a disaster’. A contradiction had been discovered in the prevailing formal mathematics, naive set theory. The contradiction lies in the answer to the question: ‘Does the set R, containing all sets which do not contain themselves, contain itself?’ If R does not contain itself, it should contain itself by the definition. If R contains itself, it should not contain itself by the definition. R contains R implies R does not contain R. R does not contain R implies R contains R. This is a contradiction. By the principle of explosion, using this contradiction, anything in mathematics could be proven to be true: 1 = 2 is true, 1 ≠ 2 is true, 1 < 2 is true, 1 > 2 is true etc. At first glance, this looks like the undoing of millennia of mathematical discovery and knowledge. Fortunately, the discovery of the contradiction eventually led to a different system, called Zermelo–Fraenkel set theory, which avoided the contradiction, and mathematics was rescued from the jaws of a catastrophe. Logic is not easy.
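A tiny check makes the shape of the paradox concrete. This is only an illustration, assuming we let a hypothetical variable `r` stand for the truth value of ‘R contains R’; the definition of R then demands that `r` equal its own negation.

```python
# r stands for the truth value of 'R contains R' (a hypothetical encoding).
# By the definition of R, R contains R exactly when R does not contain R,
# so a consistent answer would have to satisfy r == (not r).
consistent_answers = [r for r in (True, False) if r == (not r)]

print(consistent_answers)  # []
```

Neither label survives, which is why naive set theory could not assign ‘R contains R’ a single truth value.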
Law of excluded middle: This is the third of the three laws of classical thought. Previously, when we weakened bivalence to adopt trivalence, we let statements have a label ‘unknown’. This does not mean that a statement which is unknown is neither true nor false. All it means is ‘the truth value of the statement is not known at this point’. If we are able to show that a statement which is unknown is actually not false, we would immediately be able to assign the label ‘true’ to it. This means that we had inherently assumed the law of excluded middle. Not adopting this law means that we accept that there can be statements which are not true and also not false. To see the power of this law, see the following proof of the target statement: ‘there exist two irrational numbers a and b such that a^b (a raised to the power b) is rational.’
- Let a = √2 and b = √2
- We know that a and b are irrational
- Let x = a^b = √2^√2
- If x is rational, the target statement is true.
- If x is irrational, let y = x^a = (√2^√2)^√2 = √2^(√2·√2) = √2^2 = 2. Since x and a are irrational and y is rational, the target statement is true.
- By law of excluded middle, x is either rational or irrational. So in all the possible cases, the target statement is true, so we have proved the target statement.
The sixth step in the proof can only be considered acceptable if we adopt the law of excluded middle. If we don’t, there may be numbers which are neither rational nor irrational, which means √2^√2 might be one such number. So the above proof doesn’t hold without assuming the law of excluded middle. This law gives us the power to generate more knowledge. It should be clear how discarding this law hampers this ability.
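The arithmetic in the fifth step can be sanity-checked numerically. This is a floating-point illustration only, not a proof: it cannot tell us whether √2^√2 is rational, which is exactly the question the proof manages to sidestep. The variable names follow the proof above.

```python
import math

a = math.sqrt(2)   # a = b = √2
x = a ** a         # x = √2^√2, roughly 1.63
y = x ** a         # y = (√2^√2)^√2, which the proof says equals 2

y_is_two = math.isclose(y, 2.0, rel_tol=1e-9)

print(x)
print(y)
print(y_is_two)
```

The comparison uses `math.isclose` because floating-point rounding means `y` need not be exactly `2.0`.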
Each law we adopt gives logic more power of knowledge generation while preserving the foundations on which it was built. This process of knowledge generation through logic is called logical inference. But what is it about logic that gives it this unique power to infer knowledge? What do each of these laws do, that cognitive beings using logical systems gain the ability to build a colossal body of cumulative knowledge? Answering this question is of the utmost importance because the knowledge of these basics can help us build new, different logical systems if and when needed. If we come to a realization that the logical systems we currently use are inadequate, and that we need to build better, more powerful, more useful systems, we will need to understand the essence of what gives logical systems their power.
The principle of truth by elimination
Logic gets its power from the ability to eliminate possibilities. Every law we adopt to build a logical system is in the service of one goal and one goal only: to give us the ability to eliminate options. We have already seen how the law of non-contradiction does this: knowing that something is true lets us eliminate the possibility that it is false. From one statement (x is true) we were able to generate a new statement (x is not false) because the law lets us eliminate the option where x is false. The same is true with the law of the excluded middle: knowing that a statement is not true lets us eliminate the possibility that the statement is anything but false. Knowing that the statement ‘raising an irrational number to the power of another irrational number can give a rational number’ is true in both cases, one in which √2^√2 is rational and one in which it is irrational, we were able to prove the statement to be universally true. We were able to infer the target statement because the law lets us eliminate the possibility that √2^√2 can be anything other than rational or irrational. Take away this law from us, and we lose the power to infer the last statement. Once we eliminate all other options, we have no other option but to label the last option standing as ‘true’. This has been famously expressed in one of Sherlock Holmes’ lines: ‘How often have I said to you that when you have eliminated the impossible, whatever remains, however improbable, must be the truth?’ (The Sign of the Four (1890): Chap. 6, p. 111). This is the essence of the process of logical inference.
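Elimination can be sketched as a filter over a list of options. The helper names `eliminate` and `last_standing`, and the extra label ‘neither’ used to mimic a system without the law of excluded middle, are assumptions made for illustration.

```python
def eliminate(options, ruled_out):
    """Drop every option that a law or an observation rules out."""
    return [o for o in options if o not in ruled_out]

def last_standing(options, ruled_out):
    """Return the single surviving option, or None if elimination is not decisive."""
    remaining = eliminate(options, ruled_out)
    return remaining[0] if len(remaining) == 1 else None

# With the law of excluded middle, a statement has only two possible labels.
with_lem = ["true", "false"]
# Without it, a third possibility has to be kept on the table.
without_lem = ["true", "false", "neither"]

print(last_standing(with_lem, {"true"}))     # false
print(last_standing(without_lem, {"true"}))  # None
```

Ruling out ‘true’ settles the matter only when the law has already removed every other alternative; with the third label present, the same observation leaves two options standing and nothing can be inferred.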
The hierarchy of logical systems
As we add laws to an existing system, we keep changing the nature of the current system of logic in such a way that we can do things which were impossible to do before and find truths which were impossible to find before. Depending on what laws we adopt and the order in which we adopt those laws, we end up with a hierarchy of logical systems — where more sophisticated systems of logic are built on top of simpler systems. With more sophistication comes more power to deduce truths and make logical inferences. But this power comes at a cost, as increased strictness — as a restriction on the kind of things which are allowed to be expressed in the new system. Every time we make the choice of adopting or discarding a law, we face this trade-off between power and expressiveness. So how do we make the choice of whether to adopt a law or not? As the principle of arbitrary choices says, for all we know, the choice is fundamentally arbitrary.
Nevertheless, viewing different logical systems in a hierarchical structure is of great help, as we will see in the future. Here is an example hierarchy which depicts the various choices and the results of those choices:
The identity of a logical system is the laws that are at its foundation. These laws determine what kinds of inferences/proofs/operations can be performed on a statement. But how do we start using a logical system when, at the beginning, there are no statements? The answer is that we cannot. Once we have chosen a logical system, to begin using it we need to start with a few axioms — statements which are true by definition. Without axioms to fuel the engine of logic, the system is non-operational and useless. The set of axioms we choose within the context of a logical system forms an axiomatic system.
It should be noted that in building the proof that ‘I exist’ is true, we first built a rudimentary axiomatic system in classical logic with the following axioms:
- definition of I — The one making the statement
- definition of exist — The state in which an entity is said to be when it is in possession of something else which exists
- There exists a subjective experience of doubt about my existence
- I possess the above doubt about my existence.
The first two axioms are linguistic in nature. The last two are ontological in nature. With these 4 axioms, we were able to use the laws of classical logic to deduce that ‘I exist’ is true.
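The deduction from these four axioms can be traced step by step. The encoding below is a hypothetical one chosen for illustration: the set `exists` holds what the third axiom asserts, the set `possesses` holds the fourth, and `derive_existence` applies the definition of ‘exist’ from the second axiom as a rule.

```python
# Axiom 3: a subjective experience of doubt exists.
exists = {"doubt"}
# Axiom 4: 'I' (axiom 1: the one making the statement) possess that doubt.
possesses = {("I", "doubt")}

def derive_existence(exists, possesses):
    """Apply axiom 2 until nothing new follows: whoever possesses something that exists, exists."""
    derived = set(exists)
    changed = True
    while changed:
        changed = False
        for owner, thing in possesses:
            if thing in derived and owner not in derived:
                derived.add(owner)
                changed = True
    return derived

i_exist = "I" in derive_existence(exists, possesses)
print(i_exist)  # True
```

Removing the third axiom (passing an empty set for `exists`) leaves the rule with nothing to work on, which is the sense in which axioms fuel the engine.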
There can be infinitely many axiomatic systems within a logical system depending on the axioms chosen. How do we decide which axioms to choose? The answer again, is that it is fundamentally arbitrary from the perspective of the logical system. Each additional axiom forks the current axiomatic system in a way that the new system A' is fully consistent and compatible with the original axiomatic system A (meaning: no statement inferred in A' will contradict any of the statements inferred in A) as long as none of the new axioms contradict the older axioms. This ability to build systems on top of existing systems just by adding non-contradictory axioms is due to the modular nature of logic.
With each axiom added, we gain the ability to infer more truths, but this again comes at a cost: the risk of making the system inconsistent. Every axiomatic system is an axiomatic child of all the systems whose axioms are a subset of the axioms of the system. These parent-child relationships form the hierarchy of axiomatic systems within a logical system. In mathematics, for example, Zermelo–Fraenkel set theory (ZF) is a parent axiomatic system of ZFC, where only one axiom, called the axiom of choice, is added to the original axioms of ZF to form ZFC.
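The parent-child relationship is a subset test on sets of axioms. In this sketch the axioms are represented only by their usual names, and `is_parent` is a hypothetical helper; it assumes ‘parent’ means a proper subset, so a system is not counted as its own parent.

```python
# Axiom names only; the content of each axiom is not modelled here.
ZF = frozenset({
    "extensionality", "regularity", "specification", "pairing",
    "union", "replacement", "infinity", "power set",
})
ZFC = ZF | {"choice"}  # ZFC adds the axiom of choice to ZF

def is_parent(parent, child):
    """A system is a parent of another when its axioms are a proper subset of the child's."""
    return parent < child

print(is_parent(ZF, ZFC))   # True
print(is_parent(ZFC, ZF))   # False
print(sorted(ZFC - ZF))     # ['choice']
```

The subset test says nothing about consistency; whether the added axiom contradicts the older ones has to be settled separately.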
From here on in this essay, we will use classical logic as our foundation. Why classical logic? The principle of arbitrary choices.
Not the only Logic, Not the only truth
This is the point in this pseudo-chronological journey we have been tracing together where it is time to step back and zoom out. We have been able to conceptualize ‘(classical) logical truth’ and build a system we called ‘(classical) logic’, which was designed to have the power to eliminate all options except one and, as a result, home in on a logical truth. We were also able to use this logic to perform our first deduction: ‘I exist’ is true, the primordial statement. Make no mistake, for a couple of cognitive beings who have just been born in an unknown environment, this is a huge achievement. Not only have we been able to deduce that ‘I exist’ is true, but we were also able to deduce that we should run away from a tiger we knew was dangerous and, as a result of that deduction, saved our lives. One of us was also able to ‘reject’ the statement ‘ghosts are real’ by assigning it the truth value ‘false’, and was thus spared a lot of avoidable suffering.
But there’s a problem. We cannot go very far with the kind of logic we have built. To see why, let us continue on our pseudo chronological journey. On Day 1 CE (cognitive era) we saved ourselves from a tiger. Let us say, the same thing repeats on Day 2 CE. And again on Day 3 and so on… We see tigers, and we run, and we live — a rather tiring method of survival. Now on Day 7, after dusk, from behind a bush, we see a tiger strolling along, only to suddenly stop and run the other way after noticing something. This is the first time we see a tiger being scared. This piques our interest, and we go to investigate what it was that scared the tiger. Fire. Now is the time we put our cognitive abilities to use. We are tired of running away from tigers daily, and now we see that fire scares the tiger. Can we use this information and our logic to make our lives easier? For example, can we make sure that we always have a source of fire handy so that we can scare away approaching tigers instead of running from them? Unfortunately, no. At least not without being logically incoherent and, as a result, taking a huge risk. Following is the structure of a potential logical proof which shows that ‘We can save ourselves using fire’ is true.
- Premise: The tiger is scared of fire
- Conclusion: Hence we can use fire tomorrow to scare away the tiger.
There is a huge logical flaw here because of which the premises (axioms) do not entail the conclusion (deduction). This is the flaw: If the tiger is scared of the fire now, how do we know that it will be scared tomorrow also? There is nothing we know to be true currently that lets us make this conclusion. To put it more precisely, the logic and the axiomatic system we have built up to this point does not have the capability yet to eliminate one of the two options we have before us: a) the tiger will be scared tomorrow, and b) the tiger will not be scared tomorrow. So, unfortunately, we have no other option but to carry on with our tiring routine. There is one thing though that we do in addition to following our routine. On Day 8, we carry a burning piece of wood with us. When we see the tiger approaching us, before we begin our daily sprint, we leave the flame where we are standing and then run away. As we run, we look back to see how the tiger behaves. Does it run straight in our direction like it has always done, or does it try to avoid the fire and take a longer path while pursuing us? We follow this new routine for 7 more days. And on every occasion, the tiger avoids the flame instead of leaping over it. Great! But what about now? Can we just use the flame as a defence and avoid running tomorrow? Still, no! Because the uncertainty of how the tiger will behave on the next day is still present. Can we eliminate one of the options available to us? Or are we doomed to a routine of running away from tigers for ever? As we know, our stone-age ancestors, at some point, stopped running away from predators. Once they had fire in their hands, they turned back and faced those beasts. And we also know that this worked out well for us. We wouldn’t have been here if that was not the case. But how did they know that it was true that the tiger would be afraid of the fire?
Axioms, Laws and Truths
Logical axioms and laws help add knowledge to logical systems. Let us say we add the axiom A to an axiomatic system S to make it S'. A potentially adds more true statements to the list of provably true statements in S. But A never undermines or challenges the truth values of statements already proven in S, provided A doesn’t break any of the laws of S or falsify/contradict any of the statements in S. S' is a superset of S. And the new true statements in S' are true in the same sense as those in S. This means that given the statements in S, along with A, you can use the laws of S to reach any other statement in S'. Axioms and laws enhance logic, but in slightly different ways. Axioms add arbitrary true statements to the system. Laws, on the other hand, add rules and constraints which dictate how truths can be handled in the system. This might be the reason why adding axioms feels like loosening the system, whereas adding laws feels like tightening it. In spite of this perceptual difference, laws and axioms serve a common purpose: to enforce and enhance the power of the principle of truth by elimination.
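The claim that S' is a superset of S can be illustrated with a toy sketch. It assumes a system whose only law is modus ponens over simple ‘if P then Q’ rules; the statements, the rule format and the function name `closure` are hypothetical choices made for illustration and are not part of the essay’s formal system.

```python
def closure(axioms, rules):
    """Return every statement derivable from `axioms` by modus ponens.

    `rules` is a set of (premise, conclusion) pairs, read as
    'if premise then conclusion'. This is a toy stand-in for the laws
    of a logical system, assumed here purely for illustration.
    """
    known = set(axioms)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical statements, chosen only for illustration.
rules = {
    ("tiger is near", "we are in danger"),
    ("we are in danger", "we should run"),
    ("fire is near", "we are warm"),
}

S = closure({"tiger is near"}, rules)
S_prime = closure({"tiger is near", "fire is near"}, rules)  # S plus axiom A

print(sorted(S_prime - S))  # statements that only the new axiom makes provable
print(S <= S_prime)         # whether everything proven in S survives in S'
```

In this sketch, adding an axiom can only grow the set of provable statements. The proviso in the text (that A must not contradict anything in S) is not modelled here, because the toy rules have no notion of negation.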
Axioms can also be used to define new entities in a logical system. These entities interact with the existing entities in the axiomatic system to create new statements. For example, we used axioms to define the language-entities ‘I’ and ‘exist’ before we were able to prove that ‘I exist’ is true.
Interestingly, there is a lesser known and under-appreciated way of enhancing logic. It is to embed a logic within another logic. Apparently this is a somewhat unexplored territory in the philosophy of logic. Today we tend to look at different logics as independent systems. But doing this is to our own detriment, as we will see. How do we embed logics within logics? By inventing truths.
Hierarchical Logical Multiverse
As we saw earlier, we hit a roadblock with classical logic. Even with multiple experiences of seeing tigers avoiding fire, we were not able to translate those experiences into classical logical truths. We just didn’t have the necessary tools. There is nothing objectively wrong about this situation per se. There is no law of nature which mandates that past experiences should lead to deducing truth values. But the way natural selection shaped us, we humans are organisms which learn from our experiences. We experience a subjective desire to learn from history. This desire has helped us survive and thrive. So, subjectively, it feels like there is knowledge to be derived from past experiences. If we wish to respect this subjective experience and take it into account while learning more truths about the universe around and within us, we will need new tools.
The law of perfect induction: If event E has occurred every time situation S arose in the past, then the statement ‘event E will occur again when the situation S arises’ is true.
We add the above law to our system to capture our subjective desire to learn from past experiences. It will not take long for us to realize that this law is not acceptable. If we adopted it, we could have concluded immediately after the first sighting of the tiger and the fire that the tiger would be afraid of the fire the next day. But we didn’t do this. We experimented. We recreated the situation multiple times, and as we were able to replicate the situation and the event consistently, our confidence grew. This indicates that the law has at least two practical flaws. First, to know whether event E will occur in a situation S, we need a record of at least one previous time when situation S occurred. To match situations accurately, we need to be able to read real-world situations to a level of precision beyond which further precision will not affect the occurrence of event E. Unfortunately, we do not live in such a universe. Even if we did live in such a universe, the second flaw remains. What constitutes the description of a situation S? Is it just the state of matter and patterns proximate to the location of event E? How proximate? Or does it constitute the complete state of the universe? What if the location of event E is different or is unknown? If we go into enough detail and precision, given the number of variables involved, it might be true that every situation occurs only once in a universe. Every future state of the universe differs at least slightly from previous states. In such a universe, this law is all but useless.
Thus, we reject the law of perfect induction. We need to design an axiomatic system which works in arbitrarily complex universes in which our perceptual capability of reading situations is limited. In such universes, we run experiments multiple times to learn from experiences, and so we need axioms and truths which capture our varying confidence in the truth value of a statement.
The axiom of real numbers: We define real numbers
With this axiom we have a particular kind of mathematics at our disposal.
The laws of fuzzy truth:
- A statement can have a fuzzy truth value.
- The fuzzy truth value of a statement is a real number between and including 0 and 1.
- The fuzzy truth value of all statements with no experimental evidence is 0.5.
With this set of laws, we have built a kind of fuzzy logic. In this system, as the fuzzy truth value of a statement approaches 0, our confidence that the statement is false increases. As it approaches 1, our confidence that it is true increases. If we are completely unsure of the truth value of a statement, its value is 0.5.
The law of imperfect induction: If event E has occurred in a set of situations S' each of which has a varying degrees of similarity to situation S, then the fuzzy truth value of the statement ‘event E will occur again in situation S’ increases with the degree of similarity between each of the situations in S' and S and the frequency with which the event E coincided with the situations in S'.
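One way to make this law concrete is a similarity-weighted frequency, sketched below. The specific formula, the `prior_weight` parameter and the similarity scores are assumptions introduced for illustration; the law itself only demands that the value grow with similarity and with the frequency of E, and many other formulas would satisfy it.

```python
def fuzzy_truth(observations, prior=0.5, prior_weight=1.0):
    """Similarity-weighted estimate of 'E will occur in situation S'.

    `observations` is a list of (similarity, occurred) pairs, where
    similarity in [0, 1] says how closely a past situation resembled S
    and `occurred` says whether E happened in it. With no observations
    the result is the prior, 0.5, as the third law of fuzzy truth
    requires. The weighting scheme is an illustrative assumption.
    """
    total = prior * prior_weight
    weight = prior_weight
    for similarity, occurred in observations:
        if not 0.0 <= similarity <= 1.0:
            raise ValueError("similarity must lie in [0, 1]")
        weight += similarity
        if occurred:
            total += similarity
    return total / weight


# Eight hypothetical sightings in which the tiger avoided the fire,
# each judged (arbitrarily) to be 0.9 similar to tomorrow's situation.
observations = []
history = [fuzzy_truth(observations)]
for _ in range(8):
    observations.append((0.9, True))
    history.append(fuzzy_truth(observations))

print([round(value, 3) for value in history])
```

Under these assumptions the value starts at 0.5 and climbs with every confirming observation without ever reaching 1, and a highly similar past situation moves it further than a loosely similar one.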
When you take this law and a few more consequences of this law into consideration and address a few open issues (like: if an event has never occurred, will it never occur? What is similarity?) and add a few more axioms to address these issues (Like probability, etc), we end up constructing the field of statistical inference. Deriving fuzzy truths using the axiomatic system of statistical inference gives us statistical truths. Add a few more axioms like that of Bayesian probability and we get a more refined version of statistical inference called Bayesian logic. Deriving fuzzy truth values within Bayesian logic gives us Bayesian truths.
Let us pause for a moment. This new axiomatic system seems amazingly powerful. It is able, in a complex and uncertain reality, to help us derive truths. Before Day 7, the classical logical truth value of “the tiger is afraid of fire” was unknown, and its fuzzy truth value was 0.5 (due to lack of experience). But as the days progressed, the fuzzy truth value moved up from 0.5 and started approaching 1. We have started gaining new knowledge where it was impossible before, thanks to the power of logic!
It is important, however, to note that though we gained new knowledge in the form of fuzzy truth values, the classical truth value remained unchanged. This means that we have learnt nothing new from the perspective of the original conception of ‘truth’ that we had. Is this a problem? There is no reason why this should be. We bypassed the limitations of the original simple logical system of classical logic by inventing a new truth and a new logic. Even though these new kinds of truths (fuzzy, statistical, Bayesian etc.) are different from our original conception, they are truths nevertheless and should be treated that way. They are just different ways of making something “acceptable” to us cognitive beings.
When we started on this pseudo chronological journey, we started by inventing a concept called ‘truth’ (more accurately, classical truth) and used logic to infer more such truths. The above exercise of building a new truth and a new kind of logic shows us that this action need not be a one-off activity. We can continue doing this and build completely new logical universes with multiple other logical systems (depending on the laws) and axiomatic systems (depending on the axioms) within it. The hierarchy of logical systems we saw earlier in Figure 1 all belonged to the same logical universe— the universe of classical logic. The basis of this logical universe was the concept called classical truth. A logical universe is based on a particular conception of truth which forms the foundation of that logical universe. Within a logical universe, we can add axioms which define new, different types of truths and rules of inference and these together create a new logical universe embedded within the parent universe. In this new universe, new knowledge is generated. This process can theoretically be never-ending.
The classical logic we have been talking about till now is more commonly (and more accurately) called “deductive logic” because we use the law of deduction to create knowledge. The fuzzy logic we have been talking about is more accurately called “inductive logic” because we use the law of (imperfect) induction to create knowledge.
The hierarchy of logical universes and inter-universe interaction
Contemporary wisdom does acknowledge the existence of multiple logical universes like deductive and inductive logics, but treats them as independent parallel universes. This essay tries to provide an alternative conception: that inductive logic is actually embedded within deductive logic. Bringing this idea into the common knowledge of the public sphere is the main motive of this essay.
We have seen how rules of deductive inference and classical logic are used to ascertain deductive truth values of statements. Similarly, rules of induction like the law of imperfect induction are used to ascertain inductive truth values of statements. But it is important to ponder how these two universes interact with each other. To start off, we need to decide if they need to interact with each other in the first place.
We can, by design, prevent the deductive and inductive universes from interacting with each other at all. In such a scenario, the whole of our knowledge would be divided into two unrelated clusters: statements with deductive truth values and inductive truth values. We will not be able to use the law of deduction on inductive truth values. Neither will we be able to use the law of induction on deductive truth values. Because of this, though a statement might have both a deductive and inductive truth value, there would be no correlation between the two. Also, a logical argument cannot have both inductive and deductive statements. Which means, even the set of all logical arguments is divided into two non-overlapping sets. There is nothing inherently wrong with such a design except for the fact that we can actually do better than this at generating new knowledge.
By completely isolating the two universes from each other’s influence, we fail to use the full power of these logics in conjunction. This will become clear when we consider the following example. We have a six-faced die 🎲 in our possession. We need to ascertain the truth value of what will show up when the die is cast. This kind of problem is out of reach of deductive logic. To use deductive logic to determine what number will show up, we would need to know the value of every relevant physical parameter of the universe which can affect the die’s final state, including every law of nature and every physiological property of the person or machine casting the die. Lack of such knowledge is what justified the invention of inductive logic. Inductive logic is well equipped to handle such a problem, but it needs the knowledge of past experiences to work with. Let us take one of the most advanced forms of inductive logic, Bayesian logic, as an example. The central axiom of Bayesian logic is called Bayes’ theorem:
P(A ∣ B) = P(B ∣ A) · P(A) / P(B)
where:
- A and B are events and P ( B ) ≠ 0.
- P ( A ∣ B ) is a conditional probability: the likelihood of event A occurring given that B is true.
- P ( B ∣ A ) is also a conditional probability: the likelihood of event B occurring given that A is true.
- P ( A ) and P ( B ) are the probabilities of observing A and B independently of each other; this is known as the marginal probability.
In the problem at hand we have to compute the probabilities of 6 events: the die showing 1, 2, 3, 4, 5 and 6. Let us call these events A1 through A6. B is the event of casting this particular die. Let us start with A1. We need to compute the probability of 1 showing up when we cast this die, i.e. P(A1|B). For this we need three values:
- P(B | A1) : What is the probability that we have cast the die given 1 has shown up? The answer is 1 by definition.
- P(B) : What is the probability that we cast the die? This is also 1 because we have already decided to cast the die and definitely will cast it.
- P(A1) : By the third law of fuzzy truth, “The fuzzy truth value of all statements with no experimental evidence is 0.5”. So P(A1) is 0.5
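Plugging these three values into Bayes’ theorem can be sketched in a few lines. The function name `bayes_posterior` is a hypothetical choice; the numbers are the ones listed above.

```python
def bayes_posterior(p_b_given_a, p_a, p_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B), with P(B) != 0."""
    if p_b == 0:
        raise ValueError("P(B) must be non-zero")
    return p_b_given_a * p_a / p_b


# Values taken from the list above: P(B|A1) = 1, P(A1) = 0.5, P(B) = 1.
p_a1_given_b = bayes_posterior(p_b_given_a=1.0, p_a=0.5, p_b=1.0)
print(p_a1_given_b)

# The same prior applied to all six faces (illustrative check).
total = sum(bayes_posterior(1.0, 0.5, 1.0) for _ in range(6))
print(total)
```

With these inputs each face gets the value 0.5, so the six values add up to 3 rather than 1. That tension is what the deductively chosen priors later in this section remove.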
The same analysis can be performed on A2 through A6 and we will get the same values. P(A2) = P(A3) = P(A4) = P(A5) = P(A6) = 0.5. This situation can be simulated by assuming that in the first six throws (imagined experiments) each number showed up 3 times, which is consistent with the above prior probabilities. Now we run the experiment 54 times, giving us a total sample size of 60 experiments. Below is how our knowledge of the inductive truth value of each of the events evolves.
As we can see, we learn over time that the probabilities of the events approach somewhere around 0.2. It takes around 25–30 experiments to approach this number. Now let us try something different. Let us try to use deductive logic to infer the prior probabilities (truth values) of the 6 events. Since the six events are mutually exclusive (no two events can overlap in a single throw) and collectively exhaustive (no other events are possible), we can deduce that the sum of the number of times each of the six events occurs should be equal to the total number of experiments. That means that in the imagined set of 6 experiments, assuming equal chances for each event, each number shows up once. This means that the prior probabilities would be P(A1) = P(A2) = P(A3) = P(A4) = P(A5) = P(A6) = 1/6 ≈ 0.1667. With this set of priors, the evolution of the truth values with the same experimental results above would look like this:
It is easy to see the difference. The values of the events are already around the point of convergence. Even though this is true in this case because the die is a fair one, it can be shown that even with an unfair die of unknown bias, the second set of priors would approach the truth faster on average. This was just a case of throwing a die to gain knowledge. In the real world, experiments can be much more costly to perform. If each experiment cost $100,000 and we planned to run 60 such experiments, choosing priors intelligently, we could have reached the correct answer 30 experiments earlier — saving 3 million dollars.
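The comparison between the two sets of priors can be reproduced with a short simulation. This is a sketch under stated assumptions: the estimate is taken to be the relative frequency over imagined plus real throws, as the imagined-experiments description suggests, and the die is simulated as fair with a fixed random seed. The function names are hypothetical, and the exact numbers depend on the simulated throws rather than on the essay’s original data.

```python
import random


def estimate(imagined_hits, imagined_throws, hits, throws):
    """Relative frequency over imagined plus real throws (assumed model)."""
    return (imagined_hits + hits) / (imagined_throws + throws)


def trajectory(throws, face, imagined_hits, imagined_throws=6):
    """Estimate for `face` after each real throw in `throws`."""
    values, hits = [], 0
    for n, outcome in enumerate(throws, start=1):
        hits += outcome == face
        values.append(estimate(imagined_hits, imagined_throws, hits, n))
    return values


rng = random.Random(0)  # fixed seed so the run is repeatable
throws = [rng.randint(1, 6) for _ in range(54)]  # simulated fair die

naive = trajectory(throws, face=1, imagined_hits=3)    # prior 3/6 = 0.5
deduced = trajectory(throws, face=1, imagined_hits=1)  # prior 1/6

for n in (6, 24, 54):
    print(n, round(naive[n - 1], 3), round(deduced[n - 1], 3))
```

If face 1 shows up exactly 9 times in the 54 real throws, the first set of priors gives (3 + 9) / 60 = 0.2 and the second gives (1 + 9) / 60 ≈ 0.1667, which matches the figures quoted in the text.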
Allowing logical universes to interact can help us gain knowledge more efficiently, and this might possibly be the only way to gain some kinds of knowledge. In the above case we allowed the rules of deductive inference to manipulate the truth value of inductive truths. The exact mechanics of how deductive laws affect inductive truth look something like this: To allow for cross-universe interaction, we can slightly alter the properties of inductive truth as follows:
- Those statements and only those statements whose deductive truth value is unknown can have an inductive truth value strictly in between 0 and 1.
- All deductively true statements have inductive truth value 1 and vice versa.
- All deductively false statements have inductive truth value 0 and vice versa.
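These three binding rules can be written down as a small consistency check. The representation is an assumption made for illustration: a deductive truth value is `True`, `False` or `None` (unknown), and an inductive truth value is a float in [0, 1]. The function name `is_consistent` is hypothetical.

```python
from typing import Optional


def is_consistent(deductive: Optional[bool], inductive: float) -> bool:
    """Check one statement's pair of truth values against the three rules.

    deductive: True, False, or None when the deductive value is unknown.
    inductive: the inductive (fuzzy) truth value of the same statement.
    """
    if not 0.0 <= inductive <= 1.0:
        return False
    if deductive is True:
        return inductive == 1.0
    if deductive is False:
        return inductive == 0.0
    # Deductively unknown: the value must lie strictly between 0 and 1.
    return 0.0 < inductive < 1.0


print(is_consistent(True, 1.0))   # deductively true, inductive value 1
print(is_consistent(None, 0.5))   # unknown, strictly between 0 and 1
print(is_consistent(None, 1.0))   # violates the 'vice versa' clause
```

The last call shows the ‘vice versa’ direction: an inductive value of exactly 1 is only permitted for a statement that is also deductively true.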
This new formulation binds the two truth values with a specific correlation and has the added perk of allowing statements which solely existed in the deductive universe to enter the inductive universe and take part in the Bayesian drama. Bayes’ theorem requires that events or statements have some numeric inductive truth value, and reformulating the definition of inductive truth lets all knowledge which existed before the inductive universe was created take part in this new universe via Bayes’ theorem. Allowing the embedding universe to interact with an embedded one seems to make intuitive sense: given that inductive logic itself is defined within the semantics of deductive logic, it seems natural to give rules of deduction the power to deal with inductive truth values even though they are different from deductive truth values.
The reverse, however, cannot be allowed. Rules of inductive inference cannot affect the deductive truth values of statements. To understand why such a restrictive law is needed, we will need much more advanced tools of logic and meta-logic which we do not have access to at this point. So, for now, we will have to make do with an over-simplified justification. Think of it this way: when deductive logic was built, we clearly defined the exact ways in which deductive truth values could be manipulated by the laws of that logical universe. The soundness of this universe and the veracity of the knowledge in it are predicated on the sanctity of these rules of inference. When we start building a new universe like the inductive universe, we have access to both the laws of deduction and induction, and we define exactly how these laws work on the newly defined truth (inductive truth). This cannot undermine the soundness of the underlying deductive universe. But if we give the newly minted rules of induction the power to change the truth values of the very statements which were used to build this new universe, the system could be completely destabilized (by contradictions which are hard to find or fix). Such a logical multiverse could itself amount to a huge multiversal logical fallacy leading to mind-numbing paradoxes. One of the consequences of allowing such systems can be seen in the ‘unexpected hanging paradox’. This essay has a detailed analysis of why the paradox arises and how to prevent/resolve it. However, to rigorously decide whether we should keep the flow of influence of logical universes strictly upward (from root universes to embedded universes, or in other words, from parent to child universes), we will need to develop a much deeper meta-logical understanding of logics.
But given that we don’t have the tools to make such a consequential decision, it would be wise to choose the safer option: not allowing child universes to affect the truth values of parent universes. This uni-directional nature of logical influence is what makes the multiverse a hierarchy. Without it, the multiverse would be a bunch of universes interacting with each other in all possible combinations, creating a hotchpotch of interdependent knowledge which could become really hard to make sense of.
But for now, it will suffice to say this: Logical universes like deductive logic and inductive logic need not exist in isolation. They can build on one another and can influence each other in specific ways. We will later see that establishing that such hierarchies of logical universes are both theoretically sound and practically desirable is not only an essential prerequisite for building a moral framework that helps organize groups of conscious cognitive beings, but also a fundamental necessity for any civilization of conscious beings trying to make sense of the reality they find themselves in.
Humans are beings with cognition. Cognitive beings have memories. Memories can store statements. Cognitive beings might have a desire to classify the statements in their memories into acceptable and unacceptable. True is the label we give to acceptable statements. False is the label we give to unacceptable statements. The sum of all the statements with their truth values is called the cumulative knowledge of the cognitive being. The act of assigning truth values to statements is called gaining knowledge. We can either gain knowledge by assigning truth values to statements individually and in an ad-hoc manner, or we can create a system which automates and expedites the process of gaining knowledge by using some statements about the correlation and relationships between other statements to generate more statements in accordance with some laws. Such a system is called logic. When building logics we need to make some arbitrary choices (arbitrary with respect to the logical system being built). This is called the principle of arbitrary choices. Different choices lead to different systems, and different systems have different abilities to generate knowledge. These choices come either in the form of laws or axioms. Every new law creates a new logical system. Every new axiom creates a new axiomatic system within the same logical system. Both axioms and laws serve the purpose of providing tools to enhance knowledge. They do this thanks to the principle of truth by elimination. The more sophisticated the system, the more knowledge can be generated. One such logical system is the classical variation of deductive logic. Deductive logic is built to manipulate the ‘deductive truth value’ of statements. Specific choices of laws (the laws of classical logic) lead to what we call ‘classical deductive logic’. This logic has its limitations. It cannot deal with the complexity and ambiguity of the real world.
No new axioms or laws added to this system can help us here, because the problem lies in the very nature of ‘deductive truth’ itself. To solve this, within this existing logical system, we use the laws of classical logic to invent a new kind of truth called ‘inductive truth’, which has completely different properties compared to deductive truth. For example, the inductive truth value of a statement is a real number lying between and including 0 and 1. So now statements can have two truth values: deductive and inductive. Deductive logic gains knowledge by deduction (called deductive inference). Inductive logic uses learnings from experience to induce the truth value of a statement from past information. This way, the inductive truth value of a statement evolves with experience (called inductive inference), as in Bayesian logic. Deductive logic and inductive logic are each a logical universe. The process of inventing inductive logic and embedding it within the semantics of deductive logic is an under-studied and under-appreciated way of knowledge-generation. This is the theory of hierarchical logical multiverse (HLM theory). The structure of a logical multiverse looks like this: There can be multiple logical universes. All logical universes except one (the root universe) are embedded inside exactly one other logical universe (their parent universe), making them child universes. Each logical universe has multiple logical systems within it which vary in their laws. Each logical system has multiple axiomatic systems within it which vary in their axioms.
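The structure just described is a tree, and it can be sketched as a small data model. The class names, field names and the `may_influence` method are hypothetical; they encode only what the text states, namely that every universe except the root has exactly one parent and that influence flows from parent to child and never the other way.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AxiomaticSystem:
    name: str
    axioms: List[str] = field(default_factory=list)


@dataclass
class LogicalSystem:
    name: str
    laws: List[str] = field(default_factory=list)
    axiomatic_systems: List[AxiomaticSystem] = field(default_factory=list)


@dataclass
class LogicalUniverse:
    name: str
    truth_concept: str
    parent: Optional["LogicalUniverse"] = None  # None only for the root
    systems: List[LogicalSystem] = field(default_factory=list)

    def ancestors(self):
        """Yield the parent, grandparent, and so on up to the root."""
        node = self.parent
        while node is not None:
            yield node
            node = node.parent

    def may_influence(self, other: "LogicalUniverse") -> bool:
        """True only if this universe is an ancestor of `other`."""
        return any(ancestor is self for ancestor in other.ancestors())


# The two universes discussed in this essay.
deductive = LogicalUniverse("deductive", "deductive truth")
inductive = LogicalUniverse("inductive", "inductive truth", parent=deductive)

print(deductive.may_influence(inductive))  # parent acting on child
print(inductive.may_influence(deductive))  # child acting on parent
```

Storing a single `parent` reference, rather than a list of parents, is what enforces the ‘exactly one parent’ rule in this sketch.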
This essay aims to understand and appreciate the process of building logics and of creating and embedding logical universes within other logical universes to create a hierarchical multiverse of logic. Having such an understanding forces us humans to view this universe in a specific kind of way. It also forces us to meta-view our point-of-view of this universe — to understand fundamental concepts like reality, cognition, knowledge, truth, logic etc. In later essays we will see how, not just the cumulative knowledge of one cognitive being, but the cumulative knowledge of the whole human species across time can be organized into such a logical multiverse. We will see how such a formulation of knowledge, truth and logic is a necessary step towards building a robust, effective and resilient moral framework for not just the human species, but for any group of conscious beings and arguably, any group of cognitive beings.
Though the motivation of this essay has been to build tools which help us understand morality better, it was rewarding to see, as I went deeper and deeper into these fundamental concepts, that the scope of this philosophical investigation goes much beyond morality itself. Many philosophical conundrums, ranging from the imagined (like philosophical and logical paradoxes) to the real (like quantum mechanics), can be either solved, dissolved or clarified by viewing them through the lens of the HLM theory.
It might come as a surprise to some that, in a series of essays on morality which started with a prologue justifying the necessity of building a generic moral framework, I followed up with the seemingly tangential topic of knowledge and logic. But one of the main initial insights which led me down the path of writing these essays was the realization that humanity’s inability to build an accepted moral framework cannot be blamed only on moral philosophers being wrong and disagreeing with each other about morality. It owes more to the predicament that there is no agreement even on basics, such as where the disagreement actually lies, and to a lack of realization that there might be fundamental flaws in the very fabric of the system inside which these discussions have been happening, flaws that lead to such an impasse. My understanding is that the fundamental fabric of the system is the knowledge we have and the logic we use to expand that knowledge. To find the flaws in this fabric, going deep into these topics is imperative if we are to have any hope of dealing with as complex and ambiguous a topic as morality (or of even understanding where the complexity arises from in the first place).
In this essay I wanted to put forward the HLM theory. I used deductive logic and the embedded inductive logic to show a working example of this theory. The examples of the tiger, the ghost, the fire, the die were all to illustrate how such a multiverse might function. But these two universes form a very tiny part of the whole multiverse. There are many logical flaws, over-simplifications and open-threads left in this essay. For example, though objective reality and subjective experience were touched upon at the beginning, they were not fully explored. This was done deliberately, to avoid over-complicating the explanation of the topic and over-lengthening this essay. Also, it is important to understand that the specific example given — inductive logic embedded within deductive logic — is just one application of HLM. This specific application might be flawed and there might be better ways of building a logical multiverse. But these flaws in the examples should not be treated as flaws in the underlying theory itself without further analysis.
At the beginning, I had decided that I would ground the content of this essay in the two kinds of logic I have known for a long time and have spent so much time thinking about. But as I was doing research, I encountered a kind of logic which I had never heard of before: abductive logic. This was a small shock and a minor disappointment. It gave me the feeling that I was writing about logic while knowing almost nothing about it. How could I be sure that I was making sense about logic when a whole branch of logic had been lying unseen in plain sight all these days, while I was under the illusion that induction and deduction were the main logical systems to be considered as humanity’s tools of knowledge generation? But after the initial shock, as I thought about it further, I realized that this is a great test for the theory I was proposing. If the theory has merit, it should be able to explain, ground and encompass this new tool of knowledge generation. If HLM is correct, abductive logic should be able to fit snugly somewhere in the multiverse without changing the nature of the deductive and inductive universes as we know them. This was an exciting challenge. But instead of waiting to complete this theoretical exercise and then including it in this essay, I decided in favour of leaving it out for the sake of brevity. That means that this is still an open question. And I leave it as an exercise to the interested reader to figure this out. Can a logical multiverse be built which encapsulates the whole of deductive, inductive and abductive logics without contradictions? Is abductive logic a new logical universe, or is it just a different way of representing one (or both) of the existing universes? If it is a new universe, where does it fit? Is it embedded within inductive logic? Or is inductive logic embedded within it? Or is it completely unrelated to inductive logic?
These are questions which you should try to figure out and see if and how the HLM theory answers these questions. In the worst-case, these questions might lead to disproving the theory. I, for one, am hopeful that HLM will stand this test. Though I do not have answers to all the questions yet, once I have them, I will write down my thoughts in a separate essay. Until then, let me know what your answer is!
This essay can be seen as a zoomed out view of the pre-existing system within which the moral dilemmas we find ourselves in on a day-to-day basis are formulated. In the next few essays we will zoom out further from this relatively local view of the multiverse which showed only two universes. We will build a completely new universe (different from deductive, inductive or abductive) from scratch and possibly break new philosophical ground. To do this, we will use similar tools and arguments as we did in this essay (for example: going on a pseudo chronological journey). We will see how this new universe interacts with the existing two universes and we will see why this new universe is necessary and why it is the ideal starting point for our investigations into morality.