Resolving the Scandal of Deduction Through a “Copernican Turn” Towards Cognitive Science.

Timothy Brown
Jul 29, 2023


In any deductively valid argument, the conclusions must already be implicit in the premises. From this, it follows that no new information is generated by deterministic computation in the form of logical operations, nor by deduction in general.¹ On the face of it, this widely accepted proposition, the “Scandal of Deduction,” seems ridiculous. As J.S. Mill put it, “a person must have made some advances in philosophy to believe it.”² Deductive arguments appear to carry information; they reduce our uncertainty. Data analysis software seems to tell us new things about our data, etc.

Prior attempts to analyze this disconnect have tended to focus on less popular semantic theories of information as a means of resolving the problem,³ or have introduced new formal distinctions such as surface versus depth information,⁴ hard versus soft information,⁵ virtual information,⁶ etc.

In this brief article, I will instead show how the problem can be explained entirely within the framework of Claude Shannon’s foundational theory of information. We can resolve the “Scandal” by recognizing that Shannon’s theory of communications has a Cartesian Homunculus hiding in plain sight.

Claude Shannon’s Mathematical Theory of Communication

Shannon’s theory shows how information flows from a source to a destination. A transmitter encodes the source’s message into a signal, and that signal is transmitted across a channel: the medium through which the signal passes on its way to a receiver (e.g., between a physical system and a measuring device). The receiver then “reconstructs the message from the signal,” making it available to a destination, “the person (or thing) for whom the message is intended.”⁷
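
As a concrete illustration, here is a minimal sketch in Python of a message passing through these components; the message, the bit-level encoding, and the noise probability are all invented for the example, not part of Shannon’s paper.

```python
import random

# Minimal sketch of Shannon's five-part communication system.
# The message, encoding, and noise probability are illustrative assumptions.

def source():
    """Information source: selects a message."""
    return "HELLO"

def transmitter(message):
    """Transmitter: encodes the message into a signal (here, bits)."""
    return [int(b) for ch in message.encode("ascii") for b in f"{ch:08b}"]

def channel(signal, flip_prob=0.01):
    """Channel: the medium; noise may flip each bit with some probability."""
    return [bit ^ 1 if random.random() < flip_prob else bit for bit in signal]

def receiver(signal):
    """Receiver: reconstructs the message from the signal."""
    chars = []
    for i in range(0, len(signal), 8):
        byte = signal[i:i + 8]
        chars.append(chr(int("".join(map(str, byte)), 2)))
    return "".join(chars)

def destination(message):
    """Destination: the person (or thing) for whom the message is intended."""
    print("Received:", message)

destination(receiver(channel(transmitter(source()))))
```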

Figure 1: Shannon’s schematic diagram of a general communication system (Shannon, 1948)

Resolving the Scandal of Deduction

The key to resolving the Scandal of Deduction is to realize that computation itself, and the activities of the human brain, both involve ongoing communication.⁸ We tend to think of the “destination” in Shannon’s theory as a unified entity, but this is only true in the abstract. We could appropriately think of the text of this post as the signal in Shannon’s model, the space between our eyes and the screen as the channel, and our eyes as the receiver, but this is not the whole story.

Our experience of sight is the result of an extraordinary number of neurons computing and communicating together. We could just as well treat the electrical activity traveling down the optic nerve, or the collective action potentials of some set of neurons in the visual cortex, as the “signal,” and various other areas of the brain as the “destination,” depending on how we choose to analyze the process of reading.

Indeed, the activities of a given neuron (or group of neurons) involved in reading a message could plausibly be mapped to any of the five components in Shannon’s model of communications (see Fig. 1), depending on how we choose to analyze the process. This is a key point we shall return to in later posts, as the fact that mapping physical phenomena onto different parts of Shannon’s model is an inherently subjective choice has major implications for the philosophy of information.

Likewise, in the archetypal example of a Turing Machine, there is communication after an input has been fed to the machine. The head of the machine has to read the tape. Nothing in the Turing Machine is a “destination” that receives the entire output at once. The process of understanding a message, even if pared down to the simple “understanding” of digital computers, is necessarily emergent. No single neuron understands a message, and no single logic gate does computation alone.⁹ In the same way, “memory” systems can be seen as a form of communication between past states of a system and its future self.¹⁰
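
To make this concrete, here is a minimal Turing machine sketch in Python (the states, symbols, and transition rules are invented for the example): at every step the head reads a single cell before it can write and move, so the input is processed through a sequence of local communications rather than being received anywhere whole.

```python
# A minimal Turing machine sketch: the head reads one cell at a time,
# so the input is never "received" anywhere as a whole.
# States, symbols, and the transition table are invented for illustration.

def run_turing_machine(tape):
    # transition table: (state, read_symbol) -> (write_symbol, move, next_state)
    rules = {
        ("scan", "0"): ("1", +1, "scan"),   # flip 0 -> 1, move right
        ("scan", "1"): ("0", +1, "scan"),   # flip 1 -> 0, move right
        ("scan", "_"): ("_", 0, "halt"),    # blank cell: stop
    }
    tape = list(tape) + ["_"]
    head, state = 0, "scan"
    while state != "halt":
        symbol = tape[head]                     # the head "reads" one cell
        write, move, state = rules[(state, symbol)]
        tape[head] = write                      # and "writes" one cell
        head += move
    return "".join(tape).rstrip("_")

print(run_turing_machine("10110"))  # -> "01001"
```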

The information entropy of an original message is not changed by computation; that entropy sets an upper bound on the reduction in uncertainty a recipient can obtain from that datum alone. However, the “Scandal” only exists if we assume that all the information required to generate “meaning” exists in the physical signal of some message. This is obviously false: rocks do not gain information about logical truths if we carve logical formulas into them, because meaning is not contained in external signals alone. (But neither is there good evidence to support the claim that meaning is necessarily somehow on the “mind side” of some sort of dualist divide; more on this will follow in a future article.)
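
In Shannon’s own terms, the point can be put compactly: modeling the message as a random variable X with distribution p, the entropy is the familiar sum below, and for any deterministic computation f the output f(X) can carry no more entropy than the input. This inequality is the formal core of the “Scandal.”

```latex
H(X) = -\sum_{x} p(x)\,\log_{2} p(x),
\qquad
H\bigl(f(X)\bigr) \le H(X) \quad \text{for any deterministic } f.
```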

A “paradoxical” staircase optical illusion. Which way the “stairs go up” is constructed by the sensory system.

To see why meaning cannot be contained within external signals, consider a program that randomly generates all possible pages of text using the keys on a keyboard. This program will eventually produce every page that will ever be written in English (plus a much larger share of gibberish). The pages of a paper on a cure for cancer published in a medical journal in the year 2123, the pages of a proof that P ≠ NP, a page accurately listing future winning lottery numbers, etc. are all outputs of this simple program.¹¹
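
A minimal sketch of such a generator (in Python; the page length and character set are arbitrary choices for illustration):

```python
import random
import string

# Minimal sketch of a "library of Babel" generator: run long enough, every
# possible page of this length will eventually appear. The page length and
# character set are arbitrary illustrative choices.
KEYS = string.ascii_letters + string.digits + string.punctuation + " \n"
PAGE_LENGTH = 3000  # roughly one printed page

def random_page():
    return "".join(random.choice(KEYS) for _ in range(PAGE_LENGTH))

print(random_page()[:80])  # almost certainly gibberish
```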

Would it make sense to mine the outputs of such a program, looking for a cure for cancer? Absolutely not. Not only is such an output unfathomably unlikely, but any paper that appears to describe a cure for cancer is highly unlikely to actually be useful. Why? Because there are far more ways to give coherent descriptions of false cures for cancer than there are descriptions of effective treatments, just as there are more ways to arrange the text in this article into gibberish than into English sentences.¹²
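
A rough count shows how lopsided the odds are (the 95-key character set and 3,000-character page length are illustrative assumptions):

```latex
\underbrace{95^{3000}}_{\text{possible pages}}
= 10^{\,3000 \cdot \log_{10} 95}
\approx 10^{5933}
```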

The point of our illustration is simply this: the outputs of such a program do not contain semantic information. The outputs of the program can only tell us about the randomization process at work in producing said outputs. Semantic information is constructed by the mind. The many definitions of information based on Shannon’s theory are essentially about physical correlations between outcomes of random variables. The text of War and Peace might have the same semantic content for us regardless of whether it is produced by a random text generator or by Leo Tolstoy, but the information theoretic and computational processes undergirding either message are entirely different. (This has some interesting implications for how to view literature produced by chat bot programs, which we shall explore in future articles.)

Why this solution is not just an appeal to psychologism

At first glance, this solution to the “Scandal” may not seem that different from other solutions that appeal to psychologism. Crucially, however, this solution is not susceptible to Hintikka’s criticism of semantic measures of information that are “not effectively calculable.” Our solution merely requires that we recognize the complex systems underlying the construction of semantic information from messages, and that we acknowledge that we “bring information to the table” in the process of understanding and judging. This solution simply states that semantic content should be analyzed using the information theoretic techniques already in use in cognitive neuroscience,¹³ rather than attempting to bypass this complexity with appeals to a direct formalization of semantic information simpliciter.

Crucially, this solution does not apply only to human beings’ processing of information. It applies outside “psychological” contexts. To see why, consider that, in many cases, it is straightforward to write a program that will eventually solve NP-hard problems (problems for which no efficient solving algorithm is known). For example, the traveling salesman problem, finding the shortest route through a finite set of points, can be “solved” by a brute force algorithm that enumerates every unique route, computes each route’s length, and then sorts them, outputting the shortest one. Such an algorithm, when paired with its input, entails its output with a probability of 100%.
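
A minimal brute-force sketch of such an algorithm (in Python; the city coordinates are invented for illustration):

```python
from itertools import permutations
from math import dist

# Brute-force traveling salesman sketch: enumerate every route, measure
# each, and return the shortest. Coordinates are invented for illustration.
cities = {"A": (0, 0), "B": (3, 4), "C": (6, 1), "D": (2, 7)}

def route_length(route):
    # total length of a closed tour visiting the cities in the given order
    return sum(dist(cities[a], cities[b])
               for a, b in zip(route, route[1:] + route[:1]))

def shortest_route(names):
    first, *rest = names                   # fix a start city; tours are cyclic
    tours = [(first,) + p for p in permutations(rest)]
    return min(tours, key=route_length)    # deterministic: input entails output

best = shortest_route(list(cities))
print(best, round(route_length(best), 2))
```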

Thus, it appears that no new information is generated during computation. However, just as obvious is the fact that our uncertainty about which route is the shortest is not reduced until the computation is finished, which could take arbitrarily long given a sufficient number of nodes (the number of candidate routes grows factorially).

Similarly, consider an efficient algorithm, EA, versus an inefficient algorithm, IA, where both find the same solution to a certain type of problem. For any given input (I), we would tend to say: EA(I) = IA(I) = O (the output).¹⁴ However, in important ways the two are not equivalent.
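
A toy comparison (in Python, with invented functions standing in for EA and IA): both return the same output O for every input I, yet one does a constant amount of work while the other’s work grows with the input, so the moment at which the answer actually becomes available differs enormously.

```python
# Two algorithms with identical input/output behavior but very different costs.
# Function names and the example problem are illustrative.

def sum_to_n_efficient(n):
    """EA: closed-form formula, a constant number of operations."""
    return n * (n + 1) // 2

def sum_to_n_inefficient(n):
    """IA: adds the numbers one by one, n operations."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

n = 10_000_000
assert sum_to_n_efficient(n) == sum_to_n_inefficient(n)  # same O for the same I
```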

Likewise, “4 + 4” and “20 − 12” are both expressions equal to 8, yet they are not the same computation. This is important to consider vis-à-vis claims that physical systems, or the universe itself, are “computers.”¹⁵ Indeed, the fact that simple equations seem to be identical with their outputs may simply be due to the way in which most of the computation involved in creating our conscious perceptions is hidden from us. In reality, simply seeing and recognizing “4+4,” “=,” and “8,” and the relationships between the symbols, likely involves millions of neurons working together, a point covered in detail in this post. (We shall have more to say about this in a later post about the implications of this solution to the Scandal of Deduction for the philosophy of mathematics and metaphysics).

Towards the Copernican Turn

Does the Philosophy of Information need a “Copernican Turn” like the one Kant forced on metaphysics?

Our argument against semantic theories of information is that, for all their formal complexity, they simply ignore the ways in which cognitive science says we actually process information from the environment. Semantic information cannot be easily defined in terms of “sets of possible worlds consistent with some message,” because the semantic information conveyed in a message varies by recipient.

When we understand messages, the initial signals we receive are combined with a fantastic amount of information stored in the brain before we become consciously aware of a meaning. The information that a signal is combined with in the brain varies by person, and it varies according to the amount of cognitive resources we are able to dedicate to understanding the message. “Understanding” is itself an active process that requires myriad additional communications between parts of the mind and the introduction of vast quantities of information not in the original signal.

This is why listening to someone read a passage while you are distracted versus meditating on/completing a “deep read” of that same passage can result in taking an entirely different meaning from the exact same signal. The semantic content of messages is thus determined not only at the individual level, but at the level of the individual instance of communication. Physical information processing cannot be abstracted away from essential elements of being.

The amount of resources we commit to understanding a message is dictated not only by conscious processes (i.e., executive function), but by unconscious processes. If we are angry, hungry, intoxicated, etc., the ways in which we understand are affected in ways that we are not aware of. Our interpretation of signals is also shaped unconsciously by social conditioning, norms, etc. So here, post-modernism has a point about how culture interacts with experience and meaning. This is not true simply of language, but of all of phenomenal existence, as all incoming sensory data is itself a “message.” Anyone who has learned a second language is aware that automatically imbuing messages with semantic content is itself a learned skill.

(Both cognitive science and continental philosophy have much to say about how unconscious processes shape perception and semantic meaning, a topic we shall return to in future posts.)

Further, messages do not exist as isolated entities in the real world; when we receive a message it is necessarily correlated with other things in the world. Indeed, Shannon entropy can only be calculated in terms of the possible outcomes of some random variable in a message, and knowledge of which outcomes are possible, and of their probability distribution, requires background information. This gets at the essentially subjective elements of information noted above vis-à-vis mapping physical systems to components of Shannon’s model of communications.
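
A small illustration (in Python; both probability models are invented): the same received string carries a different number of bits of surprisal depending on which background distribution the recipient assumes, so the figure is not fixed by the physical signal alone.

```python
from math import log2

# Surprisal of the same message under two different assumed source models.
# Both probability models are invented for illustration.

message = "AAAAB"

uniform_model  = {"A": 0.5, "B": 0.5}   # recipient assumes a fair source
informed_model = {"A": 0.9, "B": 0.1}   # recipient assumes "A" is far more common

def bits_in_message(msg, model):
    # total surprisal of the message under the assumed distribution
    return -sum(log2(model[ch]) for ch in msg)

print(bits_in_message(message, uniform_model))   # 5.0 bits
print(bits_in_message(message, informed_model))  # ~3.93 bits
```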

It is the combination of these two factors:

  • The way in which our natural faculties shape our interaction with, and conceptualization of, information; and
  • The essentially subjective nature of some elements of information,

…that, to my mind, suggests that the philosophy of information may need to undertake a “Copernican Turn” of the sort that Kant forced onto metaphysics. That is, the abandonment of the study of unknowable information-in-itself, replacing it with an inquiry into the world-of-appearances and the innate structures of both mind and information that determine the nature of phenomenal appearance.

More could be said here, and will be said in future articles, but that seems like enough to consider for now. Thank you for reading; I will try to respond to any comments.

Notes and Citations:

  1. For an early exploration of this “scandal of deduction,” see: Hintikka, Jaakko. “Information, Deduction, and the A Priori.” Noûs, Vol. 4, No. 2, pp. 135–152. Wiley. (1970) Link; For a more comprehensive overview of this problem through the lens of philosophy of information, see: Allo, Patrick. “The Logic of Information.” The Routledge Handbook of Philosophy of Information. Routledge, London, UK, pp. 59–61. (2016) Link
  2. Mill, J. S. A System of Logic, Ratiocinative and Inductive, 8th edn. New York: Harper & Brothers. (1882); Primiero, G. Information and Knowledge: A Constructive Type-Theoretical Approach. (2008). Via D’Agostino, Marcello. “The Philosophy of Mathematical Information.” The Routledge Handbook of the Philosophy of Information. (2016) Link
  3. For an analysis of several proposed solutions of this sort see: Bremer, Manuel E. “Do Logical Truths Carry Information?” Minds and Machines volume 13, pp. 567–575. (2003). Link
  4. Hintikka, Jaakko. “Surface Information and Depth Information.” Information and Inference. Springer., pp. 263–297. (1970) Link
  5. Allo, Patrick. “Hard and Soft Logical Information.” Journal of Logic and Computation, Volume 27, Issue 8., pp. 2505–2524. (2017) Link
  6. D’Agostino, Marcello. “The Philosophy of Mathematical Information.” The Routledge Handbook of the Philosophy of Information. (2016) Link; and Floridi, Luciano & D’Agostino, Marcello. “The enduring scandal of deduction: Is propositional logic really uninformative?” Synthese, Vol. 167, pp. 271–315. (2009) Link
  7. Shannon, Claude. “A Mathematical Theory of Communication” The Bell System Technical Journal. Vol. 27., pp. 379–423 (1948) Link
  8. As Paul Broderick notes in “On Communications and Computation,” computation and communications are “often not conceptually distinguishable.” Broderick, Paul Bohan. “On Communication and Computation” Minds and Machines volume 14, pp. 1–19 (2004) Link
  9. To be sure, a lone logic gate can encode a yes/no message, but something always needs to measure this state for it to have relevance in any outside context. When information is defined by correlation, it is essentially relational.
  10. Even Markov processes may possess a sort of “implicit” memory in this sense, via recursion.
  11. This is why Kolmogorov Complexity must be understood as the shortest description of an entity and only that entity. Otherwise, the shortest description of most objects would simply be an algorithm that combinatorially churns through all possible binary strings of finite length.
  12. This of course assumes that you don’t already think this paper is gibberish, a bit of a conceit on my part ;-)
  13. For examples see: A Tutorial for Information Theory in Neuroscience | eNeuro
  14. Thus, mathematical objects are distinct from any individual encoding of said object, another avenue for discussion. See: Barry Mazur’s “When is one thing equal to some other thing?”
  15. For examples see: Paul Davies’ “Information and the Nature of Reality;” Vlatko Vedral’s “Decoding Reality;” David Deutsch’s interview “Is the Cosmos a Computer?;” or Max Tegmark’s “Our Mathematical Universe.”
