Why universality trumps IQ
Bottomline: Universality fairly easily leads to the conclusion that humans anywhere out of the left tail are fundamentally the same, mentally speaking.
Even if you lived under a rock, you have probably heard about the epic attacks on the so-called “intelligence quotient” (IQ) by the inimitable Nassim Nicholas Taleb. Joshua P. Hochschild has a very good summary of the entire affair so far.
I later sat down and thought about the whole question independently from a computational point of view, separate from probability or complexity. I quickly realized that this thing called IQ, whatever it is, doesn’t matter, and furiously wrote a Twitter thread on my thoughts in a fit of inspiration. I knew I was on the right track when anonymous racialist geniuses started attacking with zero substance. This essay is an elaboration of that thread.
In computation, universality simply means a process that can simulate all processes — including itself. By simulation, we mean copying the behavior of a process to as much fidelity as we would like. At some point, if it looks like a duck, quacks like a duck, and walks like a duck, we stop, and consider it a duck for all practical purposes. (There, I wrapped the Turing test for artificial general intelligence in a nutshell for you.) Replace “processes” with “machines,” and you roughly see how computers work: a universal machine is a machine that can simulate all machines, including itself. You can think of a machine simply as a process that transforms an input to an output following a fixed set of rules.
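To make the definition concrete, here is a minimal sketch in Python of one fixed simulator that runs any single-tape machine handed to it as data. The names (run_machine, INCREMENT, FLIP), the rule format, and the step budget are all assumptions made for illustration; none of them come from Turing's own notation.

```python
from collections import defaultdict

def run_machine(rules, tape, state="start", blank="_", max_steps=10_000):
    """Simulate a single-tape machine described entirely by `rules`.

    rules maps (state, symbol) -> (new_symbol, move, new_state), where
    move is -1 (left) or +1 (right). The machine stops in state "halt".
    """
    cells = defaultdict(lambda: blank, enumerate(tape))
    head, steps = 0, 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted (illustrative safety limit)")
        new_symbol, move, state = rules[(state, cells[head])]
        cells[head] = new_symbol
        head += move
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells[i] for i in range(lo, hi + 1)).strip(blank)

# Hypothetical machine 1: add one to a binary number.
INCREMENT = {
    ("start", "0"): ("0", +1, "start"),   # scan right to the end of the number
    ("start", "1"): ("1", +1, "start"),
    ("start", "_"): ("_", -1, "carry"),   # step back onto the last digit
    ("carry", "1"): ("0", -1, "carry"),   # 1 + 1 = 0, carry moves left
    ("carry", "0"): ("1", -1, "halt"),
    ("carry", "_"): ("1", -1, "halt"),    # overflow: write a new leading 1
}

# Hypothetical machine 2: flip every bit.
FLIP = {
    ("start", "0"): ("1", +1, "start"),
    ("start", "1"): ("0", +1, "start"),
    ("start", "_"): ("_", +1, "halt"),
}

print(run_machine(INCREMENT, "1011"))
print(run_machine(FLIP, "1011"))
```

The simulator never changes; only the table of rules does. That is the whole trick: the "other machine" is just input to the one machine you already have.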
So, this is why iPhones and Androids, or laptops and supercomputers, can essentially run the same software, despite their superficial differences in hardware. This is also why you can simulate Windows inside a virtual machine on your Mac, play old video games from Atari on Intel, or mine Bitcoin on a half-a-century-old IBM mainframe. (Do you see which part of the definition of universality virtual machines naturally arise from?)
Alan Turing hit upon this critical observation when he designed his machines, which are a precursor to the computer you are reading this on today. When he first devised what we now call Turing machines, he started with the simplest thing to do: one machine to compute, say, one specific real number you cared about, and this is all that machine could ever do. (Real numbers include integers, fractions, and irrational numbers.) The problem is that there is a countably infinite number of such machines, one for every computable real number. (Countably infinite just means as big as all of the counting numbers you learned in kindergarten.) He quickly realized that if we needed a different smartphone for every app we wanted to run, they wouldn’t be, well, so smart at all, and Apple certainly wouldn’t be as rich as they are now. (I remember when I was a kid, we had different machines for watching movies, listening to music, and playing video games. Now, we have machines that can do all of that, and make espresso for you. Try traveling back in time and telling that to kids in the 80s.) So, being the genius that he was, he also saw the “simple” solution to the problem: one machine to do what any other machine, including itself, can do, so long as it had all the space and time in the world. This is a really great idea in so many ways, which is why I believe computers wouldn’t be in production today were it not for how practical Turing was: he fixed his own bicycles, you know.
Really, you can think of a universal machine as this guy:
This is basically what we mean by programming a computer. We are giving a really dumb but universal machine a tedious and explicit series of instructions — software — by which it can now pretend to be another machine. (You see how software really just encodes some hardware now?) The guy, or equivalently, the computer, may not know how to, say, defuse a bomb, but you just tell him precisely what to do. The great Richard Feynman has a brilliant analogy for how universality works in terms of how fast but dumb clerks who know how to only, say, add numbers can yet simulate slow but smart clerks who know how to add and multiply them. (No offense to clerks everywhere.) An interesting exercise is to find the smallest number of instructions (e.g., move data from one place to another) that are Turing-complete; i.e., you can use them to build a universal machine.
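One commonly cited answer to that exercise is a single instruction, usually called SUBLEQ ("subtract and branch if less than or equal to zero"), which is Turing-complete given unbounded memory. The interpreter below is a minimal sketch: the function name run_subleq, the convention that a negative jump target means "halt," and the memory layout of the sample program are all assumptions chosen for illustration.

```python
def run_subleq(memory, max_steps=10_000):
    """Run a SUBLEQ program stored in `memory` (code and data share one list).

    Each instruction is three cells a, b, c:
        mem[b] -= mem[a]; jump to c if the result is <= 0, else fall through.
    A negative program counter halts the machine (an illustrative convention).
    """
    mem = list(memory)
    pc, steps = 0, 0
    while pc >= 0:
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted (illustrative safety limit)")
        a, b, c = mem[pc:pc + 3]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3
        steps += 1
    return mem

# Hypothetical program: Y = Y + X using nothing but SUBLEQ.
# Cells 9, 10, 11 hold X, Y, and a scratch cell Z that starts at 0.
X, Y, Z = 9, 10, 11
ADD = [
    X, Z, 3,     # Z = Z - X        (Z becomes -X)
    Z, Y, 6,     # Y = Y - Z        (Y becomes Y + X)
    Z, Z, -1,    # Z = Z - Z = 0, which is <= 0, so jump to -1 and halt
    7, 35, 0,    # data: X = 7, Y = 35, Z = 0
]

print(run_subleq(ADD)[Y])
```

Addition, copying, jumping, and comparison all have to be spelled out as sequences of this one operation, which is exactly the "fast but dumb clerk" situation: tedious, but nothing is out of reach.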
You can also think of universality in terms of translation between different languages. You might have heard of various programming languages such as C, Python, JavaScript, Go, and so on. On the surface, they all seem very different. However, while some programming languages are nicer than others for expressing some things (for roughly the same reason some types of cursing feel more satisfactory in some human languages than others), there is no language that is computationally more powerful than another. In other words, there is no programming language that can express something that another cannot. Every useful programming language is universal. Why? Well, think of the universal machine as a translator or interpreter. You can always take a Python program and translate it into an equivalent Go program, and vice versa. This is a useful exercise for the programmers among you.
Universality may sound simple and obvious in hindsight, yet it is anything but. Roughly stated, the Church-Turing thesis postulates that any type of computer you can come up with is computationally no more powerful than those slow, clunky, and deterministic Turing machines. That is, the Turing machine can do everything the other computer can do. (This is what Turing and Alonzo Church found out about the lambda calculus, an independently developed and much less intuitive model of computation that inspired the LISP family of programming languages.) Note that we don’t care about efficiency here, or how long the Turing machine takes to compute the same thing, although there are variants of the thesis that do care about it.
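To get a feel for how strange the lambda calculus looks next to a Turing machine, here is a small sketch of Church numerals written with Python lambdas. In this encoding a number n is "apply a function n times," and arithmetic is built from nothing but function application. The names (zero, succ, add, mul, to_int) are illustrative choices, and to_int is only a convenience for reading the result back as an ordinary integer.

```python
# Church numerals: the number n is the function that applies f to x n times.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))

# Addition: apply f n times, then m more times.
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

# Multiplication: repeat "apply f n times" m times.
mul = lambda m: lambda n: lambda f: m(n(f))

def to_int(n):
    """Decode a Church numeral by counting how many times it applies f."""
    return n(lambda k: k + 1)(0)

two = succ(succ(zero))
three = succ(two)

print(to_int(add(two)(three)))
print(to_int(mul(two)(three)))
```

There are no tapes, states, or heads anywhere in this model, yet it computes the same things, which is the point of the equivalence described above.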
The Church-Turing thesis does not have a proof, and may not even be provable, but it could in principle be disproved by a single working counterexample. Nevertheless, it has been nearly a century since it was formulated, and such a counterexample has yet to be found. Even though quantum computers are suspected to be faster for some problems (e.g., factoring numbers, which can break some types of cryptography), they are not thought to be more powerful in the sense above. I don’t think anybody really knows why we can’t seem to do any better right now. It could be a temporary limitation due to gaps in knowledge or technology, or, more intriguingly, a fundamental limit imposed by Nature.
Universality is exactly the reason why, believe it or not, the incredibly simple-looking Rule 110 1D cellular automaton is just as powerful as the computer you are reading this on. (Don’t worry about what a cellular automaton means: it’s just a different type of computer: something dead simple, and yet endlessly complex. It is also the first known type of machine designed to be self-reproducible.)
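For the curious, the whole of Rule 110 fits in a few lines. The sketch below is a minimal Python version: each cell looks at itself and its two neighbours, and the eight bits of the number 110 say what the cell becomes. The wrap-around edges, the row width, and the names step and render are assumptions made to keep the example small; the universality result itself concerns an unbounded row of cells.

```python
def step(cells, rule=110):
    """Apply one generation of an elementary cellular automaton.

    The neighbourhood (left, centre, right) is read as a 3-bit number,
    and that bit of `rule` gives the cell's next value. Edges wrap around
    (an assumption made here so the row can stay finite).
    """
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

def render(cells):
    """Draw live cells as '#' and dead cells as '.'."""
    return "".join("#" if c else "." for c in cells)

if __name__ == "__main__":
    row = [0] * 31 + [1]        # a single live cell at the right edge
    for _ in range(16):
        print(render(row))
        row = step(row)
```

Running it prints a triangle of structure growing leftward from a single cell, which is about as much machinery as the "dead simple" description promises.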
Stephen Wolfram has spent an entire lifetime studying computation in the form of cellular automata. Based on his studies, he has formulated what he calls the Principle of Computational Equivalence, which, at first sight, sounds simply like a restatement of universality (which is what I originally thought):
“The key unifying idea that has allowed me to formulate the Principle of Computational Equivalence is a simple but immensely powerful one: that all processes, whether they are produced by human effort or occur spontaneously in nature, can be viewed as computations.”
That is a very deep implication of universality, by the way. If every physical process in Nature, including human thinking, can be viewed as a computation, then universality means that there is a physical process for simulating any physical process you like, including itself. Computation is really a formalization of mathematics, a way of mathematicizing mathematics itself. So, in hindsight, it is not surprising that mathematics is effective at all at describing Nature. This conjecture answers Eugene Wigner’s question on “the unreasonable effectiveness of mathematics in the natural sciences.” Nature has a built-in universal mirror or chameleon, if you like, that can copy anything you throw at it, even itself. If you think about it, universality is the reason why we can begin to have intelligent life at all. (No, this does not mean that it is trivial to reverse-engineer a program that simulates some object of desire, but that is another story for another day. Also, even if given the program, predicting what it will do is tricky business, as we will see later.)
But, actually, Wolfram goes much deeper than that:
“Because what this principle says is that all programs whose behavior is not obviously simple are actually equivalent in the sophistication of the computations they do. It doesn’t matter if your rules are very simple or very complicated: there’s no difference in the sophistication of the computations that get done.”
In hindsight, this does not sound surprising. We find universality almost everywhere we look in Nature: DNA, Minecraft, PowerPoint, billiard balls, you name it. Some people say that these things are “accidentally” Turing-complete. But it took a genius like Wolfram to notice that it cannot possibly be a mere accident that it is almost trivial to “accidentally” grow universality.
Do you see now why universality trumps IQ?
What Wolfram is saying is that every system that displays non-trivial behaviour, such as the human mind, is equivalent in computational power. In other words, human minds are universal. Every human mind is capable of running exactly the same computations. This means that we are all capable of understanding the same thoughts, emotions, and ideas. And the key to understanding is communication.
People are clearly not the same: we are very different a lot of the time. Yet, we can be the same sometimes, especially when communicating what we mean, whether through the spoken or written word. Synchronicity. And this is all that matters.
Think about it: if the human species depended on exceptional geniuses who nevertheless could never communicate their exceptional thoughts to another human being, then either they would be intellectual con-artists (like postmodernist “philosophers”), or we would have been doomed a long time ago. Although a few critical individuals clearly hit upon the right ideas at the right place at the right time, many other individuals need to be able to independently verify and improve upon these ideas. The real intelligence lies in human cooperation. There is no such thing as an exponentially smarter human being for the same reason that there is no such thing as an exponentially taller human being. A genius who cannot communicate his thoughts to another human is, in fact, not a genius!
Jed Trott asked me whether I thought anyone could write Handel’s Messiah. My answer is “yes,” for the following reasons.
First, if you “simply” enumerate all possible musical compositions, as in the thought experiment where enough monkeys bang away at enough typewriters, then you would eventually get the Messiah the same way you would get the collected works of Shakespeare. (Does this remind you of the Library of Babel?) Time reveals everything. The practical difference between Handel and blind enumeration or randomness is efficiency: Handel is much more likely than monkeys on typewriters to produce the Messiah, and much more quickly, because he has more practice, experience, knowledge, and talent, most of which can be cultivated. Enough with this genius-worship: geniuses are not some magical unicorns who flatulate one-off miracles. I strongly believe that there are people who never try to achieve greater things because they are told that they are not like geniuses.
The second, perhaps less facetious reason is that there is such a thing as convergent evolution, both in evolution by natural selection, and in human thinking. Have you ever had a “new” idea in the shower, only to discover to your disappointment later that someone else (perhaps your rival) has already thought about it? There are enough examples of this in history: from Newton and Leibniz to Darwin and Wallace to Einstein and Hilbert, just to name a few. There tend to be many paths to the same destination, and it will eventually be found, especially if the right atmosphere is in the air, it is a profitable endeavour, and there is enough competition to get there.
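The blind-enumeration argument can be sketched in a few lines of Python. The toy below treats a "composition" as a string of note names and simply lists every string, shortest first, until the target turns up. The function name enumerate_until, the three-note alphabet, and the target melody are all stand-ins invented for illustration; a real score is of course vastly longer.

```python
from itertools import count, product

def enumerate_until(target, alphabet):
    """Count how many candidates blind enumeration tries before hitting `target`.

    Candidates are generated shortest first, and in alphabet order within
    each length, so every finite string over `alphabet` is eventually reached.
    """
    if not target or any(ch not in alphabet for ch in target):
        raise ValueError("target must be a non-empty string over the alphabet")
    tries = 0
    for length in count(1):
        for candidate in product(alphabet, repeat=length):
            tries += 1
            if "".join(candidate) == target:
                return tries

# A hypothetical three-note "melody" over a three-note alphabet.
print(enumerate_until("EDC", "CDE"))
```

The search always succeeds, but with k symbols and a target of length n it can take up to k + k^2 + ... + k^n tries, which is why the difference between Handel and the monkeys is one of efficiency and not of possibility.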
Back to the importance of communication. If you still have a hard time believing that everyone is fundamentally capable of understanding the same ideas, let me ask you: is there an idea that one human being is capable of understanding that another human being (barring the unfortunate left tail in mental capability) can fundamentally never understand, no matter how much time it took? I am not talking about a deliberate refusal to try to understand. I mean something you cannot even begin to translate: the idea is literally stuck in your head, and it could never begin to make sense to someone else, no matter how much time it took. If you believe this to be the case, then there should be partitions of humanity that can literally never talk to each other, but this is not what we see.
(The bored reader may skip this little, technical digression.)
A short quiz: why is prediction generally impossible?
That’s right: prediction is generally undecidable, and hence impossible, because otherwise the halting problem would be solvable, leading to a self-referential contradiction.
The halting problem sounds dangerously simple: given any computer program and its input, could another computer program decide whether it would ever stop computing? If you have ever waited for that spinning-beach-ball-of-death on the Mac to, well, stop spinning, you know exactly what I’m talking about.
As it happens, it is the first undecidable problem Turing ran into when he designed computers in a systematic and rigorous fashion, back before anyone had ever seen a computer. (Believe it or not, Turing was trying to solve what was an open problem in mathematical philosophy, but that’s a different story for another day.) Undecidability simply means that computers that we know how to build can never solve the problem, not even in principle, no matter how much space, time, and money we have. Exactly why the halting problem is undecidable is a bit of a long story, but suffice it to say that if a computer could solve this problem, then you would be able to build another computer that halts if and only if it doesn’t halt! (Some of you may have seen this problem elsewhere.) So, unless you believe we live in the kind of Reality that permits the construction of living, breathing paradoxes, you would need a more powerful type of computer that we are not even sure could be built in order to solve the halting problem for the computers that we do know how to build… but guess what? Those computers wouldn’t be able to solve their own halting problem for precisely the same reason! And so you would need what I like to call “a transfinite hierarchy of turtles all the way up.” A generalization of this result, called Rice’s Theorem, explains why there can never be such a thing as, say, the perfect antivirus software. (Anyone who tells you otherwise is a snake oil salesman.)
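The "halts if and only if it doesn’t halt" construction can be sketched directly. In the Python below, halts stands for any candidate halting-decider someone claims to have written; it is hypothetical by necessity, since no correct one can exist. The helper names make_paradox and refute are illustrative, and the sketch assumes the candidate is deterministic, so that asking it the same question twice gives the same answer.

```python
def make_paradox(halts):
    """Build a program that does the opposite of whatever `halts` predicts for it."""
    def paradox():
        if halts(paradox):      # the decider says "this program halts"...
            while True:         # ...so loop forever instead
                pass
        return "halted"         # the decider says "loops forever", so stop at once
    return paradox

def refute(halts):
    """Show that the candidate decider is wrong about its own paradox program."""
    paradox = make_paradox(halts)
    if halts(paradox):
        # Running paradox() would never return, so we only report the mismatch.
        return "predicted halt, but paradox loops forever"
    paradox()                   # safe to run: it returns immediately
    return "predicted loop, but paradox halted"

# Two hypothetical (and obviously naive) candidate deciders.
print(refute(lambda program: True))
print(refute(lambda program: False))
```

Whatever cleverness goes into the candidate, the same two branches apply: either answer it gives about its own paradox program is the wrong one.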
But what does undecidability have to do with prediction? Well, the great Gregory Chaitin observed that, since you cannot predict whether a Turing machine would ever halt or not, you cannot predict what the next output from a Turing machine would be (and so you cannot actually use diagonalization to compute an uncomputable real number, and build a living contradiction, but never mind what that means here).
You see, computation is more fundamental than probability or complexity. To make the prediction business worse, there’s something else called computational irreducibility (related to but not the same as intractability), which simply means that there are generally no shortcuts to answers even if you could compute them, but let’s not even get there right now.
So, if we cannot generally predict what relatively “simple” sequences of numbers might do, what makes us think we can predict what vastly more complicated and complex human beings would do? This is yet another nail in the coffin for IQ.
These are some questions people have asked me. I might expand this over time as I field more questions.
Q: “Are you saying I’m just a computer?”
A: No. Humans are not necessarily computers, but there is no good reason yet to think that we are not computable. If we believe that all of Reality is computable (that is, capable of being understood fundamentally as processes that follow rules, whether simple or complicated), then why not human beings? The fact that complex phenomena cannot be understood from the point of view of naïve reductionism, because there are layers of emergence that cannot be understood as a simple linear sum of their parts, doesn’t mean that the conjecture that everything is made of atoms is untrue. Remember Wolfram’s findings: endlessly complex things can emerge from a few extremely simple rules. There is no reason to think that we are not material like galaxies, clouds, or fish, although of course, qualitatively, we are very much unlike them.
This question always sounds to me dangerously close to those who find evolution by natural selection hard to swallow on the basis of, “Good sir, are you suggesting that I descended from but mere apes?” Exactly how does it denigrate us to suggest that we emerge but from atoms? As Feynman said on the subject, does a physicist see less or more beauty than a poet?
Never mind that there’s a whole bunch of misconceptions about how computation is just “mechanical,” rigid, uncreative, and dumb, completely unlike human thinking. Turing designed computers so as to capture human thinking, especially that of mathematicians, who seem to think that their enterprise is uncomputable, but tend to miss that the same logical limits that apply to computers should also apply to them, for the same reason that gravity affects everything and everyone.
Look, forget about whether we’re more powerful than Turing machines or not. The real question is: are we universal or not? Universality works on all types of machines, for the same reason that it works on Turing machines.
Q: “Are you saying that everyone is the same?”
A: No. People clearly should and do think differently. We are not ants. However, what I’m trying to say is that we can understand the same ideas, even if we think differently. And do you see that we do this by explaining our thinking to each other?
So, why doesn’t everyone think the same? Think of Nature as using individuals to perform a gigantic distributed computation that spans over the oceans and generations. Barring some ultimately inconsequential differences in hardware, we are all universal, because basically Nature cannot help it anyway (see the Principle of Computational Equivalence above). The much more powerful variable is software. Different people solve different problems. Nature wouldn’t have bothered with individuals if individuality didn’t matter. However, our species would have failed unless we could explain our findings to each other.
Jaffer Ali: “Our ability to understand each other is not complete. There is no perfect transmission of information, knowledge or wisdom.”
A: Yes, noise is everywhere, including in explanation and interpretation.
Two things:
- You don’t need perfect transmission to get the idea across.
- Error-correction is key (and we have higher-fidelity media now like the Internet).
- Errors are important sources of mutations.
(See how you understood me despite a deliberate, small typo above? QED.)
Q: “What about processing speed, memory capacity, and so on?”
A: Ultimately inconsequential due to universality. The gods gave us the ultimate equalizers — pen, paper, and computers — for a reason.
Q: “What about the mentally challenged?”
A: It seems to me that these poor souls have higher IQ scores than anyone who seriously asks this question. In any case, as Nassim has pointed out, it is the dharma of the strong to protect the weak.
Bottomline: universality fairly easily leads to the conclusion that humans anywhere out of the left tail are fundamentally the same, mentally speaking.
If there is anything that we can learn from universality, it is the vastly underrated importance of teaching. Teaching is a way to “program” someone else to understand what you mean, and go further. There is no one correct, best way to learn something, but this is not how we are generally taught all the way from kindergarten to doctorate diplomas. Curiosity is stifled, if not outright butchered, the higher up we go. Students may be punished for asking questions, or answering out-of-the-box (see Chapter 17 for a happy exception). I do not blame all teachers; some of them are really trying hard, but the system is broken for reasons that will take another essay.
The creative differences between individuals are precisely why you cannot use standardized tests, exams, and metrics (least of all IQ, whose statistical validity Nassim has thoroughly busted) to measure people. Some students gabish fast, some students gabish slow. If students gabish slowly, consider that the ruler should be measuring the teacher instead of the student. In my book, you are a genius if you can teach something to someone who is the diametric opposite of you and has never seen it before.
As an individual, what matters much, much more than your alleged IQ is what you do with your precious, limited time on Earth. Remember, universality says that we are all capable of exactly the same ideas. That is why even differences in human languages don’t really matter. (Whether or not the Sapir-Whorf hypothesis is true, we get for free the result that it is ultimately irrelevant.) Remember, the insidious thing about IQ, as Nassim astutely observed with his owl eye, is that there are people who fancy “their” people genetically smarter than yours, and only want to “help” you. (They are often the same people who like to mistakenly think that the “West” discovered all civilization, and that the “West” is Nordic / North Atlantic / North Europe.) At best, they are overeducated idiots; at worst, they are racialists. No matter what anyone tells you, you can learn about anything you like. Go out, and find out what you are good at, what Nature put you here to discover, and teach the rest of us.
So, who should care about IQ? Nobody! Why? Because we are universal!
June 7 2020: There is a followup to this essay.
Acknowledgements: thanks to Brendan Dolan-Gavitt, Lorenzo Villa, Sean McClure, and Robert J. Frey for finding mistakes.