On Memory, Computation and Learning

Sai Sasank
Published in Research Escapades · 7 min read · Sep 29, 2020

We don’t have a widely accepted definition of intelligence. Nevertheless, for the sake of discussion, let’s adopt an inclusive and informal description of intelligence: the ability to accomplish complex goals. There are no limits to what these goals can be. This definition is inclusive in the sense that it doesn’t limit us to biological organisms, and it doesn’t require consciousness.

Memory, computation, and learning are necessary aspects of intelligence. One might argue that other abilities are also necessary; however, I claim that these three are sufficient. To justify briefly: a system must have memory to retain information at all. It must be able to compute, using that memory, to derive and remember useful things. Finally, for any system to adapt and improve, it must be able to learn. Fundamentally, these capabilities are necessary and sufficient to give rise to intelligent behaviour.

Let us see how each of these phenomena, memory, computation, and learning, can be realized in different ways.

First comes memory

Let us consider a few memory devices we use in our lives: hard disks, books, maps, and so on. To recall the information these devices have stored, we “observe” the state of these devices. In primitive memory devices like books and maps, we observe the letters, lines, and perhaps colours, and see how they are used together to convey ideas. In more complex engineered devices, observing the state is usually more involved.

Practically usable memory devices must have long-lived states. There is a reason we write on paper and not on water: it takes far more effort to erase what is written on paper. Clearly, longevity is a desirable property of memory devices.

Let’s figure out what other properties we may desire from a memory device. We want memory devices to support fast read and write times. We would also like a mechanism through which we can erase sections of memory. This is not a definitive list, but the takeaway is that such properties help us evaluate the different substrates that can be used to enable memory.

The simplest memory device would have two stable states (why not one?). Two states are necessary and sufficient to store a bit, the basic unit of information (the word is short for binary digit). If we put together 100 such devices, we have a device that can store 100 bits, resulting in a system capable of being in any of 2¹⁰⁰ states.
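To make this concrete, here is a minimal sketch in Python (a toy model, not tied to any particular hardware): a bank of two-state cells with read and write operations, where n cells admit 2ⁿ possible configurations.

```python
# A toy memory device: a bank of two-state cells (bits).
# n cells can jointly be in 2**n distinct states.

def write(memory, position, value):
    """Set one cell to 0 or 1."""
    assert value in (0, 1)
    memory[position] = value

def read(memory, position):
    """Observe the state of one cell."""
    return memory[position]

memory = [0] * 100            # 100 two-state devices
write(memory, 3, 1)
print(read(memory, 3))        # -> 1
print(2 ** len(memory))       # -> 2**100 possible states
```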

Digital devices store information as bits and mainly vary in the mechanisms used to enable statefulness. In a typical hard drive, we have a platter of magnetic material divided into billions of tiny areas; each area represents 1 if magnetized and 0 otherwise. In the DRAM chips that serve as main memory in our computers, each bit is stored in a micro-capacitor: if the capacitor is charged, it represents 1, and 0 otherwise. These are paired with circuits that read and write the information. We encode information in many other ways: radio waves in wireless networks, pulses of light in optical fibres, etc.

We have a panoply of mechanisms that enable memory. This isn’t surprising: information is abstract, which lets us engineer many different ways to store the same information. This characteristic is called substrate independence, which means that the substrate, although necessary, is just a detail.
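As a loose illustration of the same information living in different forms (representations rather than physical substrates, but in the same spirit), here is a small Python snippet that holds one message as bytes, as a bit string, and as hexadecimal, and recovers it from each.

```python
# The same abstract information, "HI", held in three different representations.
message = "HI"
as_bytes = message.encode("ascii")                     # b'HI'
as_bits = "".join(f"{b:08b}" for b in as_bytes)        # '0100100001001001'
as_hex = as_bytes.hex()                                # '4849'

# Each representation can be "observed" to recover the original message.
from_hex = bytes.fromhex(as_hex).decode("ascii")
from_bits = bytes(
    int(as_bits[i:i + 8], 2) for i in range(0, len(as_bits), 8)
).decode("ascii")
assert from_hex == from_bits == message
```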

Computation uses memory

Computation is the transformation of memory from one state to another. It is also appropriate to call such transformations algorithms or functions. Computations vary in complexity: they can be as simple as inverting the state of a bit, or as involved as figuring out the next best move in a game given its current state.
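In code, a computation is just a function from one memory state to another. A trivial Python sketch of the simplest case mentioned above:

```python
# Computation as a state transformation: a function that maps one memory
# state (a tuple of bits) to another.

def invert_bit(state, i):
    """The simplest computation: flip the i-th bit of the state."""
    return state[:i] + (1 - state[i],) + state[i + 1:]

before = (0, 1, 1, 0)
after = invert_bit(before, 0)
print(before, "->", after)     # (0, 1, 1, 0) -> (1, 1, 1, 0)
```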

Earlier, we saw how a blob of matter can store information. Let us now see how matter can compute. Suppose we modify the state of the matter to represent the input to a computation. We then want the matter to evolve according to the laws of nature and eventually settle into a state that represents the output of the computation. We can go a step further: if a blob of matter can evolve in a controllable manner, we can define the computation ourselves and have the matter perform desired functions.

Consider a system that can compute the NAND operation. Two bits are given as input, and the output is one bit, which is 0 if and only if both input bits are 1. One practical way to build a NAND gate is with transistors. Interpreting voltages as bits (5V = 1, 0V = 0), when the inputs to transistors A and B are both 1, both transistors conduct and the output C drops to 0; otherwise C stays at 1.

NAND circuit using transistors.

The NAND operation is a functionally complete logical connective by itself. This means that with enough NAND gates you can compute any Boolean function. A substance that can perform arbitrary computations is called a computronium, and one way to build one comes down to having enough NAND gates.
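Here is a small sketch of that functional completeness: NOT, AND, OR, and XOR, each built from nothing but a NAND function, checked against Python’s built-in operators.

```python
# Functional completeness of NAND: other Boolean operations built only from it.

def nand(a, b):
    return 0 if (a == 1 and b == 1) else 1

def not_(a):
    return nand(a, a)

def and_(a, b):
    return nand(nand(a, b), nand(a, b))

def or_(a, b):
    return nand(nand(a, a), nand(b, b))

def xor_(a, b):
    t = nand(a, b)
    return nand(nand(a, t), nand(b, t))

for a in (0, 1):
    assert not_(a) == 1 - a
    for b in (0, 1):
        assert and_(a, b) == (a & b)
        assert or_(a, b) == (a | b)
        assert xor_(a, b) == (a ^ b)
```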

We already know of several manifestations of a computronium. A trivial one uses enough NOR gates (NOR is also functionally complete). Neural networks of arbitrary size have also been shown to be capable of performing arbitrary computations, so they too are an example. In 1936, Alan Turing formulated the Universal Turing Machine, a model that can simulate any Turing machine on any input, which is another example. The Rule 110 cellular automaton, an example of a class 4 automaton, is yet another.
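To make the last example concrete, here is a tiny Rule 110 simulator (a toy sketch with a wrap-around boundary): each cell’s next state depends only on itself and its two neighbours, via a lookup table that is just the binary expansion of the number 110.

```python
# Rule 110: next state of a cell as a function of (left, self, right).
RULE_110 = {
    (1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): 1, (1, 0, 0): 0,
    (0, 1, 1): 1, (0, 1, 0): 1, (0, 0, 1): 1, (0, 0, 0): 0,
}

def step(cells):
    n = len(cells)
    return [RULE_110[(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])]
            for i in range(n)]

row = [0] * 31 + [1]                 # a single live cell at the right edge
for _ in range(16):
    print("".join("#" if c else "." for c in row))
    row = step(row)
```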

In summary, any computation can be performed on any computronium, which can be realized in a plethora of ways. Therefore, computation is also substrate independent.

Learning modifies computation

Learning involves modifying computation, which enables a system to change its behaviour. The ability to learn can be understood as the ability of a system to modify one or more of its underlying functions, allowing it to exhibit any behaviour those functions can represent. For any system to be considered intelligent, it needs to be able to learn. Otherwise, it would need some form of pre-programmed knowledge covering every scenario it would ever encounter.

So how would any matter be able to learn? Consider a thought experiment. Imagine a surface made of soft clay that is initially flat, and suppose this surface represents a grid of numbers. Repeatedly placing a ball at a certain number creates a valley over time, and when you later place the ball nearby, it rolls down into the valley, helping recall of that number. One can then start placing the ball at different places to learn different numbers; the softness of the clay also helps in unlearning the previous number and learning a new one. If this feels like a rudimentary example (it should), it is more productive to focus on the idea that enables a substrate to learn: simply, the ability to rearrange itself, or part of itself.
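A toy version of this picture in Python (purely illustrative, a hypothetical one-dimensional “clay” surface): placing a ball repeatedly deepens the surface at that spot, and recall drops a ball nearby and lets it roll to the deepest point within reach.

```python
# A 1-D "clay surface": learning deepens a valley, recall rolls into it.
SIZE = 10
depth = [0] * SIZE                   # initially flat

def learn(position):
    """Placing the ball deepens the surface at that spot."""
    depth[position] += 1

def recall(position, radius=2):
    """A ball placed nearby rolls to the deepest point within reach."""
    lo, hi = max(0, position - radius), min(SIZE, position + radius + 1)
    return max(range(lo, hi), key=lambda i: depth[i])

for _ in range(5):
    learn(7)                         # repeatedly "store" the number 7
print(recall(6))                     # -> 7: the nearby ball rolls into the valley
```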

In the paper Turing Computability with Neural Nets, Siegelmann and Sontag showed the existence of a neural network that is equivalent to a Universal Turing Machine. Such a neural network can implement any algorithm we can ever come up with. The argument that this is infeasible with the best-known implementations of gradient descent and backpropagation, I think, just goes to show that we have strides to make in finding better ways to learn, not that we have no way to achieve it.
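For flavour, here is a minimal sketch of learning modifying computation: a single sigmoid neuron whose weights are adjusted by gradient descent until it computes NAND. This is a toy example of the general idea, not the construction from the Siegelmann and Sontag paper.

```python
# Gradient descent nudging a neuron's weights until it computes NAND.
import math
import random

data = [((0, 0), 1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]   # NAND truth table
w1, w2, b = (random.uniform(-1, 1) for _ in range(3))
rate = 1.0

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

for _ in range(5000):
    for (x1, x2), target in data:
        y = sigmoid(w1 * x1 + w2 * x2 + b)
        grad = (y - target) * y * (1 - y)   # gradient of squared error w.r.t. pre-activation
        w1 -= rate * grad * x1
        w2 -= rate * grad * x2
        b -= rate * grad

print([round(sigmoid(w1 * x1 + w2 * x2 + b)) for (x1, x2), _ in data])
# typically -> [1, 1, 1, 0]
```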

Consider a system that computes certain functions in pursuit of certain goals. Such a system is adaptable if it can improve its functions in response to ever-changing situations. Only with such a capability can the system learn and therefore adapt. This is why learning is crucial to intelligence and builds on top of a substrate’s ability to compute. More importantly, learning is substrate independent as well.

The substrate is merely an implementation detail, implying intelligence may be realized in many different ways.

We have seen how memory, computation, and learning are abstract phenomena and the details of their realization, albeit necessary, can be ignored most times. The idea of substrate independence lets us take a step back and widen our perspectives towards these phenomena.

A pyramid showing how memory, computation, and learning build on top of one another.

To understand such substrate independent phenomena, it is fine to focus on the substrate when necessary, but a holistic understanding requires separating the substrate from the phenomena and studying the phenomena in their own right. In the case of memory, computation, and learning, substrate independence strengthens the argument that engineering intelligence is possible, and potentially in several ways.

References:

[1] Max Tegmark, Life 3.0: Being Human in the Age of Artificial Intelligence.
[2] Jeff Tyson, “How Computer Memory Works”, HowStuffWorks.com, 23 August 2000.
[3] Herbert Enderton, A Mathematical Introduction to Logic, 2nd ed., 2001.
[4] A. M. Turing, “On Computable Numbers, with an Application to the Entscheidungsproblem”, 1936–7.
[5] Matthew Cook, “Universality in Elementary Cellular Automata”, 2004.
[6] Hava Siegelmann and Eduardo Sontag, “Turing Computability with Neural Nets”, 1997.
