If robots spoke of God

Artificial intelligence, creativity and how our goals shape our reality

Adam Elwood
13 min read · Jul 9, 2020

Eureka! As water sloshed over the side of the bath, the solution popped into Archimedes’ head. The problem he had been toiling at for days was finally solved. All the components of the solution slid neatly into place, a gift from his subconscious.

Syracuse, the birthplace of Archimedes. Photo by Luca N on Unsplash

Anyone involved in creative work or problem solving will have experienced such eureka moments — a solution presenting itself in a flash of insight. In his latest book, Novacene, James Lovelock offers his Gaia hypothesis as a potent example of this: "Gaia is not easy to explain because it is a concept that arises by intuition from internally held and mostly unconscious information." Solutions obtained in this way can often be deconstructed into their component parts after the fact, but they are not initially reachable through straightforward step-by-step reasoning; too many things would have to be held in mind simultaneously. It is our ability to think in this non-linear way that makes our minds so powerful, and it is what has, until now, ensured that engineers and scientists find solutions more easily than a computer could. Put simply, there is no straightforward algorithm for creativity.

But, with the ongoing revolution in artificial intelligence, this is starting to change. We are finally working out how to get our computers to find non-linear solutions. With access to big datasets and enough computing power, many of the problems that were always difficult for computers — natural language processing and image recognition, for example — are starting to be solved through the application of machine learning. There has been particular success with Deep Neural Networks (DNNs), which can surpass human performance in many well-defined tasks. Notably, DeepMind's AlphaGo defeated world Go champion Lee Sedol in 2016, a feat many experts in computer science had thought was still decades away.

A very brief introduction to Deep Neural Networks

DNNs were originally conceived as an algorithmic emulation of the human brain in the 1940s, but they did not show their full potential until their voracious appetite for data and computing power could be fulfilled. Their power is captured by the fact that they can act as universal function approximators: a large enough, well-trained DNN can, in principle, approximate any mapping from its inputs to its outputs. But, despite their long history, there is no precise theory of how they learn, so no one really knows why they work as well as they do. This means there are no prescriptive rules detailing how to structure a network for a specific task. Instead, it must be worked out through experimentation. One of the major breakthroughs from such exploration was a simple but crucial one — the networks went from being shallow to deep.

Neural networks are constructed from two basic components, nodes and the directional links between them — the combination of a node and its links being analogous to a biological neuron. Each node receives numerical inputs from nodes earlier in the network, sums them and feeds the result through a non-linear activation function. The result is then passed, via the links, to nodes further down the chain. Each link amplifies or attenuates the signal it carries according to its weight. The key to training a network is choosing these weights in such a way as to get your desired output for a given input. An image recognition DNN, for example, takes as its initial input all the pixel values of an image, passes them through several layers of nodes and outputs a probability for each of a set of predefined classes.
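As a purely illustrative sketch of this picture, here is the forward pass of a single layer written in NumPy. The layer sizes, the random weights and the choice of a ReLU activation are assumptions made up for the example, not details of any particular network.

```python
import numpy as np

def relu(x):
    # Non-linear activation: pass positive signals, zero out negative ones
    return np.maximum(0.0, x)

def dense_layer(inputs, weights, biases):
    # Each node sums its weighted inputs (plus a bias),
    # then feeds the sum through the activation function
    return relu(inputs @ weights + biases)

# Toy example: 4 input values feeding a layer of 3 nodes
rng = np.random.default_rng(0)
x = rng.normal(size=4)         # numerical inputs to the layer
W = rng.normal(size=(4, 3))    # one weight per link
b = np.zeros(3)                # one bias per node

print(dense_layer(x, W, b))    # activations passed further down the chain
```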

The process of choosing these weights is known as training the network. The prevalent method for carrying out this training uses a large dataset of labelled data — inputs for which you already know the correct output. This data is systematically passed through the network, and the outputs the network produces are compared to the correct outputs through a loss function, which is defined to be minimal when the network always makes the correct prediction. Based on how wrong the network's guess was, its weights are adjusted so as to reduce the loss function's value, through a process known as backpropagation.
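A minimal sketch of such a training loop, here in PyTorch. The network shape, the cross-entropy loss and the random tensors standing in for a labelled dataset are all illustrative assumptions, not a prescription.

```python
import torch
from torch import nn

# A small network: an input layer, one hidden layer and an output layer
model = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 10),
)
loss_fn = nn.CrossEntropyLoss()          # minimal when predictions match the labels
optimiser = torch.optim.SGD(model.parameters(), lr=0.01)

# Stand-in for a labelled dataset: random "images" with random class labels
inputs = torch.randn(64, 784)
labels = torch.randint(0, 10, (64,))

for step in range(100):
    outputs = model(inputs)              # ask the network for its outputs
    loss = loss_fn(outputs, labels)      # compare them to the correct outputs
    optimiser.zero_grad()
    loss.backward()                      # backpropagation: how should each weight change?
    optimiser.step()                     # nudge the weights to reduce the loss
```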

The paradigm of minimising a loss function is a powerful one: it allows neural networks to do much more than find patterns in large datasets. A loss function is really just another way of expressing a utility function, a concept familiar across science, mathematics and philosophy. Since any rational, goal-driven behaviour can be framed as utility maximisation, there is no theoretical limit to what neural networks could do. The difficulty lies in choosing the appropriate loss function and finding the appropriate network structure.
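As a toy illustration of this loss-as-utility framing (the goal here, steering a single number towards a target, is entirely made up), any goal that can be scored numerically can be written as a loss and minimised:

```python
import torch

# Made-up goal: pick an action value that lands as close to a target as possible.
# Utility is the negative distance to the target, so minimising the loss
# is the same as maximising the utility.
target = torch.tensor(3.0)
action = torch.zeros(1, requires_grad=True)
optimiser = torch.optim.SGD([action], lr=0.1)

for step in range(200):
    loss = (action - target) ** 2    # minimal exactly when the goal is achieved
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

print(action.item())                 # approximately 3.0: the goal-maximising action
```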

This need to find the right structure is where the depth of networks comes into play. It turns out that stacking layers of neurons on top of one another works much better than putting lots of neurons side-by-side in one giant layer. In this configuration, the network somehow learns to encode a hierarchy of abstraction in its layers. In the case of face detection, for example, the first network layers act as edge detectors, whereas the nodes in the final layers act as facial feature detectors, spotting the presence of eyes and noses. This structure emerges spontaneously in the training process and is eerily similar to the way we, as humans, seem to understand the world.

Facial recognition representations taken from mdpi.com, originally attributed to LeCun, Bengio, Hinton.

Neural Networks learn how to represent their world

The hierarchy of representations that DNNs build is both the key to their success and the core of their mystery. A typical DNN is structured to produce progressively more complex representations of its input, with a neural architecture specific to the task at hand. The process of producing a specialised set of high-level representations is known as encoding. These representations are finally combined through a couple of layers of simple logic to create the final output of the network, in a stage of simplified inference. In the case of image classification, the pixel values of an image are encoded into high-level representations by many convolutional layers — a network architecture that works very well for image processing. The simplified inference is carried out by one or two unspecialised layers of neurons, which make use of the representations to decide which of the predefined image classes the input picture is most likely to belong to. In face detection, for example, the simplified inference stage will look to see if there is sufficient activation of the neurons that represent eyes and noses.
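A rough sketch of this encoder-plus-inference structure in PyTorch might look like the following; the layer sizes, number of classes and image dimensions are invented for illustration.

```python
import torch
from torch import nn

# Encoding stage: convolutional layers build progressively higher-level representations
encoder = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
)

# Simplified inference stage: a couple of plain layers combine the representations
# into a score for each predefined image class
head = nn.Sequential(
    nn.Linear(32 * 8 * 8, 64), nn.ReLU(),
    nn.Linear(64, 10),
)

model = nn.Sequential(encoder, head)

images = torch.randn(4, 3, 32, 32)   # a toy batch of 32x32 RGB images
class_scores = model(images)         # shape: (4, 10)
```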

Most of the heavy lifting in a DNN is carried out in the encoding layers. Learning good representations is what takes most of the time and computing power. This leads to one very nice property of DNNs — the same representations can be used for a wide variety of tasks. For example, the representations built by networks trained to classify images of a thousand different objects can be commandeered to match Instagram influencers to brands based on the content of their posts. The huge potential of DNNs for natural language processing also revolves around this concept, as exhibited by BERT, Google's powerful language encoder, which is trained to predict words that have been removed from text.
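Here is a minimal sketch of how representations get reused, assuming we already have some pretrained encoder (the tiny stand-in network below is invented for the example): freeze the encoder's weights and train only a new inference head for the new task.

```python
import torch
from torch import nn

# Stand-in for an encoder that has already been trained on a large dataset
# (in practice this would be a real pretrained network)
pretrained_encoder = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2), nn.Flatten(),
)

# Freeze the encoder: keep its learned representations fixed
for param in pretrained_encoder.parameters():
    param.requires_grad = False

# Train only a small new inference head for the new task (here, two classes)
new_head = nn.Linear(16 * 16 * 16, 2)
model = nn.Sequential(pretrained_encoder, new_head)
optimiser = torch.optim.Adam(new_head.parameters(), lr=1e-3)

scores = model(torch.randn(4, 3, 32, 32))   # shape: (4, 2)
```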

The obvious advantage of reusable representations is that they allow researchers to build on the success of others without spending vast amounts of computing power and data processing. This hugely accelerates the process of engineering state-of-the-art DNNs for a wide variety of tasks. However, the implications of transferable representations go deeper than quality-of-life improvements for machine learning researchers.

Somehow, DNNs are able to learn to build complex and sophisticated representations of the world that have a universal utility. This has allowed DNNs to break new ground in computer science, but — as they demonstrate their ability to learn fundamental natural truths from data by themselves — they are starting to revolutionise many other scientific fields as well. A recent application in materials science, for example, sees DNNs predicting the properties of materials, a task that would previously have taken years of full algorithmic simulation on a supercomputer. In this case, DNNs are somehow able to learn the sophisticated and complex characteristics of various materials and build them into a set of easily interpretable representations. Such applications are popping up all over science, from DNNs finding mathematical proofs to uncovering the cosmic structure of our universe. But the black-box nature of DNNs is starting to lead scientists to question how far they can be taken.

A crisis of explainability

The problem with getting a DNN to build complex and sophisticated non-linear representations is that it becomes very difficult to understand which features of the data it is exploiting. We are comfortable thinking through a few logical steps, but not the thousands or even millions that are carried out at once by a DNN. This makes it almost impossible to explain the exact reason a network gave us a particular solution. It just seems to be able to get it right.

One danger of this is that our networks start making decisions based on undesirable patterns found in the data. They can, for example, pick up patterns of systemic racism when trying to predict which criminals are likely to reoffend. This is potentially very harmful, although it can be corrected for with awareness of the problem and appropriate tuning of the training process. But when our aim is to build up scientific theories from first principles, the lack of explainability presents an even stickier challenge. Can you really say you have understood a problem if a black box can get you the right solution, but you can't explain the mechanism underneath?

This leaves us with a dilemma — the complex non-linearity of DNNs is what gives them their power, but also what makes them difficult to interpret. There will always be techniques to try to reconstruct some of the patterns that DNNs are making use of. But, if we want to model complex systems that have up to now escaped straightforward algorithmic approaches, maybe we have to accept we won’t be able to know all the details. In the next section I will argue that we actually accept this kind of thing all the time — in our eureka moments…

Our brains are neural networks, after all

As discussed before on this blog, we humans also think with representations. This confers many benefits when interpreting the world, from allowing us to communicate easily to compressing the amount of information we have to store when we remember things. The higher-level representations that our brains construct are so fundamental to the way we think that they comprise most of our conscious experience. When thinking in terms of images, we manipulate entire representations of objects. When thinking verbally, we manipulate words that pack a huge amount of information into a few syllables. However, these representations come to us unconsciously — they act as a baseline of our experience that we are unaware of building.

We can now immediately see the connection between the way we interpret the world and the way neural networks learn to interpret the world. We also have an encoding process — which is almost entirely unconscious — where we build high level representations. We then manipulate these representations with simple logic, as in the stage of simplified inference in DNNs. It is only this last stage that we are typically conscious of, allowing us to communicate a simplified version of our internal world to others, or think step by step through problems we want to solve. In the same way it is difficult to explain the depths of a neural network, we have difficulty explaining the deep workings of our subconscious brain.

Could this be what explains the appearance of our insights from outside our conscious awareness? To solve a sophisticated problem, our brain carries out reasoning that is too complex for us to experience consciously in any consistent way. Perhaps it does this by building new representations of the world, or simply by manipulating so many variables simultaneously that they cannot all be held in consciousness.

By allowing our artificial neural networks to train themselves, we are giving computers the gift of proto-creativity. It may therefore be impossible to fully explain the inference power of the deepest and most powerful DNNs, any more than we can fully explain our own creative problem-solving process.

This is not to say that human brains are not hugely more powerful and sophisticated than your average neural network. This is evident in our ability to dig deeper and offer explanations of the representations we build. We can break down our mental representation of a dog into its individual components — ears, nose, fur — that led us to represent a particular object as a dog. But, here we are just drawing from the huge set of representations that we have available to us. A typical DNN has a much smaller set, specifically tailored to the task at hand. This idea could point us to methods for improving the explainability of neural networks. But, for now, I want to explore how these representations are arrived at in the first place, leading us to a strange and interesting conclusion…

Our reality is defined by our utility function

It is all very well to say that both we and our artificial neural networks make sense of the world through representations, but how are these representations arrived at in the first place? There are infinite ways of clustering and combining input data, so what is the guiding principle for deciding the best way to do it? For DNNs this is actually well known: it is decided by the utility function used to train the network.

As already discussed, the neurons in the encoding layers of the network automatically organise themselves into useful representations. It is this fact that leads to the fundamental mystery and power of DNNs. Neural networks, therefore, learn to see the world in the way that is best suited for them to achieve the goal that they are trained for.

This exact reasoning applies to how we represent the world. Our representations must be built in such a way as to maximise our ability to achieve our own goals, defined by some human-level utility function. If this conclusion is close to the truth, it is of earth-shattering importance. It is these representations that comprise our entire conscious understanding of the world. All of what we think of as reality is necessarily a mental representation of it.

This is exactly the idea Donald Hoffman presents in his book The Case Against Reality, explored from another angle before on this blog. We do not actually have a conscious experience of reality, just a conscious experience of our representations of it. There is therefore no guarantee that we experience a faithful reproduction of the real world. The one underlying assumption for this idea to work is that the human utility function is the one defined by evolution. In other words, our representations of the world are those that maximise our ability to successfully reproduce and propagate.

Compellingly, this conclusion is simultaneously being arrived at in artificial intelligence research, specifically in the field of reinforcement learning, where agents driven by neural networks are trained to interact with a world and complete predefined tasks. A recent paper from Google's DeepMind found that agents learned better when allowed to construct a model of the world themselves than when a model was explicitly provided for them. This is a counterintuitive result. You would usually expect that giving an agent lots of information about the world in which it is acting, via a predefined model, would allow it to learn more quickly: it wouldn't have to spend time learning how to model the world and how to move in it, it would just have to learn how to move. However, the agent did much better when allowed to construct its own representations of the world, rather than trying to make sense of the 'real' world in which it found itself.

It is true that reinforcement learning agents have a much simpler task to complete than that of a thriving human being. It is also self-evident that we don't explicitly act to maximise our evolutionary fitness. Instead we try to fulfil a series of proxy utility functions, which work together to have the net effect of increasing our fitness — we eat when we are hungry, find partners with whom to start a family, and so on. The exact structure of the utility function that humans use to make sense of the world is therefore opaque. There are compelling ideas on offer, but a definitive answer has yet to emerge.

To make successful inferences in this complicated world, we must have a pretty good representative model of it. But we can't escape Hoffman's theory — compellingly backed up by cutting-edge AI research — that we may well be designed with irretrievably incomplete representations of the world around us. As humans, can we really ever hope to comprehend reality as it is? Maybe we can look to the way our machines are learning to start giving us insights into this most ancient of questions.

In any case, it is certain that there is much wisdom in the depths of our subconscious — we would do well to take full advantage of our eureka moments. As Lovelock notes, "the 'A causes B' way of thinking is one-dimensional and linear whereas reality is multidimensional and non-linear." We are finally inventing the tools we need to explore and exploit this fact.

This essay was originally published at https://pursuingreality.com. If you enjoyed it, you can find other articles and sign up to the mailing list there.


Adam Elwood

I blog about different perspectives on reality at https://pursuingreality.com. I have a PhD in particle physics and currently work in AI research.