Our complex humanity and our machines
For the past three centuries, humanity has developed more and more technologies at an exponential pace. We’ve created things we call machines; by machines we mean human-made objects that interact with their environment. These objects were pretty simple at the beginning, like the weaving loom; then the level of automation increased, and we moved from the power loom and the steam train to automated battery and car factories.
But one thing hasn’t changed since then: humans are still in charge of designing every element, even at the lowest levels. Yet human technology is now at a turning point where we could automate the designing part too!
This new field is called Artificial Intelligence 🤖 (AI), and it might solve many of our most complex problems: from helping you choose a coffee shop, to helping eradicate famine, to resolving geopolitical conflicts, or even taking over the world.
I’m joking! Don’t worry… or maybe you should…
You might ask yourself: how could AI help us do all of this stuff? First, we need to make it smarter, because today’s AI is already able to achieve very impressive results, but we’re still pretty far away from our goal: human intelligence. To understand how we could make it smarter, we first need to understand what AI is under the hood…
…an artificial neural network, at least in part! That is the most widely used technology nowadays when it comes to AI, so when you hear about AI, it’s generally either a neural network or a Deep Neural Network. Does it sound like rocket science? 🚀 Don’t worry, I will explain it.
So first, what is a neural network? The best way to understand it is to see what it looks like.
Each little circle you see is called a neuron. Every neural network has 3 types of neurons, organized in what we call layers: the input neurons, the processing neurons, and the output neurons.
Now let’s imagine that you want to create a neural network that can tell whether you’re showing it a picture of a cat or a dog. Well, first you will need what we call an input layer. This layer takes as input all the information you want to feed into the neural network, so in this case, it would be your super cute cat picture. For each pixel of your picture, you would have one input neuron. But now that the value of each pixel is stored in our input neurons, what do we do? That’s where the second layer, the processing layer (or hidden layer), comes in.
This layer is the only one that changes depending on the neural network you’re looking at. As its name suggests, this layer processes the data you’ve fed into the neural network. The way it works is pretty simple: each processing neuron sums up the values of the input neurons, each multiplied by a number we call a weight. Each connection has its own independent weight, and it’s these specific weights that allow the neural network to learn. But before talking about the learning process, also called the training process, we need to talk about our last layer, the output layer.
The output layer is used to output the result, and it can take different forms depending on the goal of your neural network. It can be a single neuron holding the probability of a certain event occurring, a value representing the next piece of data to create (like in a music generator), or, in our case, two neurons representing the two possible outputs: dog 🐶 or cat 🐱. In fact, our output neurons won’t literally say “cat” or “dog”; each will rather hold a probability, like 0.98 for dog and 0.02 for cat.
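To make this concrete, here is a minimal Python sketch of those three layers. The four “pixel” values, the layer sizes, and the weights are all made up for illustration (the weights are random, so the probabilities will be too):

```python
import math
import random

random.seed(0)

def layer(values, weights):
    # Each neuron of the next layer sums up the incoming values,
    # each multiplied by its own weight.
    return [sum(v * w for v, w in zip(values, row)) for row in weights]

def softmax(scores):
    # Turn the raw output scores into probabilities that sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Input layer: one neuron per pixel (a made-up 4-pixel "picture").
pixels = [0.1, 0.8, 0.5, 0.3]

# Processing (hidden) layer: 3 neurons with random weights for now.
hidden_weights = [[random.uniform(-1, 1) for _ in pixels] for _ in range(3)]
hidden = layer(pixels, hidden_weights)

# Output layer: 2 neurons, one for "dog" and one for "cat".
output_weights = [[random.uniform(-1, 1) for _ in hidden] for _ in range(2)]
dog_p, cat_p = softmax(layer(hidden, output_weights))

print(f"dog = {dog_p:.2f}, cat = {cat_p:.2f}")
```

Since the weights are random, don’t expect a meaningful answer yet; that’s exactly what training will fix.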
But you might be thinking: “Jules, that’s great, but how do we make this neural network learn?”
That’s a great question! To explain it, we will take the example of what we call supervised learning, which means that we are actually going to tell the computer whether its predictions are wrong or not.
To start the training process we need data; in our case, lots of pictures of dogs and cats. We’re going to split the dataset into two parts: training (80%) and evaluation (20%). After doing this, we can start training our network. We feed our input layer with our first picture; remember, the values of the pixels are stored by these neurons. After filling the input layer, the values of these neurons are sent to the next layer, the processing layer. This layer processes the data with randomly generated weights, and each of its neurons sends its result to our output neurons. Now it’s time to see our results!
🐶= 0.52 and 🐱= 0.48
That’s kind of disappointing, isn’t it? Don’t worry, that’s part of the process. But why didn’t it work? Because we are still in the first iteration of the training process! Training is a process, not a magic trick. Now it’s time to measure how disappointed we are! This measurement is called the cost function, and its goal is to determine how bad the computer should feel, or in other words, how far off the neural network’s predictions were. Doing this is pretty easy: we know that the picture we fed into the neural network was your super cute cat! So we can tell our neural network that the output should have looked something like this: 🐶 = 0.06 and 🐱 = 0.94.
We can see that’s pretty different from our output (🐶 = 0.52 and 🐱 = 0.48); taking the difference between these two outputs gives us the cost function.
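In code, one common way to turn that difference into a single number is the mean squared error. (The article just says “difference”; squared error is my choice here, there are many other cost functions.)

```python
def cost(predicted, target):
    # Mean squared error: average of the squared differences between
    # what the network said and what it should have said.
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)

predicted = [0.52, 0.48]  # what our network said (dog, cat)
target    = [0.06, 0.94]  # what it should have said for a cat picture

print(cost(predicted, target))  # → 0.2116, a big cost: we're disappointed!
```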
By giving the neural network its cost function, it can now learn from its mistakes! We do this using an amazing algorithm called backpropagation. The backpropagation algorithm is an efficient way of using the chain rule of derivatives. Put in simple terms, that mathematical property allows us to propagate cost-function-based corrections of the weights backwards across the neural network. To better visualize this, let’s go back to our cat-and-dog classifier. We now have the cost function, which tells us how far the predictions are from reality. As said above, the chain rule of derivatives allows us to go back through our network and calculate, for each neuron, what its value should have been, then correct the weights of its connections to produce the right output. By doing this for each neuron, layer by layer, we modify all of our weights.
When backpropagation is done, our network can perfectly classify this picture, but not every picture. And that is why we need lots of different pictures, and many rounds of backpropagated corrections, to find the weights that allow us to accurately classify most dog and cat pictures.
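Here is a toy version of that correction loop for a single neuron, with made-up inputs, target, and starting weights, using a sigmoid activation. Real backpropagation applies this exact same chain-rule step through every layer of the network:

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# One toy neuron learning that this input should mean "cat" (target 0.94).
inputs, target = [0.1, 0.8, 0.5], 0.94
weights = [0.2, -0.4, 0.1]   # arbitrary starting weights
lr = 1.0                     # learning rate: how big each correction is

for _ in range(1000):
    out = sigmoid(sum(w * x for w, x in zip(weights, inputs)))
    error = out - target                 # how far off we are
    # Chain rule: d(cost)/d(weight) = error * sigmoid'(z) * input.
    grad = error * out * (1 - out)
    weights = [w - lr * grad * x for w, x in zip(weights, inputs)]

out = sigmoid(sum(w * x for w, x in zip(weights, inputs)))
print(round(out, 2))  # much closer to 0.94 than where we started
```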
Take a look, our neural network is working!!
Now that we have a basic understanding of what supervised learning and neural networks are doing under the hood, let’s get back to your brain for a moment:
In your brain you have around 80 billion neurons, along with about a trillion other cells. They fire continuously, thousands of times per second, to make you able to read this sentence, to understand it, and to remember it. Your neurons are continuously modifying their connection strengths (weights) and their firing rates. They are so well optimized that some tasks are effortless. But I mean really effortless. And I’m not talking about seeing or another trivial action like that; let’s do a little experiment:
Try not to read the following word:
Apple
Were you able to just look at the different letters, to see them one by one, without hearing a voice in your head or understanding the word? If not, try again.
Still not able to do it? Don’t worry, that is intended. Ever since you learned how to read as a little kid, your brain has progressively increased the strength of the connections between the neurons used during the reading process and synchronized their firing rates to make this action effortless. But it’s now such a well-optimized task in your brain that you’re not able to not do it. Isn’t that magic?
Computer scientists and neuroscientists have been trying to reproduce this magical yet complex process in computers. As you’ve seen above, nowadays we’re able to do more and more complex tasks using neural networks: detecting cancer, finding new drugs and vaccines, autonomous driving, and even more! But even if it’s really impressive and useful, like I told you before, we are still faaaaaaaaaaaaar away from the human brain (more on this later).
Before jumping to how we could do better, let’s take a look at where we’re coming from and how we’ve tried to mimic our biological neurons.
This is the basic representation of a biological neuron from the human brain.
Yeah, it looks pretty simple, right? Well, when simplified it is, because our little model of a neuron is composed of:
- a cell body (soma): responsible for keeping the cell alive and processing information
- dendrites: they receive chemical or electrical signals from other neurons and lower or raise the strength of the signal (weight)
- an axon: it transports the information to the next neuron
Now that we know what it looks like, how does it work?
The key concept of neurons 🔑 is that they have what is called an action potential. The signals from the presynaptic neurons (all the neurons connected to the dendrites) are summed up and modified by weights, and if the sum reaches a specific level (the threshold potential), the neuron fires.
Once the threshold potential is reached, the neuron fires, or sends a spike. This spike is propagated along the axon and sent to other neurons, which run the same process to determine whether they need to fire or not.
Based on this simple model, McCulloch and Pitts created the first artificial neuron in 1943. This first model uses a very simple threshold function to give a binary output: 0 or 1. Its weights cannot be modified and it cannot learn; it is only a classifier! Not that impressive, right? Well, stay around, it gets more interesting.
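A McCulloch-Pitts neuron is simple enough to fit in a few lines. Here it is hand-wired as an AND gate; the weights and threshold are chosen by hand, precisely because this model can’t learn them:

```python
def mcculloch_pitts(inputs, weights, threshold):
    # Fixed weights, hard threshold: output is 1 if the weighted sum
    # reaches the threshold, 0 otherwise. No learning happens here.
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# A tiny hand-wired AND gate: fires only if both inputs are on.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", mcculloch_pitts([a, b], [1, 1], threshold=2))
```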
As we’ve just seen, this first generation of artificial neurons (1940s to 1950s) was pretty basic and highly limited: it lost information in the threshold function, was unable to learn (no weight modification), and couldn’t be stacked up into big networks.
That’s why the second generation of ANNs came along (1950s to 1990s), and this generation included lots of improvements. The activation function evolved from a binary threshold to a smooth sigmoid or a radial basis function (RBF), instead of a fixed threshold value.
That means we can now use the magnitude of the output as information. The information coming from the presynaptic neurons is no longer lost once passed through the activation function, which can give an approximation of the neuron’s firing rate. This new activation function allowed the use of the backpropagation algorithm that we used earlier for the cats-and-dogs classifier. With backpropagation, we are able to stack up more and more layers to create what is called a Deep Neural Network.
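Side by side, you can see what the second generation changed: the binary step throws the magnitude away, while the sigmoid keeps it (and is differentiable, which is exactly what backpropagation needs):

```python
import math

def step(x):
    # First generation: all-or-nothing, the magnitude is thrown away.
    return 1 if x >= 0 else 0

def sigmoid(x):
    # Second generation: a smooth curve, so the output's magnitude
    # (roughly, a firing rate) survives, and it has a derivative.
    return 1 / (1 + math.exp(-x))

for z in (-2.0, -0.5, 0.5, 2.0):
    print(f"z={z:+.1f}  step={step(z)}  sigmoid={sigmoid(z):.2f}")
```

Notice that the step function gives the same answer for 0.5 and 2.0, while the sigmoid tells them apart.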
But why would someone ever want to stack up 15 layers in a cat-and-dog classifier if a regular 3-layer neural network could achieve the same result?
It’s because deep neural networks work a bit differently. Let’s take an example using a CNN (a type of neural net that is pretty good with images). A regular neural network, once trained, would be pretty good at pattern recognition, but only on images close to those it has been trained on. If you resize the input image or modify it a little, the neural network will fail to classify it. So we need a neural network that is more “general”. To do that, instead of building a neural network that directly detects the wanted pattern, we will build a deep neural network where each layer detects a primitive pattern, allowing the entire network to detect the more complex pattern we want.
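Here is the primitive-pattern idea in miniature: a single 3×3 filter that responds to vertical edges, slid over a tiny image. Note that in a real CNN these filters are learned during training, not hand-written like here, and deeper layers combine such primitive detectors into more complex ones:

```python
# A 6x6 "image": dark on the left, bright on the right (a vertical edge).
image = [[0, 0, 0, 9, 9, 9] for _ in range(6)]

# A 3x3 filter that responds to vertical edges (bright to the right of dark).
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

def convolve(img, ker):
    # Slide the filter over the image; each output value says how strongly
    # the local 3x3 patch looks like the filter's pattern.
    size = len(img) - 2
    out = []
    for y in range(size):
        row = []
        for x in range(size):
            row.append(sum(ker[j][i] * img[y + j][x + i]
                           for j in range(3) for i in range(3)))
        out.append(row)
    return out

feature_map = convolve(image, kernel)
print(feature_map[0])  # → [0, 27, 27, 0]: strong responses only at the edge
```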
Let’s look at this in a more visual way:
All of this is what the majority of today’s AI is using. But is the second generation of neural networks really the final version? Does it perfectly mimic biological neurons?
Short answer: no.
That’s why there is a 3rd generation of artificial neural networks: spiking neural networks.
Remember our model of a neuron? Well, we now need to scale up to the level of a whole network to understand how spiking neural nets provide a more realistic representation.
Our neurons, when firing, do not encode information in the magnitude of the spike (as is done in the second generation of ANNs). The information in your brain 🧠 is encoded temporally.
In fact, in biological neurons, the synaptic weights are not the only information; the timing of those spikes is really important too. The firing rate of neurons and their synchronization are in large part responsible for the learning process.
There are several theories and no consensus yet, but the neural synchronization process could be responsible for the fact that you’re able to see the blue car 🔵🚗. To understand that this object is blue 🔵 and is a car 🚗, you need something that our classic model doesn’t have. Think about it: you don’t have a specific neural network that can recognize a blue car; your brain recognizes cars and the color blue separately. So what makes them get considered as the same object?
You don’t have a specifically developed pathway or weight modification that allows your brain to associate them. Neural synchronization could be the answer.
When you see the car, different parts of your brain are activated at the same time; this time-specific firing modifies the “value” of the spikes of those networks and allows you to perceive the information as a single entity. 🚗+🔵
That is a demonstration of how useful time-encoded information is, and of what we’re missing with the second generation.
Spiking neural networks are basically more accurate models of our biological neurons. They allow us to develop potentially more accurate and more capable deep neural networks, but they also allow us to simulate biological neurons. This could let us develop simulations of living organisms’ brains. C. elegans could be a good candidate, given the work that has already been done on it.
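One of the simplest spiking models is the leaky integrate-and-fire neuron. This toy simulation (arbitrary units and made-up input currents, nothing biologically calibrated) shows the key difference: the output is a list of spike times, not a magnitude:

```python
# A leaky integrate-and-fire neuron, the textbook toy model behind many
# spiking neural networks (values here are arbitrary, not biological units).
threshold = 1.0   # fire when the membrane potential reaches this
leak      = 0.9   # each time step, some of the charge leaks away
potential = 0.0

incoming = [0.3, 0.4, 0.05, 0.5, 0.6, 0.0, 0.0, 0.7]  # weighted input per step

spikes = []
for t, current in enumerate(incoming):
    potential = potential * leak + current   # integrate... and leak
    if potential >= threshold:               # action potential reached:
        spikes.append(t)                     # the neuron fires a spike...
        potential = 0.0                      # ...and resets

print("spike times:", spikes)  # → spike times: [3, 7]
```

The information is in when the neuron fired, which is exactly the temporal coding the second generation discards.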
I think that learning to simulate biological organisms will eventually lead us to an important breakthrough in this field, and could even lead us to germ AI.
Speaking of improving AI and breakthroughs, how could we make AI understand the human way of thinking?
First of all, we need to ask whether today’s AIs are really intelligent. Even if today’s neural networks are able to achieve great things, they are still not intelligent from a human perspective.
Let’s imagine we’re showing a 3-year-old child pictures of cats and dogs: they will be able to classify every one of them as a dog or a cat. Now show them a picture of a bus 🚌, and they will probably react like this:
But our dog-and-cat neural network will find that totally normal; it will simply give a low probability for the bus to be a dog.
Now let’s show this child a picture of a cat without an ear. They will instantly focus on that missing ear; in other words, they will focus on something that does not exist.
Our neural network will simply ignore it: the human mind focuses on something that the neural network ignores.
But how are we able to be surprised by something, or to notice that something is wrong? Because of the contrast operation: this operation compares the situation we’re facing with what we’re used to.
This is not only useful for finding anomalies but also for deciding whether a piece of information is true or not. It could allow an AI to say “no, this is too big to be a cat”, and to do so with an explanation! Explanation is one of the big shortcomings of today’s neural networks; they are big black boxes.
But isn’t a neural network just a mathematical function? You give it an input and it gives you an output based on a complex mathematical function it is trying to match. Well, of course it is, but the issue here is that it behaves like a reflex. It is not thinking; it has only learned a function defined by already-known examples (in supervised learning).
But you might think… “Wait, Jules… aren’t human brains behaving like a mathematical function too? I mean, we permanently have various inputs, and our output, which is our reaction, is often based on already-known examples.”
Well, you’re right: ours is a much more complex one, but it is still a mathematical function!
The question now becomes: what is the difference between a reflex and a reflection?
A reflex is simple and specific to one task.
A reflection is more complex and able to work on various different tasks.
So isn’t that what is missing to make AIs able to think rather than act like a reflex?
It is indeed what is missing, and that is our (still vague) path to achieving AGI (artificial general intelligence).
But before talking about AGIs, wouldn’t it be useful to make our reflex AIs better at understanding humans?
AI will take up more and more space in society, and if we want AIs and humans to be able to work together, we need to figure out how AIs can get better at understanding how humans think and feel.
One of the most interesting theories on this subject is called “Simplicity Theory”. This theory was developed by a French researcher I’ve met: Jean-Louis Dessalles.
This theory states that everything that is surprisingly simple to describe is interesting to the human mind. For example, in September 2009, a strange event happened: the Bulgarian lottery drew the following combination: 4, 15, 23, 24, 35, 42.
Nothing incredible, right? This combination had only 1 chance in 14 million of being drawn. What is surprising is that the exact same combination had been drawn in the previous draw, a few days earlier. But why does it feel surprising? This sequence of numbers has the same probability of occurring as 1, 2, 3, 4, 5, 6 or any other combination.
Now imagine that the same sequence had been drawn a few months or a couple of years earlier. It would seem less surprising, right? Well, that’s because our mind is able to compress that complex number sequence, which you would never be able to remember, into “this is the same sequence as last time”.
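A very rough way to see the mechanics of Simplicity Theory in code: unexpectedness can be modeled as generation complexity minus description complexity, measured in bits. The exact numbers below, like the few-bit cost of saying “same as last time”, are made-up placeholders for illustration:

```python
import math

# Simplicity Theory, toy version: unexpectedness U = C_world - C_description,
# the gap between how hard an event is to GENERATE and how easy to DESCRIBE.

def generation_bits(n_outcomes):
    # Bits the "world" needs to pick one outcome among n equally likely ones.
    return round(math.log2(n_outcomes))

LOTTO_COMBINATIONS = 14_000_000        # "1 chance in 14 million"

c_world = generation_bits(LOTTO_COMBINATIONS)   # ~24 bits to produce any draw

c_arbitrary = c_world  # describing a random combination costs about as much...
c_repeat = 3           # ...but "same numbers as last time" is a few bits (made up)

print("U(arbitrary draw) =", c_world - c_arbitrary)   # → 0 bits: boring
print("U(repeated draw)  =", c_world - c_repeat)      # → 21 bits: surprising!
```

The bigger the gap between how hard the event was to produce and how briefly you can describe it, the more surprising it feels.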
What is fascinating is that with this theory we’re able to calculate emotions. Okay, not exactly calculate them, but predict their intensity. Let’s imagine that tomorrow you learn that a factory burned down near your house and that people you knew have been hurt. With Simplicity Theory, we can predict that this news will have a high emotional value compared to the same event happening in another country, to people you don’t know.
You’re emotionally more moved by the first piece of news because of how simple it is to describe: “near home” is a very simple yet accurate description, and “people you knew” is also a very simple thing to grasp.
Using contrast and Simplicity Theory, we could make reflex AIs seem way smarter, from a human perspective.
That would be a great step, right? Imagine an AI able to map your interests in the world and to anticipate your emotions; wouldn’t that make a Super Alexa?
But if we want to go beyond Super Alexa, we will need to take a totally new approach.
I think that the right path to smarter and more autonomous AI systems, and eventually AGI, is germ AI.
The idea of a germ AI is to create an AI that is not only able to improve itself but also to create totally new features and capabilities.
I think that germ AI is the most likely path to a higher level of intelligence, and the only path to an explosion of intelligence. Biosimulation + evolving environments + a developing neural network could create the next chapter of AI progress…
AI and AGI could be the two greatest inventions of humanity, helping to solve our most complex problems and creating unbelievable wealth and unlimited possibilities.
Until that happens, I hope you will be able to contemplate the complexity of our humanity and of our beautiful machines.
I’m Jules Padova, a 16-year-old TKS innovator trying to improve our world through AI 🤖, neuroscience 🧠, biotechnology 🧬, and fundamental physics ⚛️.
Thanks for reading this article! I hope you’ve learned some interesting stuff and that it made you think about our future with AI.
If you want to follow my progress, follow me on LinkedIn and Twitter, and subscribe to my newsletter!