The Core Computational Principles of a Neuron
To spike or not to spike?
It is well known that the neuron is the main cell of the nervous system. It generates electrical pulses as large as 0.1 volts, grows processes called axons (as long as 1 meter) and dendrites (several millimeters), and can live for more than 100 years. You will not find a more marvelous cell in the body. Still, I will not write about biological details (interesting as they are). Rather, I will try to define the most important computational principles, which might help with progress toward artificial intelligence (AI).
Convergence towards principles always leads to simplification and generalization, but it can be dangerous. The brain is too diverse. You can simplify too much and lose something important. Let’s dare to try. Ultimately, it is an iterative process. We merely want to become less wrong with time.
Principle 1. Neurons are binary
A neuron generates and transmits electrical pulses (also called spikes). They are very similar in amplitude and duration, so they can be treated as binary events (spike=1, no spike=0). Yet, we don’t fully understand how exactly thousands and millions of neurons represent knowledge, perform basic calculations, and store causal relationships.
Some neurons transmit information by the frequency of spikes. For example, the more a motor neuron in the spinal cord activates, the stronger the muscle contracts. For a long time, people thought that neurons in the cerebral cortex send information in a similar way. Since the famous Hubel and Wiesel experiments, we have known that a neuron in a visual area has a higher spike frequency for a particular stimulus. If a cat sees a vertical line, a particular neuron fires rapidly, but if we rotate the line, the neuron slows down. This can be summarized in a tuning curve: the firing rate plotted against the orientation of the line, with a peak at the neuron’s preferred orientation.
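A tuning curve of this kind can be sketched in a few lines. This is only an illustration: the bell shape, the function name `tuning_curve`, and every number in it (a preferred orientation of 90 degrees for "vertical", a 50 spikes/s peak, a 20 degree width) are assumptions made for the example, not measurements from the experiments.

```python
import math

def tuning_curve(angle_deg, preferred_deg=90.0, peak_rate=50.0,
                 baseline_rate=2.0, width_deg=20.0):
    """Firing rate (spikes/s) of a hypothetical orientation-tuned neuron."""
    # Orientation is circular with a period of 180 degrees:
    # a line at 0 degrees looks the same as a line at 180 degrees.
    diff = (angle_deg - preferred_deg + 90.0) % 180.0 - 90.0
    # Bell-shaped bump around the preferred orientation (assumed shape).
    bump = math.exp(-0.5 * (diff / width_deg) ** 2)
    return baseline_rate + (peak_rate - baseline_rate) * bump

# Rotate the line from horizontal (0) through vertical (90) and back.
for angle in range(0, 181, 30):
    print(f"{angle:3d} deg -> {tuning_curve(angle):5.1f} spikes/s")
```

The rate is highest at the preferred orientation and falls toward the baseline as the line rotates away from it.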
For many decades, the idea of frequency coding was dominant. Even now, artificial neural networks use the same coding (real numbers). But a nasty problem ruins the frequency view: to “measure” the rate of firing you need to wait for a long time (hundreds of milliseconds) to average spikes.
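To see why a short window is a problem, here is a rough sketch that assumes the neuron fires as a Poisson process (a common simplification, not something the argument above depends on). Under that assumption the spike count in a window has a variance equal to its mean, so a rate estimated as count / window has a standard deviation of sqrt(rate / window). The helper name `rate_estimate_std` is made up for this example.

```python
import math

def rate_estimate_std(rate_hz, window_s):
    """Std (in Hz) of a firing-rate estimate from one counting window.

    Assumes Poisson firing: the spike count has variance rate * window,
    and the estimate is count / window.
    """
    return math.sqrt(rate_hz / window_s)

true_rate = 20.0  # Hz, an arbitrary example value
for window_ms in (10, 100, 500):
    std = rate_estimate_std(true_rate, window_ms / 1000.0)
    expected_spikes = true_rate * window_ms / 1000.0
    print(f"window {window_ms:3d} ms: about {expected_spikes:4.1f} spikes expected, "
          f"estimate uncertain by about {std:4.1f} Hz")
```

With a 10 ms window the neuron is expected to emit a fraction of a spike, and the uncertainty of the estimate is larger than the rate itself. Only windows of hundreds of milliseconds bring the error down to a few hertz.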
However, the brain computes much faster. Humans need around 100 ms to recognize an object. Information travels from the retina to the thalamus, then is projected to the visual cortex, where it hops across three or four areas before reaching the IT area. Here the recognition of an object occurs. Taking into account the time for spike generation (2–5 ms) and spike propagation (roughly 10–20 ms), there is just enough time to send 1–2 pulses… and no way to wait longer to “send” the frequency of firing.
Therefore, individual spikes (0 or 1) somehow should be enough to transmit information. Currently, the prevailing idea is that an individual neuron does not send a lot, but many of them can encode anything (population coding).
Once, a friend asked me: “How does the brain encode information? Perhaps with something super complex?” “No,” I said. “Most likely, it uses a binary code similar to computers, but the principles are very different.” Neural coding is one of the most compelling areas of research, with many mysteries and open problems. Intriguingly, it relates nicely to information theory. It is not just about data representation, but about something fundamental: the organization of matter to reflect nature.
Still, we are far from fully understanding the neural code. There are bursting neurons that emit many spikes in quick succession. Can we simplify a burst of pulses as one binary event? Neuromodulators change how the neuron is excited. Possibly, different brain regions (like sensory and motor) use different coding schemes. Yet, most likely the main tasks like recognition, memory, and planning use binary codes (that can be modulated in various ways).
Principle 2. Neurons integrate information
A typical neuron in the neocortex has around 10 000 connections, and more than 100 of them are active simultaneously. Information integration means that a neuron “hears” all 10 thousand “calls” and compresses them to 0 or 1 (spike or no spike). Mathematically, 10 thousand inputs can be active in 2¹⁰⁰⁰⁰ possible ways (0 or 1 for each input). We could enumerate them and put them all into a table (two columns, 2¹⁰⁰⁰⁰ rows), but even the whole universe could not store it. Alternatively, we can write the table in a compact form as some function (y=f(x)).
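The gap between the table and the function is easy to make concrete. The sketch below counts the digits in the number of table rows and then replaces the table with a simple threshold function. The threshold neuron, the equal weights, and the threshold of 100 active inputs are illustrative assumptions for the example, not a claim about how a real neuron computes.

```python
def lookup_table_rows(n_inputs):
    """Number of rows in a table listing every binary input pattern."""
    return 2 ** n_inputs

def threshold_neuron(x, w, threshold):
    """Compact alternative to the table: y = f(x), here a weighted sum and a threshold."""
    total = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if total >= threshold else 0

n_inputs = 10_000
digits = len(str(lookup_table_rows(n_inputs)))
print(f"The table would need a number of rows with {digits} digits")  # 3011 digits

# The function needs only n_inputs weights and one threshold.
w = [1.0] * n_inputs                      # assumed: all synapses equally strong
x = [1] * 120 + [0] * (n_inputs - 120)    # 120 of the 10 000 inputs are active
print(threshold_neuron(x, w, threshold=100.0))  # 1: enough inputs are active
```

For comparison, the number of atoms in the observable universe is usually estimated as a number with about 80 digits.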
Maybe the biggest advantage of a neuron is that it can change its inner parameters (like the weights w in artificial neurons). Biologically, these parameters are realized as synapses between neurons, or as a distribution of ion channels on a membrane, or other molecules that determine the excitability of a cell. By changing the parameters, a neuron can change its function, its answers to 10 000 inputs. Parameters add new columns to this gigantic table and can be written as a parametric function y=f(x, w). The number of different functions the neuron can realize is called expressivity. The larger the expressivity, the more ways a neuron can integrate information. And it is crucial for learning, which is essentially a search for the right function.
Over the last 20 years, researchers have shown that the expressivity of a biological neuron is much larger than previously thought. It turns out that the dendrites do not simply transmit the signal to the center of a cell but process it along the way. If a local dendritic branch is activated strongly enough, it can amplify the signal (an event called a dendritic spike). Thus, synapses that are active close together in space and time excite the cell much more strongly than simultaneous but non-local signals. This is important: read it once more. Here is a helpful picture.
On the left, three axons (in color) bring the activation from three neurons and spread their synapses randomly. On the right, synapses concentrate locally. The right neuron is excited more and is more likely to generate an action potential (the figure shows three axons for simplicity; usually a neuron needs more to fire).
Therefore, not only the strength of a connection matters but also its location. Because of this, some researchers have argued for a new learning paradigm in which not only the weights but also their locations store information.
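The effect of location can be illustrated with a toy model in which each dendritic branch sums its own active synapses and amplifies the result once it crosses a local threshold. The function names, the threshold of three synapses, and the gain of 4 are all hypothetical values chosen for the sketch, not physiological estimates.

```python
def branch_output(active_synapses, spike_threshold=3, gain=4.0):
    """Drive one dendritic branch sends to the cell body.

    Below the local threshold the branch passes its input on unchanged;
    at or above it, a "dendritic spike" amplifies the signal (assumed gain).
    """
    drive = float(active_synapses)
    return drive * gain if active_synapses >= spike_threshold else drive

def soma_drive(active_per_branch):
    """Total drive at the cell body: the sum over all branches."""
    return sum(branch_output(n) for n in active_per_branch)

# The same three active synapses, arranged in two different ways.
scattered = soma_drive([1, 1, 1, 0])  # one synapse on each of three branches
clustered = soma_drive([3, 0, 0, 0])  # all three on a single branch
print(scattered, clustered)  # 3.0 12.0
```

The number and strength of the active synapses are identical in both cases. Only their placement differs, and that alone changes how strongly the cell is excited.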
The larger the dendritic tree, the more ways there are to arrange synapses, and thus the more ways to activate the neuron and the larger the expressivity. Interestingly, evolutionarily smarter animals have larger dendritic trees. The picture below shows the pyramidal neurons of 1. Bat, 2. Rat, 3. Cat, 4. Dolphin (source in Russian). Similarly, humans have more elaborate dendritic trees than chimpanzees (source). Maybe, with more powerful neurons, animals attain higher intelligence.
Old models of a neuron (like ReLU or threshold) are called “point” models. They do not include dendrites and are computationally weaker. People still use such models in artificial neural networks and are reluctant to adopt new, more complex neurons. One of the reasons is that it’s not clear how to train a model with active dendrites. Another reason is that by combining many simple neurons you can achieve the same expressivity as a single complex neuron. So, more or less, a two-layer network corresponds to a biological neuron. Read a really captivating article that explains it in depth.
Yet, even this is an overly simplistic view — a real neuron is more powerful.
Principle 3. Neurons integrate information in time
To spike or not to spike is the final result of all preceding computations. Still, a neuron doesn’t simply discharge like a capacitor. The neuron remembers what happened before. Mathematically, a neuron is a complex dynamical system with time-varying parameters. Simple neuronal models cannot describe all phenomena. For example, the same inputs elicit activation at one time and fail to at another. A neuron remembers not simply a correct combination of neurons among 10 000 but also when this combination occurs. It is as if the neuron looks back over a time window T and makes a decision (0 or 1) based on (2¹⁰⁰⁰⁰)^T possible scenarios. This opens the way for a single neuron to remember sequences.
One such mechanism of integration across time is the control of neuronal excitability. Once a neuron becomes active from a stimulus “A”, it can change its inner parameter (for example, the concentration of the protein CREB) to make itself more easily activated in the future. Even after many hours, this neuron will be activated by another stimulus “B” with a higher probability. One neuron gets activated by two stimuli and links them together. Since memory is distributed across many neurons, there will be an overlapping population of cells that store a causal relationship “A->B”. Changing neuronal excitability may be the way we form episodic memory (here is a nice review).
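As a rough illustration of this linking effect, the sketch below gives a neuron an extra excitability term that jumps after each spike and then fades. The class `ExcitableNeuron` and all of its parameters are invented for the example. Real excitability changes involve molecular processes that unfold over hours, which a single decaying number does not capture.

```python
class ExcitableNeuron:
    """Toy neuron whose excitability rises after it fires and then decays."""

    def __init__(self, threshold=1.0, boost=0.5, decay=0.9):
        self.threshold = threshold
        self.boost = boost        # extra excitability gained after a spike (assumed)
        self.decay = decay        # fraction of extra excitability kept per time step
        self.excitability = 0.0

    def step(self, drive):
        """Present an input for one time step; return True if the neuron fires."""
        fired = drive + self.excitability >= self.threshold
        self.excitability *= self.decay
        if fired:
            self.excitability += self.boost
        return fired

stimulus_a, stimulus_b = 1.2, 0.7   # B alone is too weak to fire the neuron

naive = ExcitableNeuron()
print(naive.step(stimulus_b))       # False: B arrives with no history

primed = ExcitableNeuron()
primed.step(stimulus_a)             # A fires the neuron and raises its excitability
for _ in range(3):
    primed.step(0.0)                # a short quiet period
print(primed.step(stimulus_b))      # True: the same B now fires the neuron
```

The same input B produces different outputs depending on whether A came shortly before. If the quiet period is long enough, the extra excitability fades and B fails again.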
Another mechanism is when a neuron stores sequential information in other neurons. Active neurons excite other cells and shape a particular network activity (called the “context”). After a certain time, the context may cause the activation of some other neuron. This neuron encodes the previous history (a sequence) that shaped that context. This is the domain of recurrent neural networks in which it is extremely hard to unravel the chain of computations.
Yet another way a neuron preserves memory is via its spines, the places where axons attach to the dendrites. They also have inner parameters that can be changed to store information. One hypothesis (synaptic tagging) proposes that spines “remember” the previous activation by expressing some tag. Later, the activation of a synapse interacts with the tag, which makes the connection stronger. That is why people sometimes argue for learning on different time scales (like here). Some parameters can change quickly in response to rapid events. And some change very slowly, reflecting things happening over minutes, hours, and more.
The lesson is that neurons are highly changeable. They can change themselves (ion channels, spines, other proteins…) to record the spatial and temporal activation of the network overall.
Principle 4. Neurons learn … to survive
A neuron is a living cell, and like any other it has needs. It consumes a lot of energy to fire electrical pulses. This energy must come from somewhere. Here neurons receive help from astrocytes, supportive glial cells that live close to neurons, with around one astrocyte for every three neurons.
Astrocytes sense when the neuron becomes active and change the permeability of blood capillaries to increase the flow of oxygen and glucose. These molecules are consumed by neuronal mitochondria, which store energy in the molecule ATP. Once, I had an Aha moment: “The goal of a neuron is to become active to get nutrients. It activates and learns in order to survive!” Subsequently, I did not find any facts that support or reject this idea (that neuronal activity is the only way to get “food”). But the idea is interesting.
Another similar and verified idea is homeostasis: a neuron sustains a target level of firing. If the activity rises above or falls below the target level, the neuron launches mechanisms to return to this level. For example, a neuron can strengthen or weaken its connections (synaptic scaling) or change its threshold for firing an action potential.
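Synaptic scaling can be sketched as a feedback loop that multiplies all weights by a common factor until the firing rate returns to the target. The linear rate model, the update rule, and the numbers below are simplifying assumptions for illustration and not a model taken from the literature.

```python
def firing_rate(weights, inputs):
    """Toy rate model (assumed): the rate is the weighted sum of the input rates."""
    return sum(w * x for w, x in zip(weights, inputs))

def synaptic_scaling(weights, rate, target_rate, step=0.1):
    """Scale every weight by the same factor, nudging the rate toward the target."""
    factor = 1.0 + step * (target_rate - rate) / target_rate
    return [w * factor for w in weights]

weights = [0.5, 1.0, 2.5]
inputs = [5.0, 5.0, 5.0]
target_rate = 5.0

print(f"initial rate: {firing_rate(weights, inputs):.2f}")   # 20.00, far above target
for _ in range(200):
    rate = firing_rate(weights, inputs)
    weights = synaptic_scaling(weights, rate, target_rate)
print(f"final rate:   {firing_rate(weights, inputs):.2f}")   # 5.00
print(f"weight ratio: {weights[2] / weights[0]:.2f}")        # 5.00, unchanged
```

Because every weight is multiplied by the same factor, the relative strengths of the synapses are preserved while the overall activity returns to the target.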
An interesting idea is that, to be active just enough, neurons not only change themselves but also try to change other neurons. They compete and cooperate with each other. A neuron tries to become active and prevent its neighbors from firing, a process called “winner takes all”. Or neurons try to “help” other distant neurons by activating them (hoping they will return the favour sometime). Of course, it is just a story where neurons, like warriors on a battlefield, survive, fight, and create alliances. Nevertheless, it helps to link many experimental facts. Neurons indeed can activate local inhibitory cells that in turn decrease the activation of other local neurons (“competition”). In addition, neurons in the visual cortex create distant connections with other neurons that encode similar stimuli (“cooperation”). Still, many experiments need to be done to confirm the story. And hopefully we will answer one of the most important questions in learning: how neurons choose with whom to connect.
I wrote that a neuron tracks the combinations among 10 000 connected neurons. Actually, it learns to recognize patterns among 100 000 neurons. If two neurons are not connected but the axon of the first is close to the dendrite of the second, a connection may arise. This is structural plasticity: the creation and deletion of connections. A typical neuron in the neocortex has approximately 100 000 potential candidates with which to form a connection. What a possibility to reshape the network!
Structural plasticity is especially prominent in children, who have a lot of excess synapses. With age (until around 30 years) the connections are pruned and the most valuable remain. The speed of this process varies across brain regions: in some, like the parietal and frontal cortex, it takes longer to fully mature. But how to choose which connections to delete? One of the answers is the principle “use it or lose it”. Connections that are rarely used get eliminated. It is a highly selective procedure: one axon may climb to a particular neuron, avoiding everyone else on its way.
Many theories of synaptic plasticity explain how the strength of connections changes. Still, it is not completely clear how precise this strength might be. In artificial neural networks, the weight is a real number, for instance, 0.63737. True, neuronal spines vary in shape and size, and a neuron can change the number and effectiveness of ion channels through which the positive ions pass and excite it (ehh, I have to skip so many fascinating details). Some estimates show that a synapse can store 4.7 bits of information, so it has 2^(4.7)≈26 different states. Hence, learning is the change from one state to another. However, it is still debatable how long and how reliably the spine can remain in a given state. At the least, we can treat the spine as one bit of information: are the two neurons connected? (And a couple more bits for the location of the synapse on a dendrite.)
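The arithmetic behind the 4.7-bit figure is a one-liner: a store of b bits distinguishes 2^b states. The variable names below are just for illustration.

```python
import math

bits_per_synapse = 4.7                 # the estimate quoted above
states = 2 ** bits_per_synapse
print(f"{states:.2f} states")          # 25.99, i.e. about 26 distinguishable states

# Going the other way: how many bits are needed for 26 states?
print(f"{math.log2(26):.2f} bits")     # 4.70

# The minimal view of a synapse as "connected or not" is a single bit.
print(2 ** 1, "states")                # 2
```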
Principle 5. Neurons make errors and it’s okay
In reality, synapses are very unreliable and stochastic. Often, neurons fail to transmit the signal: neurotransmitters may be “stuck” inside the axon. Successful transmission happens only with some probability. That is why it is hard to reason about the strength of the connection. Interestingly, this may not be a weakness of evolution, which could not invent reliable wires, but rather a discovery of how to make learning more efficient. Some theories suggest that stochastic connections may actually help. Neurons in some animals can change the probability of synaptic release as a way of learning (see a good review). How common is this phenomenon in the human brain? How important is it computationally? We don’t know. New research is needed.
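A stochastic synapse is easy to simulate: each incoming spike is transmitted with some release probability and otherwise lost. The sketch below is a toy model with made-up probabilities, and the function names are hypothetical. It only shows that changing the release probability changes how much signal gets through on average, which is one way such a synapse could "learn".

```python
import random

def transmit(release_probability, rng):
    """Return True if this presynaptic spike successfully releases neurotransmitter."""
    return rng.random() < release_probability

def success_fraction(release_probability, n_spikes, rng):
    """Fraction of n_spikes presynaptic spikes that get through the synapse."""
    successes = sum(transmit(release_probability, rng) for _ in range(n_spikes))
    return successes / n_spikes

rng = random.Random(0)  # fixed seed so the run is repeatable

weak = success_fraction(0.2, 10_000, rng)    # an unreliable synapse
strong = success_fraction(0.6, 10_000, rng)  # the same synapse after "learning"
print(f"before: {weak:.2f} of spikes transmitted, after: {strong:.2f}")
```

Any single spike is still a gamble, but averaged over many spikes the synapse behaves as if its strength had increased.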
Not just synapses, but neurons can be noisy as well. Sometimes they can even die. In some brain regions the death of neurons is very bad, like in the brain stem, where neurons control breathing and blood pressure and sustain life. However, in most regions neurons might die and nothing terrible will happen.
In contrast, the loss of some transistors could damage a whole chip so that the computer stops working. Brain architecture has evolved to be robust to errors, to the failure of neurons and synapses. Information is encoded in a large population of neurons, from multiple modalities (vision, hearing…). Thus, there is no grandmother neuron, a neuron that encodes only the grandmother. Otherwise, when that neuron dies, the memory would be lost. One neuron does not encode only one stimulus but many. One memory is encoded in many neurons. If some cells make errors, no worries, the information remains. In case of massive cell death, like in a stroke, a function of the neurons might weaken, but later neuroplasticity redistributes the load among other neurons.
This is an important principle: if you build AI with brain-like neuronal networks, you must ask yourself “will my algorithm still work if I remove 10% of neurons? Or 20%?”
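Such a check is simple to run on a toy population code. In the sketch below, each stimulus is stored as a random binary activity pattern across 200 neurons, a random fraction of neurons is then "killed", and the stimulus is decoded from the survivors by nearest-pattern matching. The coding scheme, the decoder, and all sizes are assumptions chosen to keep the example short. They stand in for whatever network you are actually testing.

```python
import random

def make_patterns(n_stimuli, n_neurons, rng):
    """One random binary activity pattern per stimulus (a toy population code)."""
    return [[rng.randint(0, 1) for _ in range(n_neurons)] for _ in range(n_stimuli)]

def decode(activity, patterns, alive):
    """Index of the stored pattern closest to `activity`, using surviving neurons only."""
    def distance(pattern):
        return sum(activity[i] != pattern[i] for i in alive)
    return min(range(len(patterns)), key=lambda k: distance(patterns[k]))

def accuracy_after_ablation(patterns, fraction_removed, rng):
    """Remove a random fraction of neurons and report decoding accuracy."""
    n_neurons = len(patterns[0])
    n_alive = n_neurons - int(fraction_removed * n_neurons)
    alive = rng.sample(range(n_neurons), n_alive)
    correct = sum(decode(p, patterns, alive) == k for k, p in enumerate(patterns))
    return correct / len(patterns)

rng = random.Random(42)
patterns = make_patterns(n_stimuli=5, n_neurons=200, rng=rng)
for fraction in (0.1, 0.2, 0.5):
    acc = accuracy_after_ablation(patterns, fraction, rng)
    print(f"removed {fraction:.0%} of neurons -> accuracy {acc:.0%}")
```

Because every stimulus is spread over many neurons, the surviving cells still tell the stimuli apart. A "grandmother" code with one dedicated neuron per stimulus would lose a memory the moment its neuron was removed.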
Surely, this is not a complete list of all principles. But these five are among the most important.
Hopefully, you will remember that a neuron is a living cell. It doesn’t care about information processing and computation; it just wants to survive. Yet, it is fascinating that, while surviving, a neuron performs computation.