When you’re caught in a 3,000-year-old trap in the inner chamber of the pyramid of Ramses II, you need to discover the escape mechanism, fast, before imminent impalement by the descending ceiling of spikes. You flail around wildly, randomly pushing bricks in the hope one will slide in, running your hand over the carvings on the sarcophagus, thrusting the gold mask back on the plinth. And the ceiling suddenly stops.
You freeze, panting, and think: Did I do that?
We’ve all been there. Perhaps more often it’s the random jabbing of the phalanx of buttons on your brand-new, absurdly overspecified microwave oven, which suddenly springs into life. Whether life-saving or microwaving, to know that X caused Y, your brain has to solve two problems at the same time:
The agency problem: Was it me?
The credit assignment problem: Of all the things I just did, which one caused Y?
Solving these two problems is a fierce challenge. We take many actions, and there are many outcomes. But more things are happening around us all the time than we are causing. So, our brain needs to isolate, out of this constant stream of things, the crucial outcome Y. Then our brain needs to work out whether we caused it, despite the sensory information about that thing only turning up sometime after the actions that might have caused it. And it likely all hinges on that Swiss Army knife of brain theories: dopamine.
We have a detailed hypothesis for how neurons assign agency and give credit where credit is due. It’s built on two big ideas. First, our brains have a model for how the world works, and this is constantly making predictions about what should happen. When these predictions are wrong, this is surprising—and the event that caused surprise is thus isolated from the stream of constant, predicted events around us.
Second, our brains have a trace of the actions we just performed, and when the surprising event occurs, it can be linked to the trace memory of the most recent action(s). Then, once linked, we can repeat the action and test if we get the same outcome. If we do, then, voilà! Evidence for causality.
Enter our old friend dopamine. Now, on the face of it, dopamine might seem the worst possible way that brains could solve the problem of assigning causality of action X to outcome Y. Dopamine is released in huge quantities and in many places in the brain at once. If you wanted a way for the brain to highlight just one particular connection between a set of neurons — say, just the connection between neurons for action X and for outcome Y — then this would seem to be a terribly inefficient way to do it.
But in truth, it’s very clever. Dopamine is a broadcast signal. Rapidly and simultaneously, it tells huge regions of the brain, “Something really interesting just happened outside Brain. Which of you folks is gonna take responsibility for dealing with this?”
And what it broadcasts is surprise. Surprise is the error in the brain’s prediction about what should happen next. We have a mountain of evidence that dopamine neurons broadcast an error about the prediction of reward. When your brain is predicting that nothing rewarding will imminently happen, and then you are handed a doughnut by a total stranger, your dopamine neurons burst, briefly. They signal to the rest of the brain the surprise that something unexpectedly good happened. They are saying, “Hey, whichever of you guys caused that doughnut thing to happen, do it again!”
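The doughnut story can be put into a few lines of arithmetic. What follows is only a sketch, assuming the simplest possible learner: one number for the expected reward, nudged toward each outcome. The function names and the learning rate of 0.5 are illustrative choices, not measurements from real dopamine neurons.

```python
def prediction_error(expected_reward, actual_reward):
    """Surprise: the gap between what happened and what was predicted."""
    return actual_reward - expected_reward


def update_expectation(expected_reward, actual_reward, learning_rate=0.5):
    """Nudge the prediction toward the outcome, in proportion to the surprise."""
    error = prediction_error(expected_reward, actual_reward)
    return expected_reward + learning_rate * error


# A stranger hands you a doughnut (reward = 1.0) on five encounters in a row.
expectation = 0.0  # at first, no reward is predicted
errors = []
for _ in range(5):
    errors.append(prediction_error(expectation, 1.0))
    expectation = update_expectation(expectation, 1.0)

print(errors)  # [1.0, 0.5, 0.25, 0.125, 0.0625]
```

The first doughnut is a big surprise. By the fifth, the prediction has nearly caught up and the error has shrunk to almost nothing.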
But prediction error is not just for reward. We also know that dopamine neurons signal errors in predicting bad outcomes, the things you want to learn to avoid, like pressing a button that drops snakes into your bathroom. And they signal errors in your estimate of how much time has elapsed since an earlier event. And they signal an error between what you intended to sing and what you actually sang (yes, you have a music critic in your midbrain; didn’t you know?).
All these ways that different errors in the world cause a brief burst of dopamine can be neatly explained by the single idea that dopamine neurons signal surprise. And, crucially, this brief burst of dopamine always happens very fast after the surprising event Y. This burst carries a time stamp of something surprising happening, right now.
So your brain has detected a cool new thing that happened in the world, and dopamine is broadcasting it to the rest of the brain. Now your brain needs to discover if any of your actions caused that outcome, and then glue the outcome to the action by strengthening the link between them, and only them.
To do that, your brain needs to find the representation of the action(s) that occurred before the representation of the outcome. Causality does, after all, seem to have only one direction. It seems unlikely that the special hopping-on-one-leg-while-twirling-a-chicken dance you did after the light came on caused the light to come on. Rather, it was probably the light switch you flicked while coming into the room (with your chicken-free hand, obviously).
That brief broadcast of dopamine across the brain is essentially searching for a trace of actions that occurred in the very recent past. When a neuron’s electrical pulse goes shooting down its axon cable onward to convey its message to all its target neurons, it triggers a long-lasting process inside the neuron, where various molecules slowly change in concentration — especially calcium. Moreover, activity at any of the incoming connections to that neuron also leaves a trace of calcium behind at that connection, tagging that input as potentially important for making the neuron active.
Now dopamine turns up at the connection between two neurons. Say one neuron drove the action that caused something, and the input to that neuron from another neuron says, “I was here.” The connection between those two neurons would then encode the information: “Do this action when I am here.” If the “I was here” neuron had just caused the “do this action” neuron to fire, then the action neuron will have those traces of calcium inside it — one indicating that the connection was active and one that the action neuron fired. Dopamine allows the connection between these two neurons to increase in strength only if these traces exist. So the idea to “do this action when I am here” is strengthened if and only if the two neurons were active at the right time.
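One way to picture this "if and only if" is as a product of three numbers: the trace at the connection, the trace in the action neuron, and the dopamine burst. The sketch below assumes that multiplication, an exponentially fading trace, and a one-second fade time. All three are simplifying assumptions for illustration, and the function names are hypothetical.

```python
import math


def trace_at(seconds_since_activity, decay_seconds=1.0):
    """Calcium-like trace: 1.0 at the moment of activity, fading afterward."""
    return math.exp(-seconds_since_activity / decay_seconds)


def weight_change(input_trace, action_trace, dopamine, learning_rate=0.1):
    """Change in connection strength; zero if any one of the three is zero."""
    return learning_rate * dopamine * input_trace * action_trace


# Both neurons fired a moment before the dopamine burst arrived.
recent = weight_change(trace_at(0.2), trace_at(0.1), dopamine=1.0)
# Both neurons fired about five seconds ago; the traces have nearly faded.
stale = weight_change(trace_at(5.0), trace_at(4.9), dopamine=1.0)
# Neither neuron fired, so there is no trace for dopamine to find.
silent = weight_change(0.0, 0.0, dopamine=1.0)
# Fresh traces, but nothing surprising happened, so no dopamine burst.
no_surprise = weight_change(trace_at(0.2), trace_at(0.1), dopamine=0.0)

print("recent:", recent)
print("stale:", stale)
print("silent:", silent)
print("no surprise:", no_surprise)
```

Only the connection with fresh traces and a dopamine burst changes by any meaningful amount. Everywhere else the broadcast falls on deaf synapses.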
Even more remarkably, it seems the brain builds causality right into the rules for how each individual connection between neurons should change. A connection from neuron A to neuron B seems to keep a tally of the order in which neuron A and neuron B fire. If neuron A fires shortly before neuron B, then it could logically have caused neuron B to fire. That connection is marked for possibly increasing in strength.
But if neuron A fires shortly after neuron B, then it could not possibly have caused neuron B to fire. That connection is marked for possibly decreasing in strength, as, if anything, the firing of A is interfering with neuron B. If neuron A fires a long time before or after neuron B, then the connection does not change in strength. Indeed, the rules of changing connection strengths seem to be built specifically for learning causality.
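The timing rule in the last two paragraphs can be written as a single function of the gap between the two neurons firing. This is a sketch under assumptions: the exponential shape, the 20-millisecond window, and the choice to leave the connection alone when both fire at the same instant are all illustrative, and `timing_mark` is a hypothetical name.

```python
import math


def timing_mark(b_minus_a_ms, window_ms=20.0):
    """Mark on the A-to-B connection, given when B fired relative to A.

    Positive input: A fired first, so the mark is for strengthening.
    Negative input: A fired after B, so the mark is for weakening.
    A long gap in either direction gives a mark close to zero.
    """
    if b_minus_a_ms > 0:
        return math.exp(-b_minus_a_ms / window_ms)
    if b_minus_a_ms < 0:
        return -math.exp(b_minus_a_ms / window_ms)
    return 0.0  # simultaneous firing: assumed to leave the connection alone


for gap_ms in (5, -5, 500, -500):
    print(gap_ms, timing_mark(gap_ms))
```

A gap of 5 milliseconds gives a strong mark, positive or negative depending on the order. A gap of half a second leaves next to nothing.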
And so the brain solves the credit assignment problem. It finds the action X that caused outcome Y by broadcasting a time-stamped signal that something surprising just happened outside Brain, and that broadcast signal only has an effect where it finds traces that an action neuron had just been made active in the right place. Now, when you’re in that place again, the neurons for action X are more likely to be active — you’re more likely to do action X again. Thus, we discover whether X does indeed cause Y. And we have a better model for predicting the world.
The agency problem is now simple to solve. How does the brain know it wasn’t you? It knows when the dopamine surprise signal turns up and finds no trace of activity in the action neurons. No trace of activity means “I did not cause this.”
(Okay, so there might be the odd occasion where, by chance, action neurons are active just before an outcome but did not cause it. That is why the action needs repeating: If action X is intentionally repeated and does not cause outcome Y, then there is no evidence of a link between them.)
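Repeating the action amounts to a small experiment, and the bookkeeping can be sketched in code. The version below assumes one extra step the text does not spell out: comparing how often Y follows X against how often Y happens anyway. The function names and the made-up trial records are hypothetical.

```python
def outcome_rate(trials):
    """Fraction of trials in which outcome Y occurred."""
    return sum(trials) / len(trials)


def evidence_for_link(y_after_action, y_without_action):
    """How much more often Y follows action X than it happens anyway."""
    return outcome_rate(y_after_action) - outcome_rate(y_without_action)


# Flicking the switch: the light comes on every time, and never otherwise.
switch = evidence_for_link([True] * 5, [False] * 5)

# The chicken dance: the light came on once, but it does that by chance
# just as often when nobody is dancing.
dance = evidence_for_link(
    [True, False, False, False, False],
    [False, False, True, False, False],
)

print(switch)  # 1.0
print(dance)   # 0.0
```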
Working out how the brain learns causality is both at the forefront of modern neuroscience and a submarine subject. Largely submerged, elements of a theory of causality are spread throughout the literature but are rarely named as such. Which means it is an area ripe for exploration, rich in potential, full of unanswered questions. Let’s take one unanswered question as an example: What about wanting to use this newfound information in the future?
The learning of causality is based on the idea that we carry around a predictive model of the world in our brain. But if we do, then we likely also carry around the inverse model: a model of how to change the world. We need to be able to say, “I want outcome Y,” and use the inverse model to look up the “action X” that gets us the outcome.
This means we need to update two models: a predictive model (“the world should be in this state”) and the inverse model (“to get the world as I want it, I need to do X”). It seems likely that dopamine teaches both models. But where? And are they updated at the same time? We have no idea. How many different models the brain makes about the world, how they interact, and how they are taught are deep questions with no answers.
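To make the two models concrete, picture them as a pair of lookup tables. This is a toy sketch, not a claim about the brain: it assumes both tables are updated together from the same confirmed link, which is exactly the part nobody knows. The names are hypothetical.

```python
def learn(predictive_model, inverse_model, action, outcome):
    """Record one confirmed action-outcome link in both models."""
    predictive_model[action] = outcome                     # "if I do X, expect Y"
    inverse_model.setdefault(outcome, []).append(action)   # "to get Y, do X"


predictive_model = {}
inverse_model = {}
learn(predictive_model, inverse_model, "flick switch", "light on")
learn(predictive_model, inverse_model, "press start", "microwave on")

print(predictive_model["flick switch"])  # light on
print(inverse_model["light on"])         # ['flick switch']
```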
Learning causality by trial and error is seen across species, mammal and bird alike, each painstakingly knitting together the sequence of events: that Y follows if I do X. Some species can learn causality by mimicry. Blue tits can learn to open milk-bottle tops by observing other blue tits doing it. (Seriously, don’t piss off blue tits.)
But the human advantage is language. Language means we are no longer bound to painstaking observations of local, personal chains of events. We can explain cause and effect in language and pass them on in abstract forms: books, magazines, TV documentaries — and 300-part YouTube videos on rebuilding a V8 engine. We can write down our observations and leave gaps where we are missing steps in the chain between X and Y (we call this “science”). We can share information and find causality in larger samples on larger scales than an individual could ever hope to learn in a lifetime.
The mere fact that we know about the causality of extinctions or of climate change is a testament to our ability to learn beyond the local, personal effects of our actions. Uniquely, our brains can learn not just “I caused this,” but also “we caused this.”