The Link Between Sleep and Deep Learning

How long can a person go without sleep? The world record is apparently 11 days. However, when Randy Gardner set that record in 1964, he may have stayed awake the entire time, but he was basically ‘cognitively dysfunctional’. Had Gardner gone beyond two weeks, he would likely have died.

Some animals appear to be awake all the time. Whales and dolphins (i.e. cetaceans) need to remain awake because they must periodically come up to the surface to breathe. Cetaceans stay awake by allowing only one half of their brain to sleep at any one time.

The cognitive purpose of sleep is an open question. Recently, however, scientists have proposed a new conjecture (“How memory replay in sleep boosts creative problem solving”) about the purpose of two important phases of sleep.


According to the research, the brain goes through several 90-minute cycles of REM and non-REM sleep. Non-REM sleep involves the sequential replay of acquired memories. In contrast, REM sleep involves a more random association game across disparate memories. In deep learning, these are analogous to the search strategies of optimization and exploration, respectively. That is, we reinforce our memories during non-REM sleep and we imagine novel associations during REM sleep. Our brains alternate between optimization and exploration while we sleep. In effect, we are learning while we are sleeping.

It wouldn’t be such a waste if we were learning while sleeping.

Coincidentally, it has been suggested that naps of less than 45 minutes enhance “creative thinking”. These naps allow REM sleep to kick in and avoid the slow-wave sleep that can lead to grogginess and disorientation (See: “Napping: The Experts Guide”). The hack is to drink a cup of coffee before napping so that the caffeine kicks in at the 45-minute mark.

This idea that learning happens while we aren’t awake may come to many as a complete surprise. Yet learning through sleep is a common experience for many who have trained in music. There comes a point where skills don’t improve during a day of practice, but only by the next day. In a Northwestern study (“Learn that tune in your sleep”), playing a musical tune during slow-wave sleep enhanced memorization of that tune. In a German study (“Boosting Vocabulary Learning by Verbal Cueing During Sleep”), subjects improved at German-Dutch translation after being exposed to the vocabulary during non-REM sleep. This “sleep learning” may be a key idea in unravelling the relatively poor generalization capabilities of existing deep learning architectures.

David Ha and Jürgen Schmidhuber have a recent paper, “World Models”, where they describe a system that learns by “dreaming” or “hallucinating” about previously acquired memories. In the paper, the authors describe learning a compact representation of the world (via an autoencoder) and then recreating this compressed environment to improve the policy function that drives future behavior. One takeaway of this paper is the effectiveness of learning behavior from rough approximations of previously acquired observations.
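
To make the idea of “learning inside the dream” concrete, here is a minimal sketch in Python. It is not the architecture from the paper: there is no autoencoder and no recurrent network, just a one-parameter dynamics model fitted from random experience, followed by a random search over controllers that is scored only inside that learned model. The toy environment, `TRUE_GAIN`, and every function name are assumptions made up for illustration.

```python
import random

TRUE_GAIN = 0.5  # hypothetical real-world dynamics, hidden from the agent


def real_step(x, a):
    """Real environment (assumed toy): position x moves by TRUE_GAIN * a."""
    return x + TRUE_GAIN * a


def collect(n, rng):
    """Awake phase: gather (x, a, x_next) transitions using random actions."""
    data = []
    for _ in range(n):
        x, a = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append((x, a, real_step(x, a)))
    return data


def fit_model(data):
    """Least-squares fit of b in the one-parameter model x_next = x + b * a."""
    num = sum(a * (x_next - x) for x, a, x_next in data)
    den = sum(a * a for _, a, _ in data)
    return num / den


def rollout_cost(k, step_fn, x0=1.0, horizon=20):
    """Sum of |x| along a rollout under the linear controller a = -k * x."""
    x, cost = x0, 0.0
    for _ in range(horizon):
        x = step_fn(x, -k * x)
        cost += abs(x)
    return cost


def dream_train(b, rng, candidates=200):
    """Sleep phase: random search over k, scored only in the learned model."""
    def dream_step(x, a):
        return x + b * a

    ks = [rng.uniform(0, 3) for _ in range(candidates)]
    return min(ks, key=lambda k: rollout_cost(k, dream_step))


if __name__ == "__main__":
    rng = random.Random(0)
    b = fit_model(collect(50, rng))      # learn the world from experience
    k = dream_train(b, rng)              # improve the policy while "asleep"
    print("learned b:", b)
    print("cost in the real world:", rollout_cost(k, real_step))
```

The controller never touches `real_step` during training; the real environment is only used to gather the initial experience and to check the result afterwards.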

Reinforcement learning has been difficult to scale because sampling is expensive. In other words, a learning agent has to interact with the world (i.e. the environment) a multitude of times to gain enough reinforcement to learn. However, if an agent is able to create a ‘mental model’ or a ‘simulation’ of the world, then it should be able to do this sampling in a much more cost-effective manner. To implement this, one needs either a replay memory, a good generative model, or both (please search arXiv and let me know if you’ve uncovered the last kind).
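
One classic way to trade expensive real samples for cheap imagined ones is a Dyna-Q style loop, sketched below in Python. This is a swapped-in textbook technique, not something taken from the papers above. The chain world, the hyperparameters, and all names are assumptions for illustration; the `model` dictionary is a replay memory that doubles as a deterministic model of the world.

```python
import random
from collections import defaultdict

N_STATES = 6        # hypothetical chain world: states 0..5, goal at state 5
ACTIONS = (-1, +1)  # step left or step right


def step(state, action):
    """Real environment: the expensive thing to sample from."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def update(q, s, a, r, s2, done, alpha, gamma):
    """One Q-learning backup toward the target r + gamma * max_b Q(s2, b)."""
    target = r if done else r + gamma * max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += alpha * (target - q[(s, a)])


def greedy(q, state, rng):
    """Pick a highest-valued action, breaking ties at random."""
    best = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])


def dyna_q(episodes=50, planning_steps=20, alpha=0.5, gamma=0.9,
           eps=0.1, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)
    model = {}      # (s, a) -> (s_next, reward, done), remembered experience
    real_steps = 0
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < eps:
                action = rng.choice(ACTIONS)
            else:
                action = greedy(q, state, rng)
            nxt, reward, done = step(state, action)   # awake: real sample
            real_steps += 1
            update(q, state, action, reward, nxt, done, alpha, gamma)
            model[(state, action)] = (nxt, reward, done)
            for _ in range(planning_steps):           # asleep: replayed samples
                (s, a), (s2, r, d) = rng.choice(list(model.items()))
                update(q, s, a, r, s2, d, alpha, gamma)
            state = nxt
    return q, real_steps


if __name__ == "__main__":
    q, real_steps = dyna_q()
    print("real environment steps used:", real_steps)
    print("greedy policy:", [max(ACTIONS, key=lambda a: q[(s, a)])
                             for s in range(N_STATES - 1)])
```

Running `dyna_q(planning_steps=0)` turns the sketch into plain Q-learning, so comparing the two `real_steps` counts is one rough way to see how much real-world sampling the replayed experience saves.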

Ultimately, one can look at intelligence as more than just the ability to compress models of the world and make predictions. Rather, the metric of intelligence should be ‘sampling efficiency’ and not compression alone. The problem with relying only on the idea of compression is that it brushes away the requirement for automata at higher levels of the Chomsky hierarchy. That is, one can create good compression algorithms with just finite automata. However, intelligence likely requires a Turing machine (see: Universality), and models that are good representations of the world also require Turing machines to construct and interpret (thanks to Hector Zenil for this insight).

The newest developments in the field of deep learning are now beginning to explore the creation of internal ‘imagined worlds’. This is in line with my earlier claim that embodied learning is essential to artificial intelligence. However, the only real way of scaling both ‘sense making’ and ‘sense breaking’ is to reserve ample time during the day to sleep on it. A blind spot in conventional reinforcement learning is the false assumption that biological systems can learn only when they are awake.

Further Reading

Explore Deep Learning: Artificial Intuition: The Improbable Deep Learning Revolution


Exploit Deep Learning: The Deep Learning AI Playbook