Learning Objectives for General AI

Derek Hoiem · Vision of Seeing · Jan 2, 2020

Despite fantastic progress in machine learning methods for specific tasks, more general AI systems that conform to our intuition of intelligence still evade our grasp. As we start 2020, long anticipated to be the year of flying cars and robot monkey butlers, perhaps a little speculation is in order on what learning objectives could yield more general AI.

Maximum Boredom. I mean, I hope you aren’t bored yet, but that’s the learning objective: the ability to integrate new information with minimal change to the current state. Minimize surprise and novelty. For my two-month-old baby, that means she tries to maintain a stable representation of her environment as she looks around or is carried. When she hears a sound, ideally it will not astonish or intrigue but will have been anticipated, in some sense, from the information gathered so far and, thus, will cause only a small change to the representation. To the fresh mind, everything is novel, interesting, surprising, and funny. To the well-learned agent, most events are predictable and, well, boring. “Nothing is new under the sun,” said the wisest man ever. It may not be fun to be bored, but it means you have mastered your environment.

Learning to maximize boredom, i.e., to minimize the information gain from each new event, is more general than label prediction. In an image labeling task, the system “observes” a random image and outputs probabilities of image or pixel labels. Then it sees a different random image. The system’s life consists of many very short episodes, and while it never really understands what is going on, it does learn to anticipate the supervised labels. Such a system has no opportunity to maximize boredom because it experiences no continuity: each new input is from a brand new environment.

Maximizing boredom is also different from image generation. On the face of it, generating an image, e.g. from a caption or description or from purely random numbers, seems like a vision task that should lead to general representations, because a perfect generator would seemingly need to encode, well, everything. The problem is that images can vary substantially in uninteresting ways, or vary slightly in ways that significantly change the meaning. In maximizing boredom, the goal is not to predict exactly what will be sensed next, but to minimize the change to the internal state when the next input arrives.
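
To make that concrete, here is a minimal sketch of one way the objective might be implemented, assuming a recurrent internal state updated by a stream of observations. All the names here (BoredomAgent, the GRU-based update, the dimensions) are illustrative choices, not a prescribed architecture; the point is only that the loss penalizes how much each observation moves the state, not how well pixels are predicted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BoredomAgent(nn.Module):
    """Sketch of a boredom objective: keep a recurrent internal state
    and penalize how much each new observation moves that state."""

    def __init__(self, obs_dim: int, state_dim: int):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, state_dim)    # embed the raw observation
        self.update = nn.GRUCell(state_dim, state_dim)  # integrate it into the state

    def forward(self, obs_seq: torch.Tensor):
        # obs_seq: (time, batch, obs_dim)
        h = obs_seq.new_zeros(obs_seq.size(1), self.update.hidden_size)
        boredom_loss = obs_seq.new_zeros(())
        for obs in obs_seq:
            h_new = self.update(self.encoder(obs), h)
            # The objective itself: a new input should barely change the state.
            boredom_loss = boredom_loss + F.mse_loss(h_new, h)
            h = h_new
        return h, boredom_loss / obs_seq.size(0)
```

As written, this is trivially minimized by a state that never moves, which is exactly the failure mode discussed below; it only becomes interesting alongside competing objectives.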

Maximizing boredom can drive supervised learning (anticipate label from image), self-supervised learning, cross-modal learning (e.g. information produced by sound, vision, and touch should be consistent), and learning from time series.
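
For the cross-modal case, a hypothetical one-line variant of the same idea: states inferred from different senses of the same moment should be mutually “boring,” i.e., close to each other.

```python
import torch.nn.functional as F

def cross_modal_boredom(state_from_vision, state_from_audio):
    # Hypothetical cross-modal term: the state inferred from one modality
    # should be unsurprising given the state inferred from another.
    return F.mse_loss(state_from_vision, state_from_audio)
```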

Even if maximizing boredom is an important objective for general learning, the question remains: what is the internal state? The objective could be trivialized by too limited a state, one that never changes because it doesn’t encode anything. A rock could witness a host of angels proclaiming the end of the universe without so much as a blink of surprise. A trivial solution is avoided if the representation also has other goals, for example manipulating caretakers into keeping you alive and zeroing in on that bulls-eye milk source. Or a competing objective that tries not to be bored, as in curiosity-driven learning.

Learn to update a stable internal representation that is “bored” by subsequent events, while also learning to stimulate surprising events: the basis of general learning?
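
Here is a minimal sketch of that two-player setup, reusing the BoredomAgent pieces above; the policy and env interfaces are hypothetical stand-ins, not any particular library’s API.

```python
import torch.nn.functional as F

def training_step(world_model, policy, env, h):
    # The curious explorer acts to provoke surprise...
    action = policy(h)
    obs = env.step(action)  # hypothetical environment interface
    # ...while the world model tries to stay bored.
    h_new = world_model.update(world_model.encoder(obs), h)
    surprise = F.mse_loss(h_new, h)       # proxy for information gain
    world_model_loss = surprise           # minimized: learn to be bored
    explorer_reward = surprise.detach()   # maximized: seek out surprise
    return h_new, world_model_loss, explorer_reward
```

The same scalar drives both players in opposite directions, which is what keeps the representation from collapsing into a rock.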

More broadly, learning for a general AI requires the right objectives, the right environment, and the right process:

  1. Learn to be bored: learn to produce a representation of the current environment (not of the inputs!) that is stable under multimodal inputs over time. Ideally, an agent should acquire most of the important information about its environment from a few glances, while further exploration and audio/visual/tactile input yields only small refinements to that representation. This objective can’t work on its own. It needs the curiosity and task objectives below to prevent trivial solutions, such as achieving boredom by never acting to change the inputs, or by not representing anything at all.
  2. Act curious: learn to perform actions that challenge the boredom objective by experimenting with the environment.
  3. Perform tasks: learn representations of the environment that enable completing tasks, such as locomotion, identification, making people smile, etc.
  4. Learn in a rich and continuous environment: without being part of a world that evolves and allows for action and interaction, it’s hard to imagine that an agent can use boredom, curiosity, and a broad variety of tasks to drive its learning.
  5. Learn in stages: babies learn in stages. The Montessori approach shows that mastering tasks in a progression leads to better efficiency and better local optima (my own way of phrasing it) than trying to learn complex tasks too early. Perhaps boredom itself can signal when it’s time to move to the next stage (a small sketch of this follows the list).
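
As the sketch promised in point 5, boredom itself could gate the curriculum; the threshold and window below are arbitrary placeholders, not tuned values.

```python
def maybe_advance_stage(boredom_history, threshold=1e-3, window=100):
    """Hypothetical curriculum gate: advance once the agent has been
    reliably bored, i.e., its recent boredom loss (information gain)
    has stayed below a threshold for a full window of steps."""
    recent = boredom_history[-window:]
    return len(recent) == window and max(recent) < threshold
```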

I’m interested in your thoughts. What are the learning principles that you think are missing in current practice? Do you think the principles (so far as we understand them) for human learning are important for acquiring human-like intelligence, or just part of our limitations as an organism?

Derek Hoiem is a Professor at the University of Illinois at Urbana-Champaign and Chief Science Officer of Reconstruct.