Learning in Badger experts improves with episodic memory

GoodAI
Published in GoodAI Blog
3 min read · Oct 23, 2020

By Simon Andersson

Summary

  • AI agents need to use past experience to select the right actions
  • Simultaneously learning to remember the past and to use the experience might overwhelm current recurrent neural network (RNN) architectures
  • Our distributed Badger architecture, where each agent is composed of interacting experts, faces the additional challenge of learning to communicate internally inside the agent
  • Augmenting experts with episodic memory, dedicated to recording observations and internal states, is a possible way to make learning more tractable
  • Our experiments suggest that episodic memory can improve accuracy, sample efficiency and learning stability in single- and multi-agent settings

Badger architecture and memory

AI agents often operate in partially observable environments, where only part of the environment state is visible at any given time. An agent in such an environment needs memory to compute effective actions from the history of its actions and observations. The agent, then, is faced with the difficult problem of simultaneously learning to maintain a representation of its history and to compute the right actions from it.

In the Badger architecture, an agent consists of many smaller communicating agents, called experts. This adds an additional component to the learning problem: experts need to learn to communicate while also learning to remember and act.

In most agent architectures, memory is maintained in the activations of a recurrent neural network (RNN), whose weights need to be learned. Training RNNs presents an additional challenge: the activations of a recurrent network are produced by repeated application of its weight matrix, so the error signal backpropagated through time can grow or decay exponentially. This is the familiar problem of exploding or vanishing gradients.
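The effect is easy to see in a toy calculation (purely illustrative, not part of the Badger code): backpropagating through a linear recurrence multiplies the gradient by the recurrent weight matrix at every step, so its norm scales roughly like the largest singular value of the matrix raised to the sequence length.

```python
import numpy as np

def gradient_norm_after(W, steps):
    """Norm of a unit gradient after `steps` backward multiplications by W.T."""
    g = np.ones(W.shape[0]) / np.sqrt(W.shape[0])  # unit-norm starting gradient
    for _ in range(steps):
        g = W.T @ g
    return np.linalg.norm(g)

rng = np.random.default_rng(0)
n, steps = 32, 50

# An orthogonal matrix scaled by s has all singular values equal to s,
# so the gradient norm after `steps` multiplications is exactly s**steps.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))

shrunk = gradient_norm_after(0.5 * Q, steps)  # singular values < 1: vanishing
grown = gradient_norm_after(2.0 * Q, steps)   # singular values > 1: exploding

print(f"shrunk: {shrunk:.2e}, grown: {grown:.2e}")
```

With 50 time steps the gradient norm is already many orders of magnitude from 1 in both directions, which is why long histories are hard to learn with plain recurrence.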

We implemented an episodic memory neural network model and found that the memory can improve results on sequential classification (single expert) and a decision task with an agent composed of multiple experts. In the latter experiment, each expert was extended with its own episodic memory.

It is possible that current RNN architectures, such as long short-term memory (LSTM) and gated recurrent units (GRU), are insufficient for providing the memory for experts. Can we expand the learning capabilities of experts by equipping them with better memory? We are doing some experiments to find out.

A promising idea is to extend experts with episodic memory, allowing them to store and recall past memory states. In meta-learning and similar settings, where an outer loop trains an agent to perform learning or inference over the multiple time steps of an inner loop, the memory may help speed up convergence both in the inner loop (since observations, once encountered, are retained in memory) and in the outer loop (since the inner-loop task is now easier).
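As a minimal sketch of the store-and-recall idea (an assumption-laden illustration, not GoodAI's actual implementation; the class and method names here are hypothetical), an episodic memory can be a buffer of (key, value) pairs written at each step, with recall implemented as a softmax-weighted lookup over stored keys:

```python
import numpy as np

class EpisodicMemory:
    """Toy episodic memory: store (key, value) pairs from past steps and
    recall values by a soft nearest-neighbour lookup over the keys."""

    def __init__(self):
        self.keys, self.values = [], []

    def write(self, key, value):
        self.keys.append(np.asarray(key, dtype=float))
        self.values.append(np.asarray(value, dtype=float))

    def read(self, query, temperature=1.0):
        """Return a similarity-weighted average of the stored values."""
        K = np.stack(self.keys)                          # (N, key_dim)
        scores = K @ np.asarray(query, dtype=float) / temperature
        weights = np.exp(scores - scores.max())          # stable softmax
        weights /= weights.sum()
        return weights @ np.stack(self.values)           # (value_dim,)

mem = EpisodicMemory()
mem.write([1.0, 0.0], [10.0])  # observation-like key -> state-like value
mem.write([0.0, 1.0], [20.0])

# A query close to the first key mostly recalls the first value.
out = mem.read([1.0, 0.0], temperature=0.1)
print(out)  # close to 10.0
```

Because a written observation can be recalled directly in later steps, the recurrent network no longer has to squeeze the whole history into its activations, which is the intuition behind the speed-up in the inner loop.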

Read the full article here.

Collaborate with us

If you are interested in Badger Architecture and the work GoodAI does and would like to collaborate, check out our GoodAI Grants opportunities or our Jobs page for open positions!

For the latest from our blog, sign up for our newsletter.

Originally published at https://www.goodai.com on October 23, 2020.
