

Work: Citrix Converged Infrastructure & Cloud PM/PMM | Workspace Cloud | Play: Charlie Parker | Giants | Warriors (for now)
In order for AI systems to generalise in diverse real-world environments just as we do, they must be able to continually learn new tasks and remember how to perform all of them into the future. However, traditional neural networks are typically incapable of such sequential task learning without forgetting. This shortcoming is termed catastrophic forgetting. It occurs because the weights in a network that are important to solve for task A are changed when the network is subsequently trained to solve for task B.
In contrast to discriminative models that are used for classification or regression tasks, generative models learn a probability distribution over training examples. By sampling from this high-dimensional distribution, generative models output new examples that are similar to the training data. This means, for example, that a generative model trained on real images of faces can output new synthetic images of similar faces. For more details on how these models work, see Ian Goodfellow’s awesome NIPS 2016 tutorial write up. The architecture he introduced, generative adversarial networks (GANs), are particularly hot right now in the research world because they offer a path towards unsupervised learning. With GANs, there are two neural networks: a generator, which takes random noise as input and is tasked with synthesising content (e.g. an image), and a discriminator, which has learned what real images look like and is tasked with identifying whether images created by the generator are real or fake. Adversarial training can be thought of as a game where the generator must iteratively learn how to create images from noise such that the discriminator can no longer distinguish generated images from real ones. This framework is being extended to many data m…