Beyond machine vision: how can AI generate realistic images?

Center for Data Science Prof. Kyle Cranmer talks GAN and VAE

NYU Center for Data Science
2 min read · Mar 24, 2017


Figure: AI-generated images of galaxies (left; lower of each pair) and volcanoes (right). Left: figure by S. Ravanbakhsh, data from arxiv.org/abs/1609.05796; right: Nguyen et al., arxiv.org/abs/1612.00005

There are currently two main approaches to generating images with artificial intelligence: generative adversarial networks (GANs) and variational autoencoders (VAEs).

A GAN pits two neural networks against each other to improve the photorealism of the images they generate. The generator produces fake images, while the discriminator tries to tell the fakes from real ones. The two train together: the discriminator's feedback on how the fakes differ from real images tells the generator how to produce more convincing ones. Ideally, as training continues, the generator's images get better and better until the discriminator can no longer tell the difference. GANs also benefit researchers in terms of efficiency: they require only hundreds of training images, whereas image-recognition networks require tens of thousands, according to the technique's creator Ian Goodfellow, a computer scientist at OpenAI in San Francisco.
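The adversarial loop described above can be sketched in a few dozen lines. This is a deliberately minimal toy, not the architecture from the research above: both "networks" are one-parameter linear models, the data is one-dimensional (samples from a Gaussian), and the gradients are written out by hand rather than computed by a deep-learning framework. The structure of the loop, though, is the real thing: a discriminator step that pushes D(real) toward 1 and D(fake) toward 0, then a generator step that pushes D(fake) toward 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "real" data: samples from N(4, 0.5) -- the distribution to mimic.
def real_batch(n):
    return rng.normal(4.0, 0.5, size=n)

# Generator: a single linear map from noise z ~ N(0,1) to a sample x = w_g*z + b_g.
w_g, b_g = 1.0, 0.0
# Discriminator: logistic regression, D(x) = sigmoid(w_d*x + b_d).
w_d, b_d = 0.1, 0.0

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-np.clip(t, -30.0, 30.0)))

lr, batch = 0.05, 64

for step in range(2000):
    # --- Discriminator update: push D(real) -> 1 and D(fake) -> 0. ---
    x_real = real_batch(batch)
    z = rng.normal(size=batch)
    x_fake = w_g * z + b_g
    d_real = sigmoid(w_d * x_real + b_d)
    d_fake = sigmoid(w_d * x_fake + b_d)
    # For binary cross-entropy, the gradient w.r.t. the logit is (D - label).
    g_logit = np.concatenate([d_real - 1.0, d_fake - 0.0])
    x_all = np.concatenate([x_real, x_fake])
    w_d -= lr * np.mean(g_logit * x_all)
    b_d -= lr * np.mean(g_logit)

    # --- Generator update: push D(fake) -> 1 (non-saturating loss -log D(fake)). ---
    z = rng.normal(size=batch)
    x_fake = w_g * z + b_g
    d_fake = sigmoid(w_d * x_fake + b_d)
    # Chain rule through the discriminator: dL/dx = (D - 1) * w_d.
    g_x = (d_fake - 1.0) * w_d
    w_g -= lr * np.mean(g_x * z)   # dx/dw_g = z
    b_g -= lr * np.mean(g_x)       # dx/db_g = 1

# After training, generated samples should cluster near the real mean of 4.
samples = w_g * rng.normal(size=5000) + b_g
print(round(samples.mean(), 2))
```

Even in this stripped-down form, the dynamics are visibly adversarial: the generator's output drifts toward the real data only because the discriminator keeps re-learning how to tell them apart.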

VAEs take a different approach to image generation: their strong suit is the diversity of the images they create, albeit at lower quality. Some researchers have combined VAEs and GANs into a hybrid with the goal of creating an improved generative model.
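The article doesn't spell out how a VAE works, so here is an illustrative sketch under stated assumptions: the "encoder" and "decoder" are untrained random linear maps on toy 2-D data (a real VAE uses deep networks and learns these weights). What the sketch does show faithfully are the two ideas that define a VAE: the reparameterization trick, which lets a random sample z be drawn in a way gradients can flow through, and the ELBO objective, which balances reconstruction quality against a KL term that keeps the latent codes close to a standard Gaussian.

```python
import numpy as np

rng = np.random.default_rng(1)

x = rng.normal(size=(8, 2))  # a small batch of 2-D "data" points

# Encoder maps x to the parameters of q(z|x) = N(mu, diag(exp(logvar))).
# These weights are random placeholders standing in for a trained network.
W_mu = rng.normal(size=(2, 1))
W_lv = rng.normal(size=(2, 1)) * 0.1
mu, logvar = x @ W_mu, x @ W_lv

# Reparameterization trick: sample z = mu + sigma * eps with eps ~ N(0, I),
# so z is a deterministic, differentiable function of (mu, logvar).
eps = rng.normal(size=mu.shape)
z = mu + np.exp(0.5 * logvar) * eps

# Decoder maps z back to data space; reconstruction error is squared distance.
W_dec = rng.normal(size=(1, 2))
x_hat = z @ W_dec
recon = np.sum((x - x_hat) ** 2, axis=1)

# KL(q(z|x) || N(0, I)) has a closed form for diagonal Gaussians.
kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1)

# Training maximizes the ELBO: -(reconstruction error) - KL, averaged over the batch.
elbo = -(recon + kl).mean()
print(np.isfinite(elbo))
```

In a real VAE, gradient descent on -ELBO trains both networks at once; the KL term is what makes the latent space smooth enough that sampling new z values yields diverse, plausible images.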

These generative models have come a long way. As the Center's Professor Kyle Cranmer recently explained in an article for Nature, a gap exists between the theory and the practical engineering of neural networks. Neural networks produce sound results, but how they do so is still a "black box." Early neural networks simply mapped an input to a prediction. Now, generative networks produce images of dogs, cats, or galaxies that look real, suggesting they have learned something genuine about the structure of the world they depict, which eases some of the apprehension scientists have had about their "black box" nature.

The photorealistic images that generative models create are immensely beneficial for scientists and researchers who need to perform image reconstruction, or to fill in deformities in images through simulation. The applications of this technology, as Cranmer remarks, are "pretty endless." More broadly, generative networks can also help train image-recognition software, since a neural network only learns to recognize and classify data through training on a data set that tunes its virtual neurons.

by Nayla Al-Mamlouk

Originally published at cds.nyu.edu on March 24, 2017.
