Art & Robots

Data to AI Lab | MIT
5 min read · Aug 5, 2020

Generative adversarial networks and their creative potential.

by Peter Suechting

Initial appraisals of the portrait Edmond de Belamy placed its worth somewhere between $7,000 and $10,000. Indeed, the painting itself is nothing special. In muddy swatches of brown, black, white, and gray, it depicts a rather blurry man staring bleakly and confusedly out of the frame. But on October 25th, 2018, when the dust settled in Christie’s New York auction house, the portrait had sold for $432,500.

The art in artificial

What drove such a frenzy of bidding among the participants that afternoon on the Christie’s auction floor? The fact that it had been “painted” by a machine. The Belamy portrait series is the work of Obvious, a collective of three French artists and programmers, who produced it by training a generative adversarial network, or GAN, on a dataset of 15,000 European portraits.¹ The output? Eleven unique portraits, including Edmond de Belamy, with features informed by those of the training set.

Image courtesy of Christie’s.

The project portrays AI artwork as potentially of equal merit to human-produced art. After all, Obvious essentially printed their favorite of the algorithm’s outputs on an inkjet printer, framed it in gold, and signed it with a fragment of algorithmic code. There’s nothing special about the object itself; it could be printed out and framed a thousand more times. By this logic, the bidding war over the Belamy portrait appears to be about owning a piece of history in the making: the moment machines became artists themselves.

Machines create, but can machines be creative? This is the question prompted by the emergence of GANs. For programmers like Obvious, GANs seem to represent a powerful means to purposefully intervene in the discourse about authenticity. As such, the Belamy project is ultimately part of a broader swell of debate around the definition of creativity, and whether humans can lay claim to being its exclusive practitioners.

To err is generative

Since their inception in the mind of Ian Goodfellow, GANs have been deployed in a broad variety of “creative” tasks. According to Goodfellow et al.’s 2014 paper, a generative adversarial network trains two models, a generator and a discriminator, at the same time. An original dataset (images, text, etc.) anchors the process: the discriminator learns to distinguish real examples from counterfeits, while the generator, starting from random noise, learns to produce counterfeits the discriminator will accept as real. On its own, each model is a typical neural network. What is key to the process is how they are trained against each other. The mythology of the moment has Goodfellow posing a simple question: “what if neural networks could compete with one another?”
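To make the two roles concrete, here is a minimal sketch of the pair in PyTorch. The layer sizes, the 100-dimensional noise vector, and the flattened 64×64 grayscale “portrait” are illustrative assumptions, not the configuration Obvious or Goodfellow used:

```python
# A minimal sketch of the two competing models, using PyTorch.
# The layer sizes and the flattened 64x64 grayscale "portrait" shape
# are illustrative assumptions, not what Obvious actually used.
import torch
import torch.nn as nn

LATENT_DIM = 100     # size of the random noise vector the generator starts from
IMAGE_DIM = 64 * 64  # a flattened 64x64 grayscale image

# The generator maps random noise to a synthetic image.
generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 256),
    nn.ReLU(),
    nn.Linear(256, IMAGE_DIM),
    nn.Tanh(),  # pixel values scaled to [-1, 1]
)

# The discriminator maps an image to the probability that it is real.
discriminator = nn.Sequential(
    nn.Linear(IMAGE_DIM, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
    nn.Sigmoid(),
)

noise = torch.randn(16, LATENT_DIM)    # a batch of 16 noise vectors
fake_images = generator(noise)         # 16 synthetic "portraits"
verdicts = discriminator(fake_images)  # 16 real-vs-fake probabilities
```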

Like most neural networks, both models in a GAN learn through an operation called “back-propagation,” a critical component of the training process. The Skymind.AI Wiki analogizes,

“… a neural network to a large piece of artillery that is attempting to strike a distant object with a shell. When the neural network makes a guess about an instance of data, it fires, a cloud of dust rises on the horizon, and the gunner tries to make out where the shell struck, and how far it was from the target. That distance from the target is the measure of error. The measure of error is then applied to the angle and direction of the gun (parameters), before it takes another shot. Back-propagation takes the error associated with a wrong guess by a neural network, and uses that error to adjust the neural network’s parameters in the direction of less error.”

Back-propagation is similar to human cognition in this way. An artillery operator who fires a few shells to test and refine his range and aim, before zeroing in on the settings that drop shells in the target area, learns what is working and what needs adjustment, and makes those adjustments with each “shot.” But whereas the human artillery operator must rely on his own eyes to judge how far he has missed, the generator in a GAN has the discriminator, which tells it whether the simulacra it produces are discernibly fake. Additionally, a machine can fire off thousands of “shots” in the time it takes a human to fire once. The iterative power of machine calculation thus allows both networks to quickly co-evolve.
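The analogy compresses neatly into code. Here is a toy gradient-descent loop in PyTorch; the linear “ballistics,” the target distance, and the learning rate are all invented for illustration. One parameter, the gun’s angle, is repeatedly corrected by back-propagating the squared distance between where the shell lands and where it should:

```python
# A toy version of the artillery analogy: back-propagation nudges one
# "gun angle" parameter until the "shell" lands on the target.
# The linear ballistics model and learning rate are invented for illustration.
import torch

target = torch.tensor(7.0)                     # where the shell should land
angle = torch.tensor(1.0, requires_grad=True)  # the gun's adjustable parameter

for shot in range(50):
    landing = angle * 3.0            # a stand-in for real ballistics
    error = (landing - target) ** 2  # squared distance from the target
    error.backward()                 # back-propagate: how does the angle affect the error?
    with torch.no_grad():
        angle -= 0.01 * angle.grad   # adjust the angle in the direction of less error
        angle.grad.zero_()           # reset the gradient before the next shot

print(f"final angle: {angle.item():.3f}, shell lands at: {(angle * 3.0).item():.3f}")
```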

In sum, back-propagation creates an environment in which a generator is quickly and effectively trained to a very high degree by competing against a similarly trainable discriminator. The generator is effectively unbounded in the skill it can attain relative to the discriminator. Through this process, fakery asymptotically approaches reality, to the point that discerning between the two is no longer possible, even for machines. Once training has progressed far enough, the generator becomes capable of producing simulacra that fool the digital eye of the discriminator and, consequently, the human eye observing the result.
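Putting the pieces together, a single adversarial training step might look like the sketch below, which reuses the generator, discriminator, LATENT_DIM, and IMAGE_DIM from the earlier snippet. The binary cross-entropy losses, the Adam optimizers, and the random tensor standing in for a batch of real portraits are illustrative assumptions rather than any lab’s actual recipe:

```python
# One adversarial training step, continuing the sketch above.
# In practice, real_batch would come from a DataLoader of real images.
import torch
import torch.nn as nn

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

real_batch = torch.rand(16, IMAGE_DIM) * 2 - 1  # placeholder "real" portraits
real_labels = torch.ones(16, 1)
fake_labels = torch.zeros(16, 1)

# Step 1: train the discriminator to tell real from fake.
fake_batch = generator(torch.randn(16, LATENT_DIM)).detach()  # freeze the generator
d_loss = (loss_fn(discriminator(real_batch), real_labels)
          + loss_fn(discriminator(fake_batch), fake_labels))
d_opt.zero_grad()
d_loss.backward()
d_opt.step()

# Step 2: train the generator to fool the (just-updated) discriminator.
fake_batch = generator(torch.randn(16, LATENT_DIM))
g_loss = loss_fn(discriminator(fake_batch), real_labels)  # "real" is the goal
g_opt.zero_grad()
g_loss.backward()
g_opt.step()
```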

Remixing reality

There are numerous examples of GAN-based and GAN-adjacent tools from recent years, ranging from well-known and predatory applications like “deepfakes” to more enlightened ones like Google’s DeepDream, which generates haunting, hallucinogenic imagescapes from real photo inputs (DeepDream itself visualizes the features of a convolutional network rather than using a GAN). Project Magenta, a collective of Google-affiliated researchers, develops and hosts a library of tools for augmenting digital music production, another fascinating application of machine learning to audio manipulation and enhancement. (You can hear the results of Magenta’s collaboration with LA-based band YACHT on the latter’s September 2019 album here. And below is an example of video motion transfer using GAN-based tools to project bodies onto movement, presented in a recent paper by Caroline Chan.)

“Do as I do” motion transfer by Chan et al. 2019.

But lest we think that AI researchers spend their time trying to put already-starving artists out of business: these artistic applications are not yet ready to compete with the grand masters. They remain tools, employed to varying degrees of usefulness by humans. Artistic applications aside, however, GAN-based tools are making inroads into other realms, a topic explored in other posts.

For the past two years, researchers at MIT’s Data-to-AI Lab have been experimenting with using GANs to solve a number of interesting and societally relevant problems, ranging from passing (and detecting) secret messages and protecting videos via watermarking to identifying anomalies and preserving privacy. The DAI Lab is excited to share some of these projects with the machine learning and data science communities, in the hope of sparking discussion and collaboration. To that end, this article is the first in a multipart series on the lab’s GAN-related projects, including demonstrations of our work, tutorials, commentary essays, and open-source libraries. We plan to release these articles bi-weekly, alternating commentary with a series of tutorial posts devoted to several of our recent GAN-related projects. Check back for more soon on our Medium page, and please follow us on Twitter.

[1]: Debate persists as to the origin of the code that Obvious used to produce the Belamy portraits. Programmer Robbie Barrat, an AI artist unaffiliated with Obvious, is by some accounts responsible for 90% of the code used in the project, although he had originally posted that code publicly for open use.
