The art of looking (through the eyes of a computer).

As an art historian, it had never occurred to me that I would care about what a computer “sees” when it looks at art. I mean, I’ve spent many years training my own eye. But I’ve been working with Shelley Bernstein on an online collection project at the Barnes, and part of what we’re doing involves using a machine to recognize content in works of art. It has been fascinating. The idea is that the machine should be able to identify the basic elements in, say, a Cézanne: to tell us that the painting contains rocks and houses and trees. Shelley wrote in her last post about times when the machine has been totally off, like when it identified Matisse’s famous Joy of Life, which depicts a timeless arcadian landscape, as “graffiti on a wall,” or when it interpreted a nineteenth-century painting by Renoir as “two girls taking a selfie.” Many of the other Renoir figures in the collection were identified as “stuffed animals.”

The computer sees Matisse’s Joy of Life (BF719) as “a graffiti covered wall.”

That last one gave us a good laugh. But after a few minutes of head-shaking, I realized that the computer’s “stuffed animal” categorization was actually kind of awesome because it seemed to support an idea I’ve been developing. I’ve been working on an essay about Renoir’s obsession with the sense of touch, which I’m trying to link with his desire to revive artisanal values during the industrial era. A big part of my argument rests on proving (to the extent this is possible) that Renoir was deliberately trying to evoke the sense of touch in his paintings of fleshy naked women. So discovering that the computer was seeing teddy bears — soft things — was good news.

Renoir’s Bather (BF71) is seen by a computer as “a closeup of an animal.”

There was also something intriguing about the simple fact that the computer was seeing something different from what we were seeing. Renoir’s nudes are always described by art historians as being highly sexualized and explicitly marked by signs of gender, and yet the computer wasn’t even reading them as female bodies. Instead, they were inanimate, sexless objects. It was as if the computer was perceiving something we couldn’t. Free of the baggage we bring to looking at the nude (certainly it couldn’t have any feelings or expectations about the subject), the machine seemed to be registering pure visual data. Here was an eye without a body, an observer unburdened by pesky senses or preconceptions. And there is something compelling about this. Monet longed to be able to see as if he had no prior perceptual knowledge about the things he was seeing. He said he wished he had been born blind so that he could one day regain his sight and just paint what he saw before him, matter-of-factly putting the physical forms he observed in the world onto canvas.

Renoir’s Young Mother (BF15) interpreted by a computer as “a boy holding a teddy bear.”

The notion of an objective computer eye is a fallacy, of course. A computer’s vision — how it identifies what it sees — can be no more innocent than that of the humans programming it. With the Microsoft Computer Vision API, the computer is trained to recognize content through the consumption of thousands of photographic images — and because the people and things in those photographs are named and categorized by humans, there is an inevitable duplication of the biases we bring to looking and of the binaries organizing Western thought. When it comes to identifying male and female, for example, the algorithm operates according to the most traditional (and historically constructed) visual signs. This is a whole other topic with huge implications for all of us concerned with documenting and interpreting visual data.

Renoir’s Reading (BF107) seen by a computer as “a man and a woman taking a selfie.”

But perhaps there is a way in which computer vision is objective — or maybe innocent is the better word — in that it doesn’t know the “appropriate” kinds of art historical comparisons to make. It doesn’t know that it makes no historical sense to see the Matisse painting as a wall of graffiti and that Renoir couldn’t have painted people taking selfies. These misreadings remind us that we constantly project our own meanings onto works of art; but I would go even further and say that they might actually be useful for the field of art history.

The Renoir teddy bears were a case in which the computer saw something that supported (however tenuously) what I was already thinking. But what about when it reads the work of art as something that you never would have anticipated, when it perceives something that actually makes you look at a familiar object in a totally new way? Think about the Matisse painting as a wall of graffiti for a minute. There must be some reason the computer understood it this way. Maybe, for some Matisse scholar out there, mulling over this misreading becomes the spark for a new theoretical framework for interpreting the artist’s work as a form of “writing.” These weird misreadings could stretch the brain, and this is always good for art history.


The Barnes Foundation collection online project is funded by the Knight Foundation and our code is open source.