Are Videogames Bad at Images?

And what should we do about it?

Jonas Linderstroem

If there’s one thing that I don’t think videogames are very good at, it’s producing beautiful images. This isn’t because there is anything wrong with videogames as a form, but because I think a lot of videogames just don’t prioritize or think very much about imagery in how they seek to communicate ideas to gamers. I think the struggle to be visually meaningful is one of the more significant challenges facing videogames, and I think they’ve been falling short compared to pretty much every other kind of media relating to the visual arts: not just film and television, but theatre, photography, architecture, design arts, studio arts, and comic books. I just don’t think videogames, in their ~40 year history, have been as visually powerful as they should be.

It’s too rare when I play a videogame that I see something on screen that’s really striking and awe-inducing. It’s also too rare that a videogame feels engrossing and resonant for me not just because of the nature of its feedback, or the stakes underlying its game systems, but primarily, or at least significantly, because of what is actually being shown on screen and how it’s being shown on screen. In fact, I’d go further and say many videogames I play are actually quite boring to look at. One of the things I enjoy most about films and video art is how deeply emotionally affecting their images are, but as a lifelong gamer, I’m consistently disappointed in how the images of gaming have become so dry.

William Klein, “Gun 1”, 1955

I’m a firm believer that videogames are very much a visual form of media, very much a member of the visual arts. This doesn’t mean that I think every single videogame is primarily visual, but I do think that the art history of videogames is very much visually oriented, if not visually leaning. And I guess I’ve always found it disconcerting how reluctant people often are to acknowledge this, that visuality is not only complicated and nuanced, but a very large part of what videogames are and have always been.

What makes something visually powerful, visually substantial? The first thing I should say is that visual meaning isn’t about looking good. Videogames mostly look great, and always have. They run on cutting-edge technology and enormous, endlessly complex software. But none of this has anything to do with being visually meaningful. It doesn’t matter how detailed and intricate a game’s textures are, how realistic its lighting is, how good its motion blur effects are, or how colourful it is or isn’t. All of these things are nice, but they aren’t prerequisites to visual substance.

Visual meaning also isn’t about information. I think we’ve done this thing where instead of embedding visual substance into our work we focus on environmental storytelling, which as far as I’m concerned is just another, more efficient way of embedding lore into a game. Environmental storytelling has no inherent visual substance because it’s primarily a means of relaying information. If I go into a room in a house and I find an empty, battered kitchen with empty beer bottles lying on its table, then I can conclude that the person who lived there was likely a heavy drinker. If I go into the next room and find kids’ toys then I can deduce that a family lived here and that there may have been tension in the home. The scene can tell a story, or multiple stories, but it has no inherent visual meaning. It just may or may not be interesting because of the information it relayed to you. You could have also read this inside a text box, but it’s done in the game world because that can feel more cohesive, immediate, or efficient.

If I’m playing System Shock 2, and I see a pool table with pool balls on the table and sticks lying around, then I now know that people were once here and played games together. Again, it’s not inherently interesting to look at. Unless something is intentionally done with the scene, it’s actually quite boring, because it’s just a pool table. If nothing is done with the scene visually then it’s not going to have any visual power.

Nier Automata (2017) is a great example of powerful imagery that also has deeper meaning

There’s this sense that environmental storytelling is an ultimate solution that can solve any problem. A technique of techniques that inherently occupies every space of videogame communication: mechanical, visual, and textual… but this overstates its ability as a storytelling device, and videogames that are telling us that they’re covering every base in one fell swoop are rarely doing so at all, at least in the visual department. What I like about the walking sim of walking sims, Dear Esther, was that while it was a story that unravelled as you explored the island, the setting itself was beautiful. The unceasing cloudy sky lit the game’s cold, desaturated palette. And the setting of an island off the coast of Scotland, with an ocean horizon in every direction, really drove home the game’s sense of loss and melancholy. And it produced beautiful images. It wasn’t just meaningful because of the plot, it was also visually meaningful and visually powerful at every step.

A scene or an image can be visually meaningful when it holds power in itself. A videogame isn’t visually engaging just because the information it’s giving you is engaging, like what your current score is, or where the enemies are on the screen, or whether you’re winning or losing, or the items in your inventory, or what items you can craft, or how high or low your health is. If these things are the only reasons you enjoy looking at the screen, then that doesn’t mean the videogame is actually visually engaging. And if the only thing that’s visually engaging about the game is that it has neat graphics, then that doesn’t mean the videogame is visually meaningful in any way.

Looking for Beautiful Images

Stephen Shore, “US 10 Post Falls, Idaho,” 1976

What is an image? An image is just a visual composition. In the most abstract way, it can really be anything that you look at. Images are a system of meaning equivalent to a piece of literature or a game system. Images – pieces of visual meaning – are complex and nuanced with a long history. And an image is “written” in a visual language that’s just as dense as any other:

“Indeed, rather than the notion of looking, which suggests a passive act of recognition, we need to insist that we read a photograph, not as an image but as a text.”

A beautiful image isn’t something that can be easily defined, but for me, a beautiful image captures a certain humanity that isn’t as easily expressed in words. It can be a weighty expression, or someone’s body language. It can be a symbolic contrast between ideas. It can be compositional, like a sprawl of city streets and signs, the complex weaving of a dense crowd, or the scene of a calm lake deep in the countryside. It can be the eeriness of an empty gas station at night, or the energy of a crowded gospel sermon. It can be more abstract, like an interesting visual pattern, or an attractive sense of colour. It can come from how it looks very symmetrical and cohesive, or asymmetrical and incohesive. Whatever it is, what connects beautiful images together is that they capture a moment in time, and they give that moment, that idea, more weight than it would have had if we weren’t consuming it as media. Beautiful images amplify select pieces of human existence. And they make reference to that existence in some kind of relevant way. However abstract or not-abstract it is, we can often look at a beautiful image and see something familiar in it, something that relays back to us. A beautiful image often doesn’t need very much context for it to be powerful and resonant. We see what it is and understand what it is within our first glance.

Henri Cartier-Bresson, “Children in Seville, Spain,” 1933

One of the things I’ve realized about videogames is they struggle more than other media to do what I would call a sort of… emotional telemetry. In a lot of good art and entertainment you can get a sense of what something feels like almost as soon as it happens. There’s a very natural translation of emotion from the work to the viewer. That goes for high-level emotions, like whether a scene is dark, sad and dramatic or funny and comedic, but also for more nuanced lower-level feelings, feelings that are more in the details of a scene. It can feel melancholic, or nostalgic, or feel surreal and mysterious. A scene can embed a certain kind of mood that’s difficult to describe but that nonetheless can completely define what that media is for the audience.

By contrast, videogames often have to work much harder and spend more energy in telling you what something is supposed to feel like. I play a lot of games where the emotional signalling is very blunt: like okay, they’re playing sad music now so this is the sad part that’s supposed to feel sad, and okay, it’s focusing on them holding hands and smiling so I know this is good and heartwarming. It’s very compartmentalized, the way this is carried out. And I think if videogames struggle to communicate those more nuanced and complex emotions in a way that feels smoother and more natural, then it’s very much because the culture and context videogames are made in is not one that is bent towards understanding and appreciating imagery.

On West 46th

Here is a counterpoint: why care about any of this? Videogames are built on the most complex and advanced technology in entertainment media. What can be taken from something as primitive and simplistic as a photograph?

Moreover, videogames are massive endless interactive mega systems that are constantly changing and shifting. Unlike the “old arts”, which are static and unchanging, and whose content is easily consumed in one sitting, a videogame can be consumed in all sorts of ways, and is never played the same way in each playthrough. How can something that is so dynamic have any important relation to media that plays through the same way every time, some of which don’t even move?

Joel Meyerowitz, “Broadway and West 46th Street, New York”, 1976

Here there’s a misunderstanding of how people consume media and why we even consume media in the first place. Images, including photographs, are everywhere in our society, and they’re more prevalent than videogames, which are mostly marketed towards a niche audience that carries specific investment and specialised knowledge. If photographs and images were just boring static objects that had no dynamism, then we would have discarded them a long time ago. But we didn’t, because media doesn’t have to be physically changing to be dynamic. A photograph doesn’t have to actually move to have a sense of movement, to feel like it’s moving. A video doesn’t have to be immediately responsive to feel close, to feel personal, and to feel like it’s existing in relationship to how we consume it.

Let’s consider for a moment the amount of work and labour that would be needed to recreate Broadway and West 46th in a videogame. The amount of assets that would need to be made from scratch, the modelling and the texturing of the individual signs, the modelling of the buildings and their specific materials, the modelling of the crowd and the creation of their costumes, not to mention the rigging of their animations and walking paths, all of which need to look interesting and natural and ‘human’ when they’re put into the same space and interacting with each other. And that doesn’t include the lighting design. The more powerful your target machine, the higher your draw distance is, and so the more buildings you have to model and texture to make the city look not cheap and fake. If you decide to lower the draw distance to reduce your workload then it looks like Spiderman on the Dreamcast. Of course, you would probably just progressively load your textures, or you can make textures and models less detailed the further back they are, hoping the player never gets too close. But this gets harder to justify into the 8th generation of super-consoles.

The gap in manual labour needed to match the feeling that Meyerowitz was able to capture with a single snap of his camera is immense. But if we take up the lessons of imagery, then we realize that none of these things are actually necessary to make you feel like you’re in a city. The thing about imagery is that it’s never really about the thing itself. It’s more than knowing Broadway and West 46th takes place in New York that makes it feel like New York. It’s the low horizon line, the long draw distance into the sky, and the lack of clear and easy see-through lines within the chaos and incohesion of the crowd. The photo is drowning in street signs and billboard ads. It’s so cluttered as to give you no sense of what this street actually looks like. It’s messy and difficult, and so it communicates Inner Manhattan as the same: clogged, messy, difficult to read and understand, a colourful but very dreary collage of suffocating corporate imagery. It’s the visual composition of the image that makes New York feel like New York.

Focusing Our Perspective

Manuel Alvarez Bravo, “Sed Pública (Public Thirst),” 1933

Videogames are interactive, which really just means they respond to input. It’s nothing fancy when you think about it. You press a button on your controller or your keyboard, and something happens on the screen. It can really be anything, or nothing at all, if that’s the point of it. But it’s that reciprocal relationship that defines how we understand them.

If videogames struggle with meaningful, powerful imagery, it’s probably because our culture has never been able to recognise that the form has its own visual language, one that’s distinct from a technical visual language. We know videogames have a language for telling players what to do, and giving them instructions, but what’s our visual language for communicating relevant ideas, however abstract, and making gamers actually feel things? Where are those lessons and those histories within our shared dialectic?

A videogame happens one image at a time. And we consume videogames one image at a time, one visual moment at a time. Reading Michel Ancel’s concepts for Beyond Good and Evil 2, I’m reminded that a culture which has no sense of visual language is one that equates being ambitious with just making bigger and more complex simulations that can more efficiently route players around. This is the future I envision in a culture that doesn’t care about imagery. We create more simulations, but meaningful communication of strong ideas with the consistency of other media will always elude us. It’s going to be an increasingly important project for writers of videogames going into the next decade, as graphics technology starts to plateau and pushing CPU power takes more focus, to better define what videogame imagery is, and to seek out the key works throughout our shared art history that can be shining examples and guides for stronger artistic work in the future.