TroublingGAN

Lenka Hamosova
Published in AIxDESIGN
Jun 15, 2022

PSA: It is a key value for us at AIxDESIGN to open-source our work and research. The forced paywalls here have led us to stop using Medium so while you can still read the article below, future writings & resources will be published on other platforms. Learn more at aixdesign.co or come hang with us on any of our other channels. Hope to see you there 👋

Creating knowledge through the alternative use of neural networks in artistic research

The current development of neural networks, which enable the creation of synthetic media and various creative AI tools, is largely influenced by research methods applied in computer science and the commercial goals of private technology companies. The use of these tools is therefore often limited to narrowly defined tasks that do not reflect the breadth of their creative potential and often do not take into account the disruptive consequences of their use. In this situation, critical artistic research can bring new, “softer” perspectives to otherwise overly technocratic thinking about the implementation of AI tools in everyday life.

In this essay, I want to show that neural networks and current creative AI tools can be tools not only for artistic expression but also for artistic research. I will demonstrate this with the TroublingGAN project, a generative neural network that derives the essence of “troubling times” and critically reflects on the characteristics of AI-generated synthetic media (especially its photorealism) and, at the same time, on the contemporary use of journalistic photography.

Since the beginning of 2018, I have been working with AI models from the position of a speculative designer and artist, and this hands-on experience leads me to consider synthetic media and its place in art and media theory. The direction of my research stems from tacit knowledge that first emerges in free artistic experimentation with neural networks, followed by a phase of critical reflection. Artistic research acknowledges this type of knowledge, which emerges from doing and does not always coincide with what is rational, “scientific”, replicable, or numerically reducible. In my work I am often guided first by intuition and only then by my analytical mind. I feel, anticipate, and perceive these intuitions on a physical level, in the form of an emotion that moves through my body and affects my psychophysics. I observe the same thing in the group sessions (participatory workshops) that I run as part of my research. Participants often have great difficulty expressing this impression or feeling, and even legitimizing it for themselves at all.

One of the reasons for this illogical experiment, therefore, was the need for some validation of this hunch, along with curiosity about how neural networks, which resemble our own neural connections, would “perceive” and interpret such an abstract concept as “troubling times”.

visuals for the Uroboros Festival 2021, design by Lenka Hamosova, 2021

The background

The idea of exploring the concept of “troubling times” arose in the context of organizing the second edition of the Uroboros Festival (a festival of socially engaged design and art based in Prague). The Uroboros Festival brings together people from different disciplines to engage in workshops and other participatory formats to discuss our position as creators in the current era of intertwined problems arising from social and ecological challenges such as the climate crisis, growing wealth disparities and structural inequalities. At first, I wanted to create a generative tool to vary abstract backgrounds for the festival visuals, which would communicate the festival theme of ‘Designing in troubling times’ with their emotional charge. However, what was originally a very utilitarian design tool grew into a much more extensive research project.

The experiment consisted of the relatively simple task of training a custom StyleGAN model on a curated dataset. The barrier to doing this is relatively low given the public availability of StyleGAN code and the abundance of tutorials on the Internet. Renting a suitable graphics card in the cloud solves the problem of an inadequate technical setup at home and reduces the time required for training. However, the real challenge is often the availability of suitable training datasets, or in this case the need to create your own dataset from scratch. I was intrigued by the question of where to get images that visually represent “troubling times”. And who will decide which visual representations are good enough to be in the dataset and which are not? Will it be me? Am I going to do it manually? In the end, I came to the conclusion that the best thing to do would be to find one collection of photos that all share the same “disturbing” quality, and that led me to the idea of news photos from 2020.
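To make the dataset step a little more concrete: the publicly available StyleGAN training code generally expects square images of a single, power-of-two resolution, so news photographs in mixed formats have to be cropped and resized first. The following is a minimal Python sketch of that planning step only, not the pipeline used for TroublingGAN. The function names, the centered crop, and the 1024-pixel cap are assumptions made for illustration.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def largest_power_of_two(n):
    """Return the largest power of two that is less than or equal to n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n.bit_length() - 1)


def plan_resize(width, height, max_resolution=1024):
    """Plan a square crop and a power-of-two target size for one photo.

    max_resolution is a hypothetical cap chosen for this sketch; the real
    value depends on the available GPU memory and training budget.
    """
    box = center_crop_box(width, height)
    side = box[2] - box[0]
    target = min(largest_power_of_two(side), max_resolution)
    return box, target
```

For a 1200 × 800 landscape photograph, `plan_resize(1200, 800)` keeps the central 800 × 800 square and suggests a 512-pixel target. An image library would then perform the actual crop and resize.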

It was just as the legendary “disturbing” year of 2020 came to a close — a year that brought an unprecedented series of disturbing events and merited a satirical documentary reflection on Netflix. All of these events were meticulously documented in journalistic photographs and communicated around the world. The search for the perfect dataset ended when I discovered Reuters’ weekly gallery, Photos of the Week.

scrolling through TroublingGAN dataset, Lenka Hamosova 2022

The photographs document natural disasters, social unrest, political protests, military conflicts and an ongoing pandemic. Most of them do indeed depict disturbing events, except a few neutral shots, such as a close-up of a bird or a full moon. But there are not many of these (only about 1 in 300 photographs). I decided to leave them in so that the set of photographs could be analyzed as a whole. Despite the ambition to focus on the most diverse and representative collection of photographs, it should be noted that this dataset is not comprehensive and many particularly “invisible” social problems did not make it into the data for training a neural network using this method. This highlights the importance of an even visual representation of phenomena and the ‘invisibility’ of particular social problems or marginalised groups. Therefore, this project does not aspire to create any real representations, but is rather a thought experiment on the possibilities of generative neural networks.

Visual ambiguity of synthetic media

In 2022, the objective for visual synthetic media still remains photorealism. The new tool Imagen, which Google announced in late May 2022, uses the slogan “unprecedented photorealism × deep level of language understanding.” It competes with OpenAI's DALL-E 2 in realism, and in the published research paper, the authors include a human evaluation of photorealism based on a comparison of these two models. However, is photorealism the only objective we should aim for? What other features do synthetic media have (or could have) that are not favored by the AI industry?

It is also important to acknowledge the other characteristics of synthetic media, including the imperfections of early AI-driven media synthesis, as equally valuable. Visuals that are ambiguous and lack photorealistic coherence offer a much wider space for interpretation, and it is this ambiguity that can have a wide range of applications.

It was clear from the beginning that the outcomes from our model would not reach photorealistic levels because of the relatively small number of images in the dataset (approx. 1000 photographs) and the variety of portrayed scenes and compositions. However, we saw this not as a failed experimental setup, but rather as an advantage. Removing the concrete details from the visuals could refocus our attention towards finer elements of the images and visualize hidden patterns in the dataset.

generated outcomes from TroublingGAN. Image by Lenka Hamosova, 2022

The new kind of knowledge produced by neural networks

If we look at neural networks as an “instrument of knowledge”, what type of knowledge do they produce? Such an instrument of knowledge is composed of an observed object (training dataset), an instrument of observation (learning algorithm), and a final representation (statistical model). Vladan Joler and Matteo Pasquinelli use the analogy of optical media to explain how this ‘instrument of knowledge’ works: ‘the information flow of machine learning is like a light beam that is projected by the training data, compressed by the algorithm and diffracted towards the world by the lens of the statistical model’ (2020).

In our case of a neural network generating visuals, the observed object is the dataset, the observation tool is a pair of opposing neural networks trained against each other (a GAN), and the final representation is a trained StyleGAN model that can generate new visuals. Looking at this “magic” through these eyes, it’s hard to believe that anything new, any new knowledge, can come out of a statistical model. However, a neural network, as a tool of knowledge, can analyze large amounts of data not only faster than our brains, but also differently from them. Such a StyleGAN model then becomes a tool to visually represent the patterns within the observed object that the neural network has recognized.

StyleGAN as a light-beam shining through the dataset. Illustration by Lenka Hamosova, 2022

But this new knowledge is very subtle. Probably it can only be perceived through emotion. Several artists have experimented with StyleGAN and speak of its ability to derive an “essence” of the observed object. For example, the studio Entangled Others, in their project Beneath the Neural Waves, trained a neural network on images of coral reefs to create a new form of artificial life (as an alternative to the dying one). The neural network recognized the visual features of the coral reefs from the dataset and created layers of essences, or “something-ness”. As Feileacan McCormick says, we are not looking at new examples of jellyfish, but at “jellyfish-ness”. In following this process, artists are “dreaming up new ecosystems”, with active dreaming seen as another form of storytelling (Uroboros Festival 2021).

By making dataset creation a lengthy and tedious process, almost to the point of resembling the work of monks in scriptoria, artists are immersed in a deep meditation on their object of interest. In her artwork Myriad (Tulips), artist Anna Ridler engages in an even deeper relationship with the observed object by creating each photo in the dataset herself. She photographed more than ten thousand different tulips, and then sorted, annotated, and aligned every one of them by hand. That is in direct contrast to the way large datasets are usually built: by underpaid workers (e.g., via Mechanical Turk) or by purloining images from the Internet. Such datasets contain countless labeling errors, leading to biased results (the most famous example of a biased dataset is ImageNet). Such a deep meditative artistic process, a form of reflection, can be useful in tracing one’s own biases, which are most certainly always projected from a human-made dataset into new visuals.

A StyleGAN trained in this way, which visually depicts troubling times, attempts to make this connection and is a critique of phenomena such as filter bubbles and echo chambers. It raises questions about contemporary visual representations of troubling events and disasters as well as the affective value of photojournalism.

photographs from TroublingGAN dataset

Photojournalism used as stock-photo

One thing I realized while working with the dataset was how often certain photographs are used repeatedly after the event has happened, in the form of an illustrative photograph to evoke emotion and heighten the urgency of the information being communicated. Similarly, we often see photographs depicting various explosions, fires, natural disasters, and we treat them as stock photos, even though they are records of a specific event. It becomes a problem to distinguish between documentary photography that relates to the subject depicted, and images that are used only for their affective quality.

The same applies in the context of the current military invasion of Ukraine: the images that circulate on the Internet as testimonies of the cruelty of the Russian army are gradually being turned into illustrative material, thus devaluing the recorded content, namely the stories of specific people. And we, as consumers of these images, gradually become indifferent to this visual information.

How the news is encountered on Instagram. Collage by Lenka Hamosova, 2022

On the other hand, if such a once-photojournalistic, now-stock photo is replaced by generated semi-abstract visuals coming from a neural network that has learned from thematically similar photographs, this ethical problem ceases to exist and space is created for a different way of perceiving the image. What we need at this moment is not distance, but deeper engagement and space to contemplate the disturbing nature of the visual messages depicted.

Visually ambiguous synthetic imagery does not burden the viewer with unnecessary meanings and does not generate information noise. Such images are devoid of context, but still carry the necessary atmosphere. Despite the absence of specific objects and scenes, these visually ambiguous images still strangely resemble a photograph. This is very confusing to the human eye. The mind constantly tries to assign some meaning to these ambiguous compositions, however abstract. The assigned meaning or interpretation, however, becomes dynamic and constantly changing, and therefore it is ultimately the atmosphere and emotional charge that affect the viewer.

TroublingGAN

TroublingGAN outputs combine concrete and abstract elements, which makes them visually ambiguous, with a typical GAN aesthetic (various blobs, smudges, and blurred parts contrasting with sharp ones). Although not all outcomes are the same, there are commonalities. Some visuals contain elements that resemble human figures. Interestingly, these figures have dark holes where the eyes and mouth would be, which act as a disturbing element, evoking the paintings of Francis Bacon. The figures are often composed somewhat baroquely, but it is important to mention that this is purely coincidental. To some extent, it could be explained through the dynamic compositions of the scenes from the dataset, which include many chaotic events involving many people.

TroublingGAN interpolation video, Lenka Hamosova, 2022

The outcomes carry a combination of different textures. A recurrent Mondrian-like grid appears, as well as small patches of repeating detailed patterns resembling Photoshop stamp tool errors. There is a lot of fluidity: haunting blobs morph into violent smudges that morph into imperfect straight lines with remarkable brush-stroke quality. The close-ups of face-like blemishes and one-eyed figures are collaged onto abandoned landscapes or packed into claustrophobic dark interiors.

The color palette is dim, with mixed shades of gray, brown, and blue, and less frequently, white and orange. These are directly influenced by the dataset: the whites and blues are derived from the recurring medical scenes, while the orange hues are remnants of fires and explosions.

There is no one to define the composition, to guide the viewer’s gaze, or to decide the effect the image should have. And yet we perceive that these images belong to a whole, carrying a common message. We are able to perceive from them something about the nature of the object of our observation: the photographs from the dataset.

installation of TroublingGAN project in a nuclear shelter as part of the ELBE DOCK film festival, Ústí n/ Labem, Czech Republic. Photo by Lenka Hamosova, 2022

Neural networks, through their interpretation of the dataset, offer a different understanding of the affective quality of photographs. Mainly by omitting the concrete details, more subtle elements could come to the fore. It is possible to perceive a gloomy atmosphere that seems to average all the individual events in the dataset. The presence of this somethingness excludes semantic meanings, which are useless to look for. All that remains is the ever-changing meaning that our mind projects onto the visuals.

What actually happens in such a projection? Can we call it contemplation of the depicted subject? Is it possible to use such a process of futile apophenia as a method of critical reflection?

The spectacle of indefiniteness

Visual synthetic media, whether photorealistic or visually ambiguous, form a completely new category of visual material. They bring a great deal of spectacle to the viewer. The existence of generative models (such as TroublingGAN) creates a feeling of indefiniteness in the synthetic visuals, as each is just one of many possible outputs that the model can produce. There is a certain thrill in anticipating what the neural network will generate next time. Many generative models turn into a dopamine-releasing addiction, like pulling a casino slot-machine handle and waiting for a new combination of symbols to appear. In her book AI Art, Joanna Zylinska criticizes GAN art as an intoxicating, Candy Crush-like spectacle that can be addictive, so that synthetic media can also have a ‘pacifying effect’ (2020). The question is whether this is due to the novelty of synthetic media, in which case the effect wears off after a while, or whether it is an inherent characteristic of synthetic media.
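The sense that each image is only one of many possible outputs has a simple technical basis: a GAN generator maps a random latent vector to an image, so every new random draw yields a new visual, and blending two latent vectors yields the in-between frames of an interpolation video. Below is a minimal Python sketch of that sampling and blending, using only the standard library. The 512-dimensional latent size matches common StyleGAN configurations but is an assumption here, as are the function names; the trained generator itself is not included.

```python
import random


def sample_latent(seed, dim=512):
    """Draw one latent vector from a standard normal distribution.

    The same seed always gives the same vector, so a seed identifies
    one particular output of the (hypothetical) generator.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def lerp(a, b, t):
    """Linearly blend two latent vectors; t=0 gives a, t=1 gives b."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def interpolation_path(seed_a, seed_b, steps=5, dim=512):
    """Return latent vectors that move evenly from seed_a to seed_b."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    a = sample_latent(seed_a, dim)
    b = sample_latent(seed_b, dim)
    return [lerp(a, b, i / (steps - 1)) for i in range(steps)]
```

Each vector in the returned path would be passed to a trained generator to render one frame. The same seed always reproduces the same latent vector, while a new seed is a new pull of the slot-machine handle.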

Emphasis on the process, speculation, reflection

During this process, a number of decisive moments occurred that defined its shift and final outcomes. These were:

  • choosing a very abstract concept as the object of observation
  • the use of generative neural networks to create intuitive cognition
  • seeing potential where everyone would expect failure
  • consciously doing something absurd and illogical

All of these decisions are a rejection of the rules of common sense. That freedom to allow oneself to imagine speculative uses of generated visually ambiguous visuals in surprising contexts, such as in place of photojournalism, is what makes artistic research possible.

The emphasis on process rather than output allowed us to critically analyze the different steps of StyleGAN training, reflect on the ethical implications, and imagine alternative future applications of synthetic media beyond the commercially oriented notions of the AI industry.

Diagram depicting the research process of TroublingGAN project. Illustration by Lenka Hamosova, 2022

With the TroublingGAN research project I have tried to show that:

  • GANs can be used as a tool for observation and the creation of new knowledge within artistic research
  • The TroublingGAN model is able to derive an essence from a dataset representing “troubling times” and to project the affective quality of photojournalism from the dataset onto the generated outputs
  • It offered us a different way of looking at the photos in the dataset by revealing more subtle information hidden in them
  • Through the speculation of using these visually ambiguous visuals instead of photojournalism, I point to the unexpected potential of these “imperfect” outputs from generative models and their relevance

TroublingGAN also functions as a metaphor for the vicious circle in which we find ourselves today. Unless we change the “dataset,” i.e., the input data for our thinking about the world and the way we design solutions for it, we’re stuck generating new versions of the same problems. The same applies to utopian visions of artificial intelligence, which can only be as enlightened as the quality of the input we can offer it, which should be free of all cultural stereotypes, prejudices and recurring human errors.

More info about the project:
https://troublinggan.hamosova.com

References:

Pasquinelli, Matteo, and Vladan Joler. 2020. ‘The Nooscope Manifested: AI as Instrument of Knowledge Extractivism’. AI & SOCIETY, November. https://doi.org/10.1007/s00146-020-01097-6.

Uroboros Festival. 2021. UROBOROS 2021 | Mary Ponomareva, Chris Kore, Entangled Others: Zero Emissions by 2099. https://www.youtube.com/watch?v=XXx4Q02tlIg.

Zylinska, Joanna. 2020. AI Art: Machine Visions and Warped Dreams. Open Humanities Press. http://www.openhumanitiespress.org/books/titles/ai-art/.

This essay has been presented at the Doctoral Conference and Discussion on the occasion of the Biennial Exhibition UMPRUM — Process as Output on 26th of May 2022 in Prague, Czech Republic.

Lenka Hamosova
AIxDESIGN

Researching (and practicing) creative collaboration with AI. I teach creative professionals to reclaim their creative agency in human-AI co-creation.