The Woodcutter
The Rashomon effect, informed by the short stories of Ryūnosuke Akutagawa and adapted by filmmaker Akira Kurosawa, is referenced across a range of disciplines from storytelling to psychology and law. It describes how, due to contextual social and cultural differences or a partisan point of view, a single event can be rendered in contradictory ways by multiple eyewitnesses. In the film Rashomon (1950), Akira Kurosawa introduced an unconventional non-linear storytelling technique through the conduit of an unreliable narrator. It may be that the unreliable narrator has insufficient information to accurately translate facts related to an event, or that they are exhibiting a clear bias for reasons of self-interest. In Kurosawa’s film, four testimonies are relayed to describe the same event, one that involved the death of a samurai in a forest. But rather than delivering one coherent story, Rashomon subjects its audience to an over-arching narrative that frames an unstable system of contradictory accounts. For this reason, it is conceptually interesting to think about Rashomon’s narrative variance and underlying theme as source material to feed to generative AI, in this case Midjourney, a text2image generator.
Midjourney is multimodal, meaning it is trained on text and images that enable it to generate ‘original’ images prompted by linguistic cues. The Midjourney model is a coalition of other models, based on neural network architecture that maps text descriptions to image features. When prompted, the model will conjure an image from latent space that resonates with the essence of the input text description. The same prompt can be iteratively employed to generate endless variations of images, where production is as much about the curation of outputs as any creativity involved in crafting the prompt. Generative AI finds kinship with Rashomon’s storytelling through this obscure interaction between language and visuals.
The woodcutter, a central figure in the narrative of Rashomon, symbolises a metaphorical interplay between wood and wood-cutting, mirroring the human struggle to extract meaning and understanding from the mess of scattered and incomprehensible happenings. Rashomon’s storytelling finds equivalence with the production of generative AI, where the same prompt can produce different images. The prompt, comparable to the central event of the film — the death in the forest — becomes the axis around which the software spins its images. The AI generator can create variations of images that stand as counterparts to Rashomon’s conflicting stories narrating the same event.
A perennial philosophical question is whether ‘truth’ is an absolute concept or is subjective and contingent upon individual perspectives. However, this dichotomy may prove illusory, as multiple conflicting truths, each with its own internal logic, might coexist. The relationship of the woodcutter’s story to those of the other characters in Rashomon reflects such a complex understanding of truth: he changes his description of events, contradicting his original testimony and disrupting the entire narrative dynamic of the piece in the process.
The woodcutter initially describes walking through the forest to collect wood and coming across a trail of discarded objects that leads him to a body. He describes finding a woman’s hat caught on a branch. A few paces later he picks up a samurai’s flattened cap lying on the ground. He continues to walk and then treads on strands of cut rope, which he picks up and examines. Holding the rope, he looks over and sees an amulet case with a red lining lying on the ground, and as he heads over to investigate, he stumbles over the lifeless body of the samurai lying in the undergrowth.
As a creative exercise with Midjourney, the trail of found objects (including the deceased body) has been used as the basis of a prompt to represent the woodcutter's initial testimony. The text prompt was supplemented by an image, a black and white ‘photograph’ of a Japanese forest that had been previously generated. Midjourney can incorporate uploaded images during the prompting process to influence the output’s composition, style, and colour. However, for some reason the generator largely deviated from the ‘steer’ provided by the forest image prompt, drawing instead from the list of objects to produce iterations of highly stylised, cartoon-like drawings of amalgamated bodies and objects. How the objects were combined in this exercise raises intriguing questions about the software’s interpretative processes, suggesting a complex interplay between textual cues and the model’s internal processing.
Through its natural language processing (NLP), the model keys strongly into the adjective “red” from the text prompt. The colour of the amulet case lining bleeds into the otherwise muted palette of the other objects, literally appearing as droplets of blood in some images. The generated outputs are stylised, evoking the personal touch of hand-rendered artwork. This suggests that the training dataset for the underlying model included images created by illustrators or comic book artists, perhaps found by web crawlers on social networks, most likely without consent or remuneration, which is indicative of what constitutes labour in the age of AI. It serves as a reminder that generative AI image production is built on a foundation of human-made art shared online by artists who were never consulted about being included in a proprietary learning model, raising obvious questions concerning the boundaries of creativity, authorship, and consent. The introduction of machine learning (ML) as a tool in art means that cultural material is being repurposed and manipulated in unprecedented ways, increasing the complexity of human attribution. The recent controversy around AI/ML-enabled art has been concerned with questions of authorship, copyright, appropriation and collaboration, all of which are keywords in remix studies.
Steve F. Anderson’s conception of a classical age of remix, encompassing both analog and digital realms, hinges on the remixability of discrete elements within visual and written culture. We are now in what Anderson refers to as “an algorithmic period of remix”, whereby a machine learning algorithm “digests” salient characteristics from an original training set with the potential to generate something ‘new’ based on those characteristics.
Remixing, as an artistic practice, has gained cultural significance as a binder concept facilitating the perpetual recycling and repurposing of material and immaterial things. Eduardo Navas suggests that remix represents a distinct form of cultural production fostering an awareness of the constant exchange of ideas across diverse specialisations and cultural niches for different purposes. However, in the grey zone between where intellectual property in art ends and where transformation and fair use begin, the large-scale web crawling that AI companies rely on to train their models is making things very murky. The hidden work of ghost workers can make the internet appear smart by providing unseen contributions. Similarly, the seemingly magical production of images by text2image software depends on the unacknowledged artworkers underpinning it.
Generative AI has troves of misappropriated training material irreversibly baked into its systems. Thinking of the equivalence between the ‘unreliable narrator’ in Rashomon and the productions of generative AI unveils an interplay between ‘truth’ and narrative manipulation. The Rashomon effect’s portrayal of contradictory perspectives finds correspondence in the transformative capabilities of generative AI, altering, distorting, and recasting the factual.
‘Allegories of Streaming: Image Synthesis and/as Remix’, by S.F. Anderson (2021) in The Routledge Handbook of Remix Studies and Digital Humanities. Published by Routledge.
‘Remix’, by Eduardo Navas (2018) in Keywords in Remix Studies. Published by Routledge.