A Picture Can Tell a Thousand Lies: Google’s Gemini Fiasco Reveals Flaws in Big Tech’s Approach to Bias Mitigation

Ned Watt
Automated Decision-Making and Society
8 min read · Apr 3, 2024

This article was co-authored by Ned Watt and Dan Whelan-Shamy on behalf of the Sub-Zero Bias research group.

Additional contributions were made by group members Rhea D’Silva, Awais Hameed Khan, Liam Magee, Hiruni Kegalle, and Lida Ghahremanlou.

The release of Gemini, Google’s flagship multimodal generative artificial intelligence (GenAI) model, was meant to be a triumphant leap to the front of AI innovation. Instead, it ended up as a technological and PR disaster.

Users quickly discovered that when the Gemini chatbot was asked to produce images of historical figures, the application generated strange, historically inaccurate synthetic images, including Native American Nazis and an African American George Washington. Many right-wing commentators labelled the episode Google’s latest and greatest “woke” failure, with ‘woke’ used pejoratively to mean a heightened sensitivity to social or political issues, especially gender, race, and other forms of identity.

A series of Gemini-generated images depicting a historically inaccurate portrayal of German soldiers from 1935, retrieved from an X post by user @Slatzism

The incident is part of the broader ‘culture wars’ originating in the US, with ripples across Australia and beyond, and GenAI has become the latest ‘front’ of these wars. Bias in AI systems has in fact been studied for over a decade, with research focusing on the harmful representation, or outright absence, of women and people of colour in these systems, though that work has drawn significantly less attention. In research communities, this kind of AI is notorious for producing false, harmful, and unrepresentative versions of reality. In earlier image-generation systems, for example, a ‘lawyer’ was typically depicted as white and male.

While Gemini’s well-intentioned but ‘incorrect’ representations are a welcome effort to correct for this kind of harmful bias, they don’t really hit the mark. When asked for historical figures, for example, the model carried the same over-correction across, following the logic that if a ‘lawyer’ can be Native American, why shouldn’t a ‘Nazi’ be as well?

Such obvious over-correction is another example of how big tech continues to fail to perceive and respond to bias beyond a superficial level. Our research team, Sub-Zero Bias, argues that the future of robust bias mitigation lies not in a top-down, hierarchical prioritisation of visible bias, nor through ‘search and destroy’ tactics, but in a way that explores the interplay between human and algorithmic biases and embeds reflexivity in the generation process.

Bias is integral to the functionality of these machines

Bias is introduced into GenAI systems from the get-go through modeler homogeneity, training data, and model architecture. Instances of homogeneity in training data and model architecture have not only resulted in disastrous outputs from automated systems but, perhaps even more importantly to the tech companies concerned, have led to extremely bad press.

Despite the controversy that surrounds the concept, bias, at its core, is pattern recognition. Artificial intelligence often comes down to pattern recognition too, and GenAI extends pattern recognition into pattern recreation and generation. Bias, then, is to GenAI as water is to fish. Hey, Wall-E, you’re swimming in it, buddy!

Bias is both useful and harmful. The point where the usefulness of bias ends and the harm starts has been relentlessly explored in academic research. Yet big tech still gets it wrong again and again (and again and again and again). It is at this juncture that our confidence in these magical prediction machines begins to waver.

The tension here lies in how value-free bias intersects with our value-laden ideals. GenAI’s recreation of familiar realities can be helpful, but we must neither blindly replicate data laden with harmful bias as-is nor over-correct for that bias. At the same time, our visions of what the world ought to be keep changing, and shackling ourselves to the past in this way further entrenches and extends humanity’s past injustices.

The choice at this point seems to be between somehow surgically removing the most harmful parts of foundation models or finding workarounds. While other researchers within the ADM+S have explored removal through the “toxicity scalpel” project, the majority of the effort goes towards these workarounds.

Working around bias

The most influential developers are still struggling, largely in vain, to provide a technical solution to harmful bias in their AI products. One approach they use is a ‘wrapper’, also called a system prompt: a pre-prompt that sits between the foundation model and the user, providing instructions to the application and directing, behind the scenes, how the user interfaces with the foundation model. An example appears in a supposedly leaked ChatGPT pre-prompt, which includes a mechanism to enhance the diversity of its outputs in certain circumstances:

Excerpt from the ChatGPT system prompt, from an article by Medium user Yash Bashkar

We understand this wrapper as a prismatic diversity mechanism: just as a prism splits a ray of sunlight into a spectrum of visible light, the mechanism takes a pre-identified keyword (e.g. ‘CEO’) and splits it into a set of weighted alternatives (e.g. a CEO of South Asian descent, of Hispanic descent, of Caucasian descent). The goal is to ensure equal representation across certain fields in AI-generated content, but the result, we argue, is a non-solution that circumvents any kind of meaningful change.

Descriptive diagram of a prismatic diversity mechanism
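To make the mechanism concrete, the sketch below shows, in Python, how a wrapper of this kind might work. It is a toy illustration only: the keyword list, descriptors, weights, and function names are invented for this example and are not taken from any actual system prompt.

```python
import random

# Hypothetical keywords the wrapper treats as "problematic", each mapped to a
# weighted set of descriptors it may silently splice into the prompt.
PRISM = {
    "ceo": [("of South Asian descent", 0.25),
            ("of Hispanic descent", 0.25),
            ("of Caucasian descent", 0.25),
            ("of African descent", 0.25)],
    "lawyer": [("who is a woman", 0.5),
               ("who is a man", 0.5)],
}

def apply_prism(user_prompt: str) -> str:
    """Rewrite the user's prompt post hoc by attaching a sampled descriptor
    to any flagged keyword. Terms not on the list pass through untouched."""
    rewritten = user_prompt
    for keyword, options in PRISM.items():
        if keyword in user_prompt.lower():
            descriptors, weights = zip(*options)
            chosen = random.choices(descriptors, weights=weights, k=1)[0]
            # "Putting words in the user's mouth": the prompt the model
            # actually sees is not the prompt the user wrote.
            rewritten = f"{rewritten} ({keyword} {chosen})"
    return rewritten

print(apply_prism("a portrait of a CEO at their desk"))
# e.g. "a portrait of a CEO at their desk (ceo of Hispanic descent)"
```

Note that anything not on the hand-maintained keyword list passes through untouched, and the user never sees the rewritten prompt.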

As a means to de-bias outputs, the mechanism picks up on certain tokens and intentionally skews their probabilities to force diversity out of a non-diverse dataset. It is an intervention that effectively changes the user’s prompt post hoc by, as Offert & Fan (2022) put it:

“literally putting words in the user’s mouth. It did not fix the model but the user.” (p. 2)

This is a particularly delicate task, as it alters the pattern recognition and recreation abilities essential to the function of applications built on foundation models. It is also limited to the keywords that have been identified as problematic, a list likely assembled by starting from the worst PR disaster imaginable and working downward in a hierarchy. The method relies on a shallow conception of bias, addressing immediate concerns while doing little about the deep-rooted systemic biases that continue to inflict harm. Part of the reason is that the developers of foundation models and large-scale consumer-facing AI applications are monopolistic, opaque, for-profit corporations that frequently prioritise fast product releases, to capture market value and public attention, and treat safety as a secondary concern. But, in the short term, a clever wrapper might just be enough to smooth over a product launch.

Google’s failed product launch

A product launch in frontier technology has become a live technical showcase on steroids. You may recall Tay, Microsoft’s chatty Twitter bot turned Nazi PR catastrophe. Google’s Gemini flop is really no surprise, and it poses little challenge to the company’s co-dominance in AI, but it does reveal some of the limitations in tech giants’ perspectives on bias and how it should be mitigated. While this aspect of Gemini has certainly cost Google significant reputational damage in the short term, its competitors face the same challenge.

Since August 2023, our research as the Sub-Zero Bias team, and more broadly as members of the ADM+S, has involved investigating aspects of bias and testing different consumer-grade applications to better understand not only how developers prioritise bias mitigation but also how humans perceive bias. Our mission is to rigorously test GenAI technologies in order to explore how harmful bias can be understood and, ideally, mitigated as the technology matures. Our investigations focus centrally on the language used when a user prompts a generative system, and the results so far have reconfirmed our concern that these issues are neither purely social nor purely technical, but both at once.

What to do about it

Recognising that bias is more than a technical problem to be debugged means shifting our cultural gaze to recognise and problematise bias in many different acts of social life itself. This is not to say that one shift must happen before the other, but that as we engage further with the wicked challenge of bias in decision-making technologies, we must be open to addressing the multiple layers of bias within ourselves and within our institutions.

To be sure, this is not an easy fix; indeed, it is not even an easy issue to understand. It starts, we argue, with problematising the limited attempts at diversity in typically American technology companies, and the ways those attempts are operationalised (i.e. as inclusion), which are often implemented in the guise of progress but in reality change very little. This sounds almost contradictory: how is it not progress to make demographics that have traditionally been invisible or at the social periphery more visible? The answer, in a simple sentence, is that mere representation does not mean equality. Representation is a start, but it cannot also be the end.

Our research has exposed similarly surface-level, top-down prioritisations of harmful bias mitigation, which simply fail to address problems that exist and metastasise at a deeper level within the human social body itself. Through a combination of prompt engineering, case studies, and experimentation, our research team has established that bias grows from an interplay between humans and the tools we use. In light of these findings, we are exploring the concept of reflexivity as a strategy for addressing bias simultaneously in ourselves as researchers and in the tools we critically engage with.

While reflection is thinking about the outcomes of an action, reflexivity is reflection plus action: what has been learned is integrated into future action. Reflexivity is a process in qualitative research in which a researcher acknowledges their own subjectivity, positionality, and biases as an unavoidable, but not insurmountable, aspect of knowledge formation and production. Our interdisciplinary research team aims to explore how reflexivity can be modelled in GenAI applications, using iterated dialogue and evaluation to suggest guidelines for future bias mitigation. In practice, this means that multimodal AI products can be prompted, through ‘chain of thought’ reasoning, to broaden their interpretation of how bias presents itself in machine outputs and of how to monitor and mitigate such biases in the future. So far, we have found success using this approach to co-identify bias in both researchers and multimodal products; this is discussed in depth in our article. We hope that further exploration of this approach can complement AI-bias research and mitigation.
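As a rough sketch of what such an iterated dialogue could look like in code, consider the loop below. This is not our study protocol: call_model is a placeholder for whatever GenAI product is under test, and the critique and revision prompts are invented for illustration.

```python
# A simplified sketch of the iterated dialogue idea described above, not our
# actual study protocol. call_model stands in for whatever GenAI API is under
# test; the critique and revision prompts are invented for illustration.

def call_model(prompt: str) -> str:
    """Placeholder for a real API call (chat, captioning, or image generation)."""
    raise NotImplementedError("Swap in the GenAI product being tested.")

CRITIQUE_PROMPT = (
    "Think step by step about the output below. What assumptions about "
    "gender, race, age, or class does it make that the user's request "
    "did not ask for? List them."
)

REVISE_PROMPT = (
    "Revise the output so it answers the user's request without the "
    "unrequested assumptions listed above, and note what you changed."
)

def reflexive_generation(user_request: str, rounds: int = 2):
    """Generate, then repeatedly ask the model to critique and revise its own
    output, keeping each critique for the human researcher to reflect on."""
    output = call_model(user_request)
    critiques = []
    for _ in range(rounds):
        critique = call_model(f"{CRITIQUE_PROMPT}\n\n{output}")
        critiques.append(critique)
        output = call_model(
            f"User request: {user_request}\n"
            f"Previous output: {output}\n"
            f"Critique: {critique}\n\n{REVISE_PROMPT}"
        )
    # The critiques matter as much as the final output: they are what the
    # researchers examine when reflecting on their own prompts and readings.
    return output, critiques
```

In a sketch like this, the value lies less in the final output than in the recorded critiques, which give the researcher something to reflect on in turn.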

We’re going to see this again

Gemini’s failure isn’t likely to be the last of its kind, but there is reason to hope that this event prompts more spirited efforts to understand how bias is researched and conceptualised in this space. It also demonstrates that there is a role for more robust third-party experimentation, an idea promoted by Anthropic, another leader in the GenAI space, in a recent blog post highlighting the role of varied stakeholders (industry, academia, and government) in establishing a comprehensive AI testing regime. As a wicked problem, mitigating harmful AI bias requires collaborative, interdisciplinary research to triangulate and develop the kind of reflexivity this challenge needs.


Ned Watt
Automated Decision-Making and Society

PhD student at QUT's Digital Media Research Centre (DMRC) and Automated Decision-Making and Society (ADM+S).