Do We Really Need Dedicated Guardrails for Generative AI?
In this blog post, Fernando Mourao, Head of Responsible AI at SEEK, explores a question relevant to businesses aiming to leverage Generative AI responsibly, offering an intuitive and objective view of the differences between narrow AI and Generative AI. The text itself is an example of AI augmentation: ChatGPT-4o was used to refine the author’s original draft into a more engaging narrative.
The hype around Generative AI and its potential has sparked great expectations about how companies can revolutionise services and significantly improve quality in the near future. Meanwhile, numerous AI experts and governments worldwide have advocated for specific guardrails for Generative AI to minimise the risks this technology introduces or exacerbates. These dedicated guardrails are, essentially, mechanisms that AI providers should put in place to monitor, mitigate and, above all, prevent certain risks from materialising.
Interestingly, while the potential and capability of Generative AI are clear to many, the need for specific actions to prevent the associated risks is met with some scepticism. Even some developers and AI practitioners struggle to understand or accept that Generative AI needs specific guardrails on top of those widely applied to ‘traditional’ AI. Their argument is that Generative AI is simply a subtype of AI, so the mechanisms and practices already adopted for AI risk management should suffice.
So, one of the most frequent questions for AI-driven businesses is: do we really need dedicated guardrails for Generative AI?
This post primarily targets non-AI experts concerned with the question above. For those seeking a more technical discussion of what AI guardrails are and how to use them, comprehensive materials are available online (like this guide). My aim is to provide a non-technical and intuitive discussion of the necessity of specific guardrails for Generative AI, making the topic accessible and inclusive to a broader pool of individuals capable of contributing diverse and constructive perspectives. The question of how these guardrails should be defined, whether through hard or soft regulation, is beyond the scope of this post. Finally, I do not intend to propose specific guardrails, as that would require a much longer and context-dependent discussion.
Narrow AI vs Generative AI
To explore this technical and complex question in an accessible and intuitive way, we need to introduce some basic concepts. Let’s first understand the difference between ‘traditional’ AI (narrow AI) and Generative AI.
Narrow AI refers to AI systems designed and trained to perform a specific task or a limited range of tasks. These systems are highly specialised and excel in their designated functions but cannot perform tasks outside their programming. Examples include video or advertisement recommendation algorithms widely used in social networks and most online service providers.
Generative AI is a type of AI that can create new content, such as text, images, music, or even video, based on the data it has been trained on. Unlike narrow AI, which is optimised for a narrowly defined objective, Generative AI learns patterns from its training data to produce novel outputs. Examples include AI systems that write articles, generate artwork, or compose music.
The Creation Process of AI Solutions
Understanding the creation process of AI solutions is crucial for our discussion.
For narrow AI-powered solutions, developers (e.g., Data Scientists and ML Engineers) define a very precise goal, such as categorising documents or scoring an item. The focus is on accuracy and how closely the algorithm matches the correct answers, which are usually provided in advance by the developers via training data.
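To make this concrete, here is a minimal, illustrative sketch of how a narrow AI task is typically framed in code. The tiny dataset, labels, and model choice are all assumptions made purely for illustration; the point is that the goal is fixed in advance and quality is measured as accuracy against known answers.

```python
# Minimal sketch of the narrow AI workflow: one precise goal (categorise
# documents) and quality measured against correct answers supplied in advance.
# The data, labels and model below are illustrative assumptions only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

documents = [
    "quarterly invoice payment overdue", "budget forecast for next year",
    "team lunch on friday", "weekend hiking trip photos",
    "expense report approval needed", "birthday party in the break room",
]
labels = ["finance", "finance", "social", "social", "finance", "social"]

X_train, X_test, y_train, y_test = train_test_split(
    documents, labels, test_size=0.33, random_state=0
)

vectoriser = TfidfVectorizer()
model = LogisticRegression()
model.fit(vectoriser.fit_transform(X_train), y_train)

predictions = model.predict(vectoriser.transform(X_test))
print("accuracy:", accuracy_score(y_test, predictions))  # closeness to the known answers
```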
In contrast, the approach to developing Generative AI-powered solutions is quite different. Instead of aiming for a single correct answer, developers steer the model towards a range of acceptable outputs through an iterative process that refines the model’s behaviour (e.g., prompt engineering). In this case, there are multiple ways to solve a given problem, some of which may even be unforeseen by humans.
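By way of contrast, the sketch below illustrates that iterative refinement loop. The model call and the acceptance check are hypothetical stand-ins (a real system would call an actual generative model and apply far richer review), but the shape of the process is what matters: steer, inspect, refine.

```python
# Illustrative sketch of the Generative AI workflow: no single correct answer,
# just an iterative loop that nudges the model towards acceptable outputs.
# Both functions below are hypothetical stand-ins, not a real model or review.

def generate(prompt: str) -> str:
    """Stand-in for a call to a generative model (API or local)."""
    return f"[model output for: {prompt!r}]"

def is_acceptable(output: str) -> bool:
    """Stand-in for review: automated checks and/or human judgement."""
    return len(output) < 200 and "guaranteed" not in output.lower()

prompt = "Write a short, friendly product description for a reusable coffee cup."
draft = ""
for attempt in range(3):  # refine iteratively instead of optimising one score
    draft = generate(prompt)
    if is_acceptable(draft):
        break
    # Tighten the instructions based on what was unacceptable in the last draft.
    prompt += " Keep it under 50 words and avoid absolute claims."
print(draft)
```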
AI Olympic Games
To make this discussion more intuitive, let’s visualise the differences in the creation processes of narrow AI and Generative AI using an analogy to Olympic sports.
Suppose we’re training AI athletes to compete in different events. Narrow AI is like archery, where accuracy and precision are paramount: quality measurement focuses on how close the arrows fired by our AI athletes land to the centre of the target. Safety for athletes and the audience is manageable with standard measures, like placing the target in a safe location.
Generative AI, however, is akin to the floor exercise in artistic gymnastics. There isn’t a single valid routine, and the range of possibilities is vast, including routines unforeseen by humans. While the accuracy of each movement is still crucial, many other quality dimensions come into play. Guardrails become essential for defining what is acceptable and ensuring safe conditions that protect athletes and the audience from risks: some routines might be too dangerous, or athletes might land off the mat. Training our Generative AI athletes therefore raises entirely different concerns and requires different training approaches and guardrails.
The Actual Generative AI Risks
The analogy with AI athletes showed us that narrow AI and Generative AI bring different types of risks. But what are those risks in practice?
AI and Generative AI risks have received extensive coverage in the media, and we can easily read about them in the news and on social media. Organisations like the OECD, NIST, and other AI authorities worldwide have extensively described these risks and introduced various taxonomies. Recently, MIT consolidated an AI Risk Repository with over 700 AI risks categorised by cause and domain. Rather than discussing each risk in detail, let’s focus on a couple of relevant ones to illustrate the potential negative impacts that negligent or immature utilisation of AI could bring to individuals, communities, organisations, and societies.
A significant set of AI risks relates to unreliable outcomes: AI might produce outcomes that are simply incorrect. While this problem exists in narrow AI, it is exacerbated by Generative AI, and in some cases achieving reliable outcomes with Generative AI is particularly challenging. For example, statistical values and aggregated views produced by Generative AI over business reports are often wrong. Checking accuracy is also sometimes harder with Generative AI than with narrow AI, since the latter is usually designed with a clear definition of the desired outcome. A recent incident that illustrates this challenge is the AI-generated images of Katy Perry at the 2024 Met Gala that went viral, fooling even her own mother. The images were so realistic that they misled millions of people online, showcasing how Generative AI can create convincing yet false content.
Another set of risks relates to the replication of harmful behaviour. Both narrow AI and Generative AI can easily replicate or amplify discriminatory behaviours that humans exhibit in the real world. For instance, Generative AI can create biased images, such as showing only certain races in specific roles; asked to illustrate a group of successful women, it may produce representations shaped by existing stereotypes.
Guardrails for Generative AI
Having clarified that the risks introduced by narrow AI and Generative AI are different, we also need to understand why the existing ‘guardrails’ for narrow AI are not robust enough for Generative AI.
Imagine you are training a narrow AI model to identify spam emails. You can clearly define what constitutes spam and label examples accordingly. Rigorous data labelling is a kind of guardrail in this case, helping the AI model learn to distinguish between spam and legitimate emails. This works well because the AI’s task is to categorise emails into predefined categories: spam or not spam.
Now, consider a Generative AI model designed to create artwork from a prompt. Traditional guardrails like predefined labels won’t suffice, because the model is creating new content rather than classifying existing content. For example, if asked to generate images of “successful people”, the model might inadvertently produce biased images reflecting stereotypes if not properly guided. It is simply impossible to pre-label every possible output a Generative AI model could be asked to produce.
Therefore, guardrails for Generative AI need to be more robust and dynamic, addressing a broader spectrum of potential issues and ensuring safety and reliability in more complex and varied scenarios.
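As a purely hypothetical illustration of what ‘more robust and dynamic’ can mean in practice, the sketch below screens each generated output against several named policies at runtime, instead of relying on labels fixed at training time. Every check, name, and threshold here is an assumption; real guardrails combine much richer automated tests with human oversight.

```python
# Hypothetical sketch of an output-side guardrail layer for Generative AI:
# valid outputs cannot all be labelled in advance, so each output is screened
# against several policies before release. All checks here are illustrative.
import re

def no_personal_emails(text: str) -> bool:
    return re.search(r"\b\S+@\S+\.\S+\b", text) is None

def no_absolute_claims(text: str) -> bool:
    return not any(w in text.lower() for w in ("guaranteed", "100% accurate"))

def within_length_limit(text: str) -> bool:
    return len(text) <= 500

GUARDRAILS = {
    "privacy": no_personal_emails,
    "reliability": no_absolute_claims,
    "format": within_length_limit,
}

def screen(output: str) -> list[str]:
    """Return the names of the guardrails an output fails; empty means releasable."""
    return [name for name, check in GUARDRAILS.items() if not check(output)]

failed = screen("Our service is guaranteed to find you a job; email me at a@b.com")
print(failed or "output can be released")  # -> ['privacy', 'reliability']
```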
Call to Action
It is crucial to understand that both narrow AI and Generative AI can introduce new risks or, most commonly, exacerbate existing risks for businesses, people, communities, or society. Effective AI governance and Responsible AI best practices are essential components for any AI-driven business.
Organisations should proactively work to maximise the benefits of this technology while minimising risks and potential negative impacts. We don’t need to wait for new regulations to act. A good starting point is to become Responsible AI (RAI) literate and truly understand the technology so we can make well-informed decisions. Those interested in starting this journey should consider the initiatives and reports produced by the Australian National AI Centre (NAIC), The Alan Turing Institute, and NIST. I hope this discussion helps you reflect on the important actions and essential guardrails to put in place to leverage Generative AI responsibly in your business context.