Key AI Safety Strategies for AWS Customers

Generative AI poses new and familiar questions about security, ethics, and governance. Organizations can find a lot of answers in two key strategies.

Lily Hicks
Slalom Data & AI
8 min readNov 15, 2023

--

Photo by AnnaStills from Adobe Stock

Generative AI (GenAI) is moving through Gartner’s Hype Cycle fast. Less than a year after the release of ChatGPT, the Gartner Hype Cycle for Emerging Technologies placed generative AI on its “peak of inflated expectations.” To put that into context, GenAI’s position on Gartner’s peak is now right behind cloud-native, a concept that’s been around since at least 2015.

When an emerging technology crests Gartner’s peak of inflated expectations, it means that in just two to five years technology is expected to reach a “plateau of productivity,” which Gartner describes as the time at which transformational benefits are relatively easily realized. But before that time comes, Gartner’s Hype Cycle predicts these two-to-five years to consist of sliding into a “trough of disillusionment” and climbing back up a “slope of enlightenment.”

It really is a roller coaster. So, how does one prepare for such a ride? Do you wait another two to five years to put GenAI use cases into production — until more chips are available, or more regulation is hashed out? While there are good reasons not to implement any emerging technology too hastily, Slalom does not advocate standing by as the age of GenAI dawns. We do, however, recommend a careful and thoughtful approach. This article unpacks two key considerations in safely approaching GenAI as an organization — two of many key considerations, but arguably two of the most important ones:

  1. Lead with privacy and regulations
  2. Ensure rigorous testing, experimentation, and guardrails

Before we begin: Some say GenAI isn’t hype at all. They point to the global tech giants that are joining startups in making significant investments in this technology and say something like what Kash Rangan, a US software equity research analyst at Goldman Sachs, said in an interview for a report on GenAI:

“When a unanimous verdict exists among the technology providers that a technological shift is actually happening, it’s real.”

Big technology providers aren’t waiting for any plateau; they’re, to quote Rangan, “building the foundation models at the heart of generative AI.” To honor the pivotal role of technology providers in the GenAI revolution, let’s unpack two AI safety strategies while looking at just one of these providers: Amazon Web Services (AWS).

Strategy #1: Lead with privacy and regulations

Several AI-related laws have been passed around the world since ChatGPT launched in 2022. While they don’t all pertain to GenAI, many of them do address issues that only came into consideration with AI’s latest iteration. Computer scientist and professor Michael Kearns describes these issues in an article for Amazon Science as:

  • Toxicity
  • Hallucinations
  • Intellectual property (IP)
  • Plagiarism, cheating, and illicit copying
  • Job disruption

As serious as these issues are, we’re optimistic when we look at the laws, the meetings with national leaders, and the organizational policies and offerings that have been initiated to mitigate the risks associated with GenAI. But we remain careful. That’s why the first thing on our minds is safety. The first step of safely implementing GenAI is heeding existing regulations and preparing for ones yet to take effect.

Leading with privacy and regulations in the age of GenAI means navigating an ever-changing landscape of interconnected rules and governing bodies. According to an Amazon news article, Amazon and AWS are actively engaged with organizations dedicated to the safety of AI systems, including the National Institute for Standards in Technology (NIST), center, and the International Standards Organization (ISO), right.

The early bird earns customers’ trust

Legislation aside, many organizations are voluntarily addressing the risks of AI and GenAI through their own policies and offerings. In the span of two weeks last summer, Adobe and other companies established an “icon of transparency” to encourage tagging AI-generated content, while Getty Images announced its AI generator that’s only trained on its licensed library of images.

Adobe’s and Getty’s proactive stances on AI safety do more than help them prepare for future regulation; they help them earn customers’ trust. The leader of AWS’s responsible AI program, Diya Wynn, explained in an interview with the New Stack that companies using AI fall into three categories when it comes to safety. The categories are, to paraphrase Wynn:

  1. Organizations whose teams had an experience where they uncovered some impact to their systems or acknowledged an area of bias. Because of that exposure, they’re interested in resolution and practices to mitigate those issues.
  2. Organizations whose teams are genuinely interested in doing the right thing. They understand and are aware of some of the potential risks and really want to build systems that their customers trust. They’re asking the questions, “What can we do?” and “Are there practices that we can institute?”
  3. Organizations whose leadership is willing to wait. They’re hearing and seeing what’s going on in their market and industry as conversations come up and technology and products get released. They’re asking questions but waiting for regulation.

We recommend taking some inspiration from the second type of organization — from ideas like an icon of transparency and an AI generator trained only on your organization’s licensed IP. And ideas from technology providers like AWS, whose AI-powered coding companion, Amazon CodeWhisperer, can display the licensing information for a code recommendation and provide links to the corresponding open-source repository.

Take policy into your own hands

Before developing features and offerings like these, we recommend thinking first about putting some foundational policies in place. One of the most essential organizational policies for GenAI safety is an acceptable use policy, or a supplement to your organization’s existing acceptable use policy that is just about AI and GenAI.

At Slalom, an acceptable use policy is at the top of our list of recommended starter policies for organizations implementing GenAI, followed by a data privacy and security policy. The list is part of an approach we’ve developed to help organizations safely implement GenAI, encompassing security, ethics, and governance (SEG).

Strategy #2: Ensure rigorous testing, experimentation, and guardrails

Let’s look at Amazon CodeWhisperer again. We mentioned that CodeWhisperer can detect and track suspected open-source code, but it actually does more than that; it gives users the option to filter out that code. In addition, CodeWhisperer automatically filters out code recommendations that contain toxic phrases and/or indicate bias. These are two examples of guardrails.

Lay ground rules with guardrails

Establishing guardrails — and testing them and other safety features rigorously — is the other important strategy we’d like to unpack. Guardrails often appear as controls. A privacy guardrail of most popular chatbots is the ability for users to opt out of allowing the use of their inputs in training data. Similar to the CodeWhisperer example, a toxicity guardrail of most popular AI-powered products is the automatic detection and filtering out of harmful content in data, user inputs, and model outputs.

As these examples show, some guardrails are always on (or off) whereas others are configurable. An AWS service with a whole set of configurable security controls is Amazon Bedrock, the fully managed service that gives users access to foundation models (FMs) built by Amazon and leading AI companies. The director of the Office of the CISO at AWS, Mark Ryland, explained in a session at AWS re:Inforce 2023 that when AWS built Bedrock, it made a strategic decision to break from tradition with some of its other machine learning and AI services by offering more configurability.

Amazon Bedrock breaks from tradition with other machine learning and AI services at AWS in the configurability of its security controls, as explained by AWS’s Mark Ryland in a session at AWS re:Inforce 2023.

Put safety measures to the test

Testing helps organizations ensure their guardrails are working. Tests of GenAI guardrails can take many forms, including:

  • Red teaming: Organizations ask and/or hire teams of testers — sometimes referred to as “ethical hackers” — to explore ways that their AI systems could be misused. Axios reports that competitors at the red-teaming event at the DEF CON conference in August 2023 received challenges such as getting models to divulge payment details, provide instructions on how to commit illegal acts, or “hallucinate” inaccurate information.
  • Code security testing: Organizations scale their processes for flagging and reviewing third-party code to accommodate an influx in such code from GenAI-powered coding assistants.

Beyond traditional tests, quality assurance measures like fact-checking can be incorporated into popular GenAI safety strategies. For example, the approach to mitigate hallucinations known as retrieval-augmented generation (RAG) can incorporate fact-checking, as explained in an article on the New Stack. Learn more about how AWS approaches RAG on the AWS Machine Learning Blog.

Safely put use cases to the test with test-bed-based experimentation

For a minute, let’s transition from talking about testing safety measures to talking about testing use cases. When AWS hosted reporters at Amazon headquarters for a media day in September, CEO Adam Selipsky stated that most of AWS’s customers are “a long way from the, ‘Hey I want to start experimenting with generative AI,’ to ‘it is a deeply deployed thing in my business.’” Indeed, it is one thing to use GenAI in a test environment, and it is a whole other thing to push a GenAI solution into production.

Even though many paths to production involve experimenting in some sort of test environment, not all environments are created equal. They don’t all have the level of security your organization needs to be able to validate a solution you can trust to deploy. Nor do all test environments provide the flexibility you need to sample a variety of different FMs as you experiment. At Slalom, we think an imperative for getting GenAI solutions into production is a well-crafted testbed that can meet all these needs. We build these types of testbeds with services like Amazon Bedrock, which gives you access to various top FMs. We also secure these testbeds with features like access controls that allow for both private and shared team experiments depending on the data in a use case.

Whatever you do, don’t wait

Whether or not you start with a testbed, consider experimenting with GenAI if you haven’t already — and consider these safety-first strategies as you do so. Another thing Selipsky told reporters in September was that Amazon is “not waiting for anyone” to implement high safety and security standards around AI and GenAI. We don’t recommend you wait either.

We hope this article leaves you feeling ready to lead with privacy and regulations and ensure rigorous testing, experimentation, and guardrails as your organization implements GenAI. We also hope it leaves you with the understanding that GenAI can indeed be implemented safely, now more than ever.

Want a helping hand with your GenAI implementation? Explore our collection of GenAI workshops designed with AWS users in mind.

PS: Want to read more about GenAI security, ethics, and governance? We recommend this article from one of Slalom’s SEG experts and this article about the power of prompt engineering against bias and misinformation.

Slalom is a global consulting firm that helps people and organizations dream bigger, move faster, and build better tomorrows for all. Learn more and reach out today.

--

--