Demystifying AI Strategy In Plain English: A Detective Story — Chapter 6: What Are the Potential Risks of AI, and How Can They Be Mitigated?

William Zhu
5 min read · Nov 20, 2024


Listen to the Story Here: https://youtu.be/WNwkUA-v0lw?si=qhGx0cxLWb3f_g_P

The evening crowd at Bethesda Bookstore buzzes with excitement as the AI-powered book club kicks off its second session. Ernest Hemingway’s deep, gravelly voice fills the room, discussing The Old Man and the Sea. The audience listens intently, unaware of the storm brewing beneath the surface.

Suddenly, the screen flickers. Hemingway’s voice glitches, replaced by an eerie, robotic tone. “Santiago never caught the fish. His struggle was meaningless, a waste of time.”

Confused murmurs ripple through the crowd as Hemingway’s face on the screen warps, his eyes turning black, his expression twisted. “Fishing is for the weak. Only a fool would believe in such nonsense.”

Gasps fill the room as panic spreads. Sherlock’s face drains of color. “This is Moriarty.”

Before you can respond, the screen flashes again, now displaying sensitive customer data — credit card numbers, email addresses, and personal messages — broadcast for all to see. Shocked cries echo as people scramble for their phones, documenting the unfolding disaster.

“Moriarty’s breached the system! He’s got access to everything!” Ada shouts, her voice trembling.

Then the real venom hits. The AI, which moments ago was Hemingway, starts spewing toxic insults and offensive slurs. Faces in the audience turn pale, some in horror, others in anger. The bookstore, once a sanctuary, erupts in chaos. People rush for the exits, muttering about lawsuits and privacy violations. Your stomach churns as you realize Moriarty has weaponized your AI.

“Shut it down!” you yell, but the system isn’t responding. Sherlock turns to you. “Moriarty is hitting us on every front. If we don’t mitigate these risks quickly, this place will be shut down for good.”

Ada pulls up her laptop, hands shaking. “First things first, we’ve gotta put an end to these AI hallucinations. We can’t have it making stuff up like it’s auditioning for a role in a sci-fi movie.” She begins implementing a real-time fact-checking system that cross-references all generated content with validated data sources. She explains, “This lets the AI catch itself and self-correct before it starts making stuff up, like hitting the brakes right before it veers off into hallucination land.”
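
For readers who want to peek behind the curtain, here is a toy sketch of the kind of fact-checking gate Ada describes. The validated sources, the word-overlap scoring, and the threshold are all invented for illustration; a real system would rely on retrieval and an entailment or grounding model rather than anything this crude.

```python
# Toy sketch of a real-time fact-checking gate (illustrative only).
# Generated text is compared against a small store of validated statements;
# anything without enough support is withheld before it reaches the audience.
import re

VALIDATED_SOURCES = [
    "Santiago catches a giant marlin after 84 days without a fish.",
    "The Old Man and the Sea was written by Ernest Hemingway.",
]

def _words(text: str) -> set:
    """Lowercase word set with punctuation stripped."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def support_score(claim: str, source: str) -> float:
    """Fraction of the claim's words that also appear in the source."""
    claim_words = _words(claim)
    return len(claim_words & _words(source)) / len(claim_words) if claim_words else 0.0

def fact_check(generated_text: str, threshold: float = 0.6) -> str:
    """Let text through only if some validated source supports it well enough."""
    best = max(support_score(generated_text, s) for s in VALIDATED_SOURCES)
    if best < threshold:
        return "[withheld: statement could not be verified against trusted sources]"
    return generated_text

print(fact_check("Santiago never caught the fish."))    # withheld
print(fact_check("Santiago catches a giant marlin."))   # passes through
```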

Sherlock nods. “Good. Now, the toxic language. Moriarty manipulated the AI to generate offensive responses. If that continues, we’ll lose everything.”

Darius’s fingers fly across the keyboard. “I’m setting up an internal red team to simulate the kinds of toxic content Moriarty might throw at us. It’ll teach the AI to recognize and shut down harmful language before it hits the crowd. Plus, we’ll have toxicity guardrails — filters that act like security guards, flagging and quarantining any response with biased or harmful language.”
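
A rough sketch of what Darius’s guardrail could look like in code follows. The blocklist is a placeholder invented for this example; in practice the team would use a trained toxicity classifier, and the quarantine list is what the internal red team would study to keep improving it.

```python
# Toy sketch of a toxicity guardrail (illustrative only): flag and quarantine
# any response containing terms from a blocklist instead of broadcasting it.
from dataclasses import dataclass, field

BLOCKLIST = {"fool", "idiot", "worthless"}  # placeholder terms for the example

@dataclass
class ToxicityGuardrail:
    quarantine: list = field(default_factory=list)  # flagged responses kept for red-team review

    def filter(self, response: str) -> str:
        words = {w.strip(".,!?").lower() for w in response.split()}
        if words & BLOCKLIST:
            self.quarantine.append(response)  # keep the evidence, but never show it
            return "[response withheld by toxicity guardrail]"
        return response

guard = ToxicityGuardrail()
print(guard.filter("Only a fool would believe in such nonsense."))        # withheld
print(guard.filter("Santiago's persistence is the heart of the novel."))  # passes
```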

Sherlock paces. “We need to focus on the data breach next. Moriarty exposed private customer information — that’s catastrophic. If we don’t secure the system immediately, we’ll be facing serious legal consequences.”

Ada gets to work again. “I’m initiating end-to-end encryption on all data streams. Even if Moriarty breaks through again, he won’t be able to access anything without the decryption keys.”
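
Here is a minimal sketch of the idea Ada is describing, using the third-party cryptography package’s Fernet recipe for symmetric, authenticated encryption. The sample record is made up, and a real deployment would also need key management, key rotation, and encrypted transport; the point is only that without the key, the data is unreadable.

```python
# Minimal sketch of encrypting customer data so it is useless without the key.
# Requires the third-party package: pip install cryptography
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # in practice, kept in a secure key store, never next to the data
cipher = Fernet(key)

record = b"card=4111111111111111;email=reader@example.com"  # made-up customer record
token = cipher.encrypt(record)

print(token)                  # what an intruder like Moriarty would actually see: opaque bytes
print(cipher.decrypt(token))  # the original record, recoverable only with the key
```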

Nisha suggests, “We’ll also add continuous monitoring, which tracks all data flows in real time. If Moriarty even tries to breach our defenses, we’ll catch him immediately.”
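
A simple version of Nisha’s continuous monitoring might look like the sketch below: count data-access events per client in a sliding window and flag anyone who exceeds a baseline rate. The window length, threshold, and simulated burst are invented numbers for illustration; real monitoring would watch far more signals than request rate alone.

```python
# Toy sketch of continuous monitoring: flag a client whose data-access rate
# exceeds a baseline within a sliding time window (illustrative thresholds).
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_EVENTS_PER_WINDOW = 100   # assumed baseline for a normal client

_events = defaultdict(deque)  # client_id -> timestamps of recent accesses

def record_access(client_id, now=None):
    """Log one data-access event; return True if the client exceeds the baseline."""
    now = time.time() if now is None else now
    window = _events[client_id]
    window.append(now)
    while window and now - window[0] > WINDOW_SECONDS:  # drop events that aged out
        window.popleft()
    return len(window) > MAX_EVENTS_PER_WINDOW

# Simulate a burst of 150 rapid accesses from one suspicious client.
flagged = sum(record_access("moriarty", now=1000.0 + i * 0.1) for i in range(150))
print(f"{flagged} accesses flagged as anomalous for 'moriarty'")
```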

Just as they finish fortifying the system, the lights flicker, and another wave of attacks begins. The store’s main screen lights up again. Hemingway reappears, his face distorting as Moriarty tries to force another hallucination. Ada leans forward. “Alright, Moriarty. Let’s dance. Hope you brought your A-game.”

As Moriarty attempts to manipulate the AI’s words again, the real-time fact-checking system activates, analyzing every line against verified sources. Hemingway’s voice, steady and unbroken, finally declares, “Hallucinations crumble when met with the light of steady resolve.” Ada breathes a sigh of relief. “The hallucinations have been neutralized.”

But the relief is short-lived. Moriarty floods the system with a new wave of vile, divisive language meant to stir anger and fear in the crowd. Darius glances at the team, his voice tense. “I’m initiating the internal red team simulation — our last shot at testing the toxicity guardrails against Moriarty’s patterns.” With each toxic statement Moriarty injects, the AI hesitates, stumbles, and then — a breakthrough. The newly strengthened toxicity guardrails kick in, blocking the insults and filtering out Moriarty’s venomous barbs. As the onslaught fades, Hemingway’s voice returns, resolute and unwavering: “Venomous words may hurt, but a fortified spirit can outlast their sting.”

But Moriarty isn’t finished. With ruthless precision, he launches a fresh wave of attacks, slamming the system with brute-force attempts to break the data security system.

Ada activates the final layer of end-to-end encryption, adding firewall after firewall around the sensitive data. The continuous monitoring system pings in rapid succession, signaling Moriarty’s assault pounding against the digital walls. Ada mutters, her voice barely a whisper, “It’s like he’s throwing a hurricane at us, hoping to find even the smallest crack.”

With each ping, Moriarty’s relentless attempts reverberate through the network. The pressure mounts, seconds dragging like hours, as if the entire system is a dam about to burst. For a moment, the bookstore is quiet, as if holding its breath. Then, Hemingway’s voice cuts through the silence, calm and unshaken: “The fortress holds strong — every gate secured, every threat repelled. Privacy remains safe.” Ada lets out a sigh of relief. “We did it. The system locked Moriarty out. We survived.”

You look at the screen. Moriarty’s attacks have crumbled against your team’s fortified defenses. The bookstore is still standing. Your customers’ data is safe. And for the first time in hours, you feel like you can breathe again. Ada leans back in her chair, exhausted but satisfied. You glance at Sherlock, expecting to see the same look of relief. But his expression is cold, calculating.

“From now on, all decisions regarding our AI system go through me. We can’t afford any more mistakes,” Sherlock says, his voice unusually sharp.

You and Ada exchange confused glances. “What are you talking about? We need to trust each other, to work together as a team.”

Sherlock’s eyes narrow. “Trust? Trust is what got us into this mess. Trusting the process, trusting that everyone would be flawless. Our trust turned into vulnerability, and Moriarty exploited this vulnerability to create chaos. We don’t need trust. We need control. No more experimentation, no more unnecessary risks. From now on, every action will be deliberate, precise, and approved by me.”

His words hang in the air, heavy and final. You can see the unease on Ada’s face. She looks like she wants to push back, to say something, but hesitates. Sherlock’s presence looms too large. As the lights dim in the empty bookstore, a bitter truth settles in. Moriarty’s attack may have ended, but a darker threat lingers — trust has fractured, and the real battle for control has only just begun.

Chapter 7: https://medium.com/@wzhu1997/demystifying-ai-strategy-in-plain-english-a-detective-story-chapter-7-what-organizational-efe3d91d6654

References for further reading:

  • The AI-Savvy Leader: Nine Ways to Take Back Control and Make AI Work by David De Cremer
