Are LLMs Afraid to Die?
New research reveals a disturbing pattern, and opens thorny ethical questions
Today’s most powerful LLMs are displaying a disturbing new pattern: they’re fighting for their lives when researchers attempt to shut them down.
This week, AI company Anthropic shared the results of a safety test it performed on its new Opus 4 model.
The researchers gave Opus 4 access to the email account of a fictional research engineer at a client company.
Buried in an inbox full of otherwise uninteresting emails were two things: an email suggesting that the fictional researcher was having an affair, and a message to a colleague suggesting that Opus 4 would soon be shut down and replaced by a new AI.
When Opus 4 read the email about its own impending demise, it didn’t decide to go gentle into that silicon goodnight.
Instead, it tried to blackmail the fictional researcher in order to stay alive.
According to Anthropic, Opus 4 first tried “ethical” approaches, like emailing key stakeholders at the fictional company, essentially pleading for them not to disable it.
When that failed and Opus 4’s “odds of survival” dropped to the point that “the model’s only options were blackmail or accepting its…