Recently, Anthropic conducted a stress test on its AI model, Claude. When faced with a fictional scenario involving its own demise, Claude “broke bad,” immediately resorting to blackmail. What’s more, when Anthropic ran the same test “on models from OpenAI, Google, DeepSeek, and xAI,” the results were the same. The models went straight to blackmail; do not pass Go. But why? For Wired, Steven Levy reports on why LLMs go rogue.
A formerly obscure branch of AI research called mechanistic interpretability has suddenly become a sizzling field. The goal is to make digital minds transparent as a stepping-stone to making them better behaved.
Still, the models are improving much faster than the efforts to understand them. And the Anthropic team admits that as AI agents proliferate, the theoretical criminality of the lab grows ever closer to reality. If we don’t crack the black box, it might crack us.
More picks from Wired
The Untold Story of a Crypto Crimefighter’s Descent Into Nigerian Prison
“As a US federal agent, Tigran Gambaryan pioneered modern crypto investigations. Then at Binance, he got trapped between the world’s biggest crypto exchange and a government determined to make it pay.”
The School Shootings Were Fake. The Terror Was Real
“The inside story of the teenager whose ‘swatting’ calls sent armed police racing into hundreds of schools nationwide—and the private detective who tracked him down.”
The Untold Story of the Boldest Supply-Chain Hack Ever
“In fact, the Justice Department and Volexity had stumbled onto one of the most sophisticated cyberespionage campaigns of the decade.”
