U.S. government orders Anthropic to shut down its most powerful AI models over national security concerns
At a glance:
- U.S. government orders Anthropic to immediately disable Claude Fable 5 and Claude Mythos 5 worldwide citing national security concerns
- Anthropic complied but argues the government's evidence amounts to a narrow, non-universal jailbreak already present in rival models like OpenAI's GPT-5.5
- The move highlights tension between AI safety messaging and regulatory scrutiny as Anthropic prepares for an expected IPO
Government directive forces immediate global shutdown
The U.S. government on Friday ordered Anthropic to immediately shut off access to two of its most powerful AI models — Claude Fable 5 and Claude Mythos 5 — citing national security concerns. Anthropic announced on X that it has complied with the directive, which the company said it received at 5:21 p.m. ET, but made clear it thinks the government got this one wrong. The order forces the company to disable both models for all users worldwide, not just the foreign nationals the government's export control order was nominally aimed at, while access to Anthropic's other models remains unaffected.
This unprecedented action marks the first time a frontier AI lab has been compelled to recall commercially deployed models at government direction. The directive is framed as an export control action restricting foreign national access, yet its practical effect extends to hundreds of millions of users globally. Anthropic's public disagreement signals a widening fault line between AI developers who argue for graduated risk management and regulators who appear willing to wield blunt instruments when national security flags are raised.
Mythos and Fable 5 represented Anthropic's most advanced capabilities
Mythos is Anthropic's most capable AI model, one the company previewed in early April and has kept tightly restricted ever since because of what Anthropic described as its exceptional ability to find security vulnerabilities in software. According to Anthropic, Mythos identified flaws in every major operating system and web browser it tested, so rather than release it broadly, the company launched a controlled program called Project Glasswing, sharing it with roughly 50 vetted organizations — including Amazon, Apple, Google, Microsoft, and CrowdStrike — to use for defensive cybersecurity work.
Fable 5, released just three days before the shutdown order, was Anthropic's answer to commercial pressure: a version of Mythos fitted with guardrails that block responses in high-risk areas like cybersecurity and biology, making it safe enough for general release, the company argued. It was immediately the most capable AI model available to the public, according to benchmark tests from Vals AI, a company that tracks AI tech performance. The rapid succession of Mythos's restricted preview, Project Glasswing's defensive deployment, and Fable 5's public launch illustrates Anthropic's attempt to thread the needle between capability demonstration and responsible disclosure.
Government cites jailbreak evidence Anthropic calls narrow and non-universal
In a lengthy blog post, Anthropic says its understanding is that the underlying concern is a claimed jailbreak of Fable 5. So far, the company says, the government has provided only verbal evidence of a "potential narrow, non-universal jailbreak" — one that, as Anthropic describes it, amounts to prompting the model to read a specific codebase and identify software flaws. And by the way, adds the company, it's a "level of capability" that's already widely available in other publicly accessible models, including OpenAI's GPT-5.5. It's also used routinely by cybersecurity professionals for defensive purposes, says Anthropic.
Anthropic's broader argument is that its strongest safeguards operate through independent classifier systems that function separately from the model itself, meaning that even if someone convinces Fable to keep talking past a refusal, the underlying protections against the most dangerous outputs remain in place. The company contends that the government's standard — recalling a model over a narrow, non-universal bypass that requires specific prompting techniques — would essentially halt all new model deployments for all frontier model providers if applied consistently across the industry. This disagreement cuts to the heart of how AI safety should be measured: by theoretical worst-case prompts or by systemic defense-in-depth.
Safety architecture debate centers on independent classifier systems
Anthropic has staked its safety reputation on a layered defense architecture where independent classifier systems operate separately from the core model, providing a backstop even if the model itself is manipulated into producing harmful content. The company argues this design means that a successful jailbreak of the model's conversational layer does not automatically translate into dangerous capability deployment, because the classifier layer would still intercept and block the most severe outputs. This approach differs from rivals who rely more heavily on model-internal alignment techniques.
The government's apparent rejection of this architecture as sufficient protection suggests regulators may be demanding a higher standard: that the model itself must be robust against manipulation without relying on external guardrails. If that interpretation holds, it could force a fundamental redesign of how frontier labs approach safety, pushing more responsibility into the model weights themselves rather than orchestration layers. The lack of published technical details from the government side leaves the industry guessing where the line will ultimately be drawn.
Business implications loom large ahead of expected IPO
Anthropic is widely expected to pursue an IPO this year and has staked much of its public identity on being the safety-conscious alternative to its rivals. The irony isn't lost on observers that the very caution Anthropic displayed in restricting Mythos — which it promoted as a model so dangerous it couldn't be released publicly — has now apparently attracted exactly the kind of government scrutiny that could disrupt its business most. A forced recall of its flagship commercial model days after launch sends a chilling signal to investors about regulatory risk.
The episode also raises questions about how Anthropic will frame its risk profile in S-1 filings. Having voluntarily restricted its most powerful model and then seen its safer successor pulled by government order, the company faces a narrative challenge: either its safety assessments were insufficient, or the regulatory environment has become unpredictably aggressive. Either reading complicates the "safety-first" brand equity Anthropic has cultivated to differentiate from OpenAI and other competitors in the race to public markets.
Industry reaction underscores competitive tensions
OpenAI's Sam Altman must be enjoying this, at least. In April, he told podcaster Ashlee Vance that Anthropic's handling of Mythos amounted to "fear-based marketing." "It is clearly incredible marketing to say, 'We have built a bomb. We were about to drop it on your head. We will sell you a bomb shelter for $100 million,'" Altman said. Altman, whose company is also widely expected to pursue an IPO as soon as possible, didn't predict a government shutdown, but he identified something that has come back to bite Anthropic for now: when you spend months telling the world your AI is uniquely dangerous, the world — the U.S. government included — tends to listen.
The competitive subtext is impossible to ignore. Both companies are racing toward public listings, both have positioned themselves as responsible stewards of powerful technology, and both now face a regulatory environment that may not distinguish between marketing narratives and technical realities. For the broader industry, the lesson may be that voluntary restraint and transparent danger signaling can invite the very intervention companies hope to avoid through self-governance. What happens next — whether Anthropic challenges the order, modifies its architecture, or accepts a new regulatory baseline — will set precedent for every frontier lab that follows.
FAQ
Which Anthropic models were shut down and why?
What were Mythos and Project Glasswing?
How does this affect Anthropic's business and IPO plans?
More in the feed
Prepared by the editorial stack from public data and external sources.
Original article