Reasons Behind the US Government’s Shutdown of Anthropic’s Latest Claude AI Model

On June 12, artificial intelligence (AI) lab Anthropic suspended access to its latest Claude models, Fable 5 and Mythos 5, which had been released three days earlier.

The move came in response to an “export control directive” from the US government prohibiting use of the models by anyone who is not a US national.

Mythos is Anthropic’s most powerful, or “frontier”, model. When first announcing the model in April, the company said it was too good at hacking to release immediately. Instead, Mythos was made available to a handful of organisations (mostly US tech corporations) so they could patch weaknesses in essential digital systems.

Fable is the same basic model, but with added safeguards meant to stop it being used for cybersecurity purposes. This is what was released to the public last week – and almost immediately shut down.

Anthropic and the Trump administration at loggerheads

Since early 2025, Anthropic and the Trump administration have been in escalating conflict. The administration has accused Anthropic of making “woke AI” and called chief executive Dario Amodei an “ideological lunatic”.

Early disagreements concerned AI regulation and semiconductor export policy. The dispute sharpened when Anthropic declined to let the Pentagon use its models for domestic surveillance and fully autonomous weapons systems.

The Department of Defense responded by threatening to designate Anthropic a “supply chain risk”, a classification that would have required military contractors to sever ties.

Jailbreaks

The US government has not yet publicly stated the reason for last week’s directive, but Anthropic says it believes the government became aware of a jailbreak: a method for circumventing the safeguards in Fable that prevent its most powerful features being used for nefarious purposes.

These safeguards classify user requests as safe or unsafe before passing them to the AI model. When triggered, the safeguards redirect the request to a less powerful model.
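
The routing idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Anthropic’s implementation: the marker phrases, the model names ("frontier-model", "fallback-model") and the keyword check are all hypothetical stand-ins, and a real safeguard would presumably rely on a trained classifier rather than string matching.

```python
from dataclasses import dataclass

# Hypothetical marker phrases, for illustration only.
# A production safeguard would use a trained classifier, not keywords.
UNSAFE_MARKERS = ("write malware", "exploit this vulnerability")


@dataclass
class Routing:
    model: str     # which model will answer the request
    flagged: bool  # whether the safeguard was triggered


def classify(request: str) -> bool:
    """Return True if the request looks unsafe (toy keyword check)."""
    text = request.lower()
    return any(marker in text for marker in UNSAFE_MARKERS)


def route(request: str) -> Routing:
    """Send flagged requests to a less powerful model, others to the frontier model."""
    flagged = classify(request)
    model = "fallback-model" if flagged else "frontier-model"
    return Routing(model=model, flagged=flagged)
```

In this toy version, `route("Please write malware for me")` is sent to the fallback model, while the same request with the letters spaced out is not flagged at all. Real classifiers are far harder to fool than a keyword list, but the gap between what a request says and what it intends is the one jailbreaks exploit.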

The government’s concern, according to Anthropic, was that the safeguards could be bypassed to extract information useful for cyberattacks.

Guardrails for large language models aren’t bulletproof. They mostly depend on the model’s own capacity to interpret the user’s intentions in making a request.

Beyond the inherent difficulty of this task, a large online community (which my colleagues and I call the Undersphere) is working hard to circumvent AI guardrails. Anthropic acknowledges that “perfect jailbreak resistance is not achievable for any current model provider”.

Anthropic says the research behind the government directive appears to have been produced by engineers at Amazon, which is both a rival to Anthropic and a significant investor.

But this was not the only relevant jailbreak. Within 48 hours of Fable’s release, a researcher using the pseudonym “Pliny the Liberator” published what they identified as Fable 5’s full system prompt to X and a GitHub repository.

The system prompt is a hidden set of instructions that helps determine an AI model’s behaviour. It’s unclear exactly how knowledge of Fable’s system prompt could be used in practice, but it has drawn attention in the Undersphere.

A surprise – and an ongoing mystery

The deepest problem of making large language models such as Fable secure is that we don’t fully know how they work. According to Oxford University economist and machine learning expert Maximilian Kasy, they work much better than they “should”.

Large language models have billions of internal parameters and are trained on unimaginably vast piles of data using machine learning methods. According to Kasy, we would expect such systems to be “overfitted”: good at reproducing patterns in their training data, but bad at generalising to new situations.
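
A toy example may help show what “overfitted” means in the classical sense. The sketch below is not drawn from Kasy’s work and says nothing about how large language models behave; the data, the function names and the two models are assumptions chosen for illustration. One model memorises its training points exactly, the other fits a straight line, and both are scored on new points.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line through the training points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memoriser(xs, ys):
    """Memorise the training data and answer with the nearest stored point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a model on a set of points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up training data: the true pattern is y = 2x, plus alternating noise.
train_xs = [0, 1, 2, 3, 4, 5]
train_ys = [1, 1, 5, 5, 9, 9]

# New, unseen points that follow the true pattern without noise.
test_xs = [0.4, 1.4, 2.4, 3.4, 4.4]
test_ys = [2 * x for x in test_xs]

line = fit_line(train_xs, train_ys)
memoriser = fit_memoriser(train_xs, train_ys)

for name, model in (("line", line), ("memoriser", memoriser)):
    print(name, mse(model, train_xs, train_ys), mse(model, test_xs, test_ys))
```

The memoriser makes no errors on the data it was trained on but does worse than the simple line on the new points, because it has reproduced the noise along with the pattern. That is the behaviour classical theory would lead us to expect from a system with billions of parameters.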

However, modern systems such as Claude and ChatGPT do seem to be able to generalise. Kasy likens modern AI development to alchemy: successful through trial and error, not yet grounded in systematic theory.

As a result, the behaviour of AI models is partly opaque even to their builders.

Hard to regulate

The opacity of the technology is one key reason it’s so hard to regulate. Governments lack independent access to the data, infrastructure and expertise they would need to evaluate proprietary frontier models.

The US administration’s recent executive order on AI security, published two weeks ago, reflects this realisation. As the administration has realised the power of frontier AI models, it has moved from an initial hands-off posture to asking developers to share their models for review before release.

That demand is an implicit admission that the administration does not trust the companies to evaluate, fully and comprehensively, what their own models can do and how they might be misused. The public sees even less, and the consequence is measurable: a survey taken across 25 countries last year found people are, on balance, more than twice as concerned about AI as they are excited about it.

The future of AI safety

AI is a hugely hyped technology. But there is no doubt it is also extremely powerful and unpredictable. That combination is very dangerous.

We cannot rely on regulations, as technology will develop more quickly than they can adapt. Nor can we rely on guardrails, as they will be bypassed.

We need a governance framework built for that eventuality: one that can predict and address the consequences of failure.

Such a framework must be global, participatory, and founded on reciprocal trust. These are things the current US administration has shown little capacity to generate.

The post “Why the US government shut down Anthropic’s latest Claude AI model” by Francesco Bailo, Senior Lecturer in Data Analytics in the Social Sciences, Deputy Director of the Centre for AI, Trust and Governance, University of Sydney was published on 06/15/2026 by theconversation.com