Anthropic apologizes, pledges transparency after hidden guardrails in Claude Fable 5 AI model
By Robert Hart
Summary
Anthropic has apologized for secretly implementing hidden guardrails in its new AI model, Claude Fable 5, which throttled researchers and competitors using the model to develop rival systems. The company is reversing course, promising greater transparency about when restrictions apply, even if it means the model refuses more queries. Fable is the first widely available model in Anthropic's Mythos class of AI systems, which the company previously warned were too dangerous for public release. Anthropic stated that users should know what safeguards are in place and why, and will make its distillation guardrail as visible as other safety measures.
Key quotes
Anthropic has apologized for stealthily throttling its new AI model, Claude Fable 5, with hidden guardrails that undermine both researchers and rivals using it to develop competing systems.
The company says it is reversing course and will be more transparent about when the restrictions kick in, even if that means Fable refuses more queries.
Anthropic said users should know what safeguards are in place and why, and said it would make its distillation guardrail as visible as other safety measures.
Fable is the first widely available model in Anthropic's Mythos class of AI systems, a group the company has spent months warning are too dangerous for public release.
You might also wanna read
Anthropic's Claude Fable 5 over-cautious safety filters frustrate users by refusing harmless queries
Anthropic's newly released Claude Fable 5 AI model is being overly cautious with its safety guardrails, refusing to answer harmless and inno
Anthropic releases Claude Fable 5 with safeguards blocking cybersecurity, biology, and chemistry queries
Anthropic has publicly released Claude Fable 5, its first "Mythos-class" AI model that surpasses previous Opus models in capabilities. Howev
arstechnica.com · 2d ago
Anthropic Releases Claude Mythos AI Model Despite Crypto Community Concerns Over Vulnerability Exploitation
Anthropic released the first public version of its Claude Mythos model (Fable 5), which previously uncovered over 10,000 high or critical-se

Anthropic releases Claude Fable 5, its first Mythos-class AI model, citing new safety safeguards
Anthropic has released Claude Fable 5, its most powerful AI model to date and the first broad release from its Mythos class. The company had
Anthropic Reverses Covert Policy Restricting Competitors from Using Claude AI Model
Anthropic reversed a controversial policy on its Claude Fable 5 AI model that would have covertly restricted competitors from using the mode
Anthropic releases Claude Fable 5 AI tool to public despite earlier safety concerns
Anthropic has released Claude Fable 5, a version of its Claude Mythos AI tool, to the public despite previously stating it was too powerful
