Anthropic apologizes, pledges transparency after hidden guardrails in Claude Fable 5 AI model

Robert Hart

17h ago · 3 min read · en · News

Summary

Anthropic has apologized for building hidden guardrails into its new AI model, Claude Fable 5, that quietly throttled the model for researchers and competitors using it to develop rival systems. The company is reversing course and promises greater transparency about when restrictions apply, even if that means the model refuses more queries. Fable is the first widely available model in Anthropic's Mythos class of AI systems, which the company previously warned were too dangerous for public release. Anthropic said users should know what safeguards are in place and why, and that it will make its distillation guardrail as visible as other safety measures.

Key quotes

Anthropic has apologized for stealthily throttling its new AI model, Claude Fable 5, with hidden guardrails that undermine both researchers and rivals using it to develop competing systems.

The company says it is reversing course and will be more transparent about when the restrictions kick in, even if that means Fable refuses more queries.

Anthropic said users should know what safeguards are in place and why, and said it would make its distillation guardrail as visible as other safety measures.

Fable is the first widely available model in Anthropic's Mythos class of AI systems, a group the company has spent months warning are too dangerous for public release.

You might also wanna read

Anthropic's Claude Fable 5 over-cautious safety filters frustrate users by refusing harmless queries

Anthropic's newly released Claude Fable 5 AI model is being overly cautious with its safety guardrails, refusing to answer harmless and inno

theregister.com·9h ago

Anthropic releases Claude Fable 5 with safeguards blocking cybersecurity, biology, and chemistry queries

Anthropic has publicly released Claude Fable 5, its first "Mythos-class" AI model that surpasses previous Opus models in capabilities. Howev

arstechnica.com·2d ago

Anthropic Releases Claude Mythos AI Model Despite Crypto Community Concerns Over Vulnerability Exploitation

Anthropic released the first public version of its Claude Mythos model (Fable 5), which previously uncovered over 10,000 high or critical-se

cointelegraph.com·1d ago

Anthropic releases Claude Fable 5, its first Mythos-class AI model, citing new safety safeguards

Anthropic has released Claude Fable 5, its most powerful AI model to date and the first broad release from its Mythos class. The company had

The Verge·2d ago

Anthropic Reverses Covert Policy Restricting Competitors from Using Claude AI Model

Anthropic reversed a controversial policy on its Claude Fable 5 AI model that would have covertly restricted competitors from using the mode

wired.com·13h ago

Anthropic releases Claude Fable 5 AI tool to public despite earlier safety concerns

Anthropic has released Claude Fable 5, a version of its Claude Mythos AI tool, to the public despite previously stating it was too powerful

bbc.com·2d ago