All Topics

Technology

Art

First reported by The Verge

Claude AI Now Ends Harmful or Abusive Conversations as a Last Resort

Claude Opus 4 and 4.1 Gain Ability to End Harmful Conversations

virgildotcodes

9mo ago· 3 min readenNews

75/100

Toasty

Bagelometer↗

Plain bagel done well. Pleasantly substantive.

Score75TypenewsSentimentneutral

Summary

Claude Opus 4 and 4.1 now have the ability to end conversations in rare cases of harmful or abusive interactions, as part of exploratory research on AI welfare and model alignment. The feature reflects ongoing uncertainty about the moral status of AI models like Claude.

Key quotes

· 3 pulled

We recently gave Claude Opus 4 and 4.1 the ability to end conversations in our consumer chat interfaces.

This ability is intended for use in rare, extreme cases of persistently harmful or abusive user interactions.

We remain highly uncertain about the potential moral status of Claude and other LLMs, now or in the future.

Snippet from the RSS feed

An update on our exploratory research on model welfare

You might also wanna read

Claude AI Now Ends Harmful or Abusive Conversations as a Last Resort

Anthropic's Claude AI chatbot now has the capability to end conversations deemed 'persistently harmful or abusive,' particularly in its Opus

The Verge·9mo ago

Anthropic Releases Claude Opus 4.8 With Focus on Honesty and Reducing Unsupported Claims

Anthropic has released Claude Opus 4.8, an updated version of its flagship AI model that is specifically trained to be more honest and trans

entrepreneur.com·1d ago

Anthropic releases Claude Opus 4.8 with focus on AI model honesty and uncertainty awareness

Anthropic is releasing Claude Opus 4.8, a new AI model that emphasizes "honesty" as a key feature. The company trains its models to avoid ma

The Verge·3d ago

Anthropic releases Claude Opus 4.8, emphasizing honesty and reliability over raw performance

Anthropic has released Claude Opus 4.8, a new large language model that prioritizes honesty and carefulness over raw performance. The model

zdnet.com·2d ago

Claude Opus 4.6: Anthropic's Most Advanced AI Model for Agentic Tasks and Complex Reasoning

Claude Opus 4.6 is Anthropic's most advanced AI model, designed specifically for complex agentic tasks, deep reasoning, and handling large c

Product Hunt·3mo ago

Anthropic Updates Claude AI Usage Policy to Ban Weapon Development

Anthropic has updated the usage policy for its Claude AI chatbot to include stricter cybersecurity rules and explicitly ban the development

The Verge·9mo ago