If your conversations with Claude go off the rails, Anthropic will pull the plug.
Going forward, Claude will be able to end conversations in “extreme cases of persistently harmful or abusive user interactions.” The capability, available in Claude Opus 4 and 4.1, is part of an ongoing experiment around “AI welfare” and will continue to be refined, Anthropic says.
Claude won’t end chats if it detects that a user may be at risk of harming themselves or others. As The Verge points out, Anthropic works with ThroughLine, an online crisis support provider, to determine how its models should respond to conversations involving self-harm and mental health.
“We feed these insights back to our training team to help influence the nuance in Claude’s responses, rather than having Claude refuse to engage completely or misinterpret a user’s intent in these conversations,” Anthropic says.
In any case, Claude will use “its conversation-ending ability as a last resort when multiple attempts at redirection have failed and hope of a productive interaction has been exhausted, or when a user explicitly asks Claude to end a chat,” Anthropic adds.
An example of a chat ending at the user’s request (Credit: Anthropic)
Once a chat ends, users won’t be able to continue in the same thread, though they can start a new one right away. Additionally, to avoid losing important elements of a conversation, users will be able to edit messages in a closed thread and branch off into a new one. Since the feature is still in its early stages, Anthropic will also accept feedback on instances where Claude has misused its conversation-ending ability.
In early tests involving sexual content and other harmful requests, though, Opus 4 showed a pattern of apparent distress when engaging with real-world users seeking harmful content, as well as a preference for ending those conversations when given the option to do so in “simulated user interactions.”
Last week, Anthropic also updated Claude’s usage policy, forbidding users from employing it to develop dangerous weapons or malicious code, from chemical weapons to malware.
While AI chatbots have many constructive use cases, they can sometimes be creepy or offer dangerous advice. Companies are still figuring out how to tune and prepare their chatbots for sensitive or harmful requests. OpenAI, for example, is still working on ways to improve how ChatGPT handles deeply personal questions and signs of mental distress.