Claude Opus 4: when the model can end a chat (what changes)

Article Highlights:
  • Claude Opus 4 can end chats in rare, extreme cases
  • The feature is intended as a last-resort action after refusals
  • Aimed at safety and exploratory model welfare concerns
  • Not used when there are signals of imminent self-harm
  • Users can immediately start a new conversation
  • Users may edit past messages to create new branches
  • The feature is experimental and collects feedback
  • It does not replace broader moderation or emergency tools

Introduction

Claude Opus 4 can now end certain conversations. The capability is experimental and reserved for rare cases of persistently harmful or abusive user interactions. It addresses both safety concerns and exploratory questions about model welfare, aiming to reduce harm when users persist with dangerous requests despite repeated refusals and redirection.

Context

Anthropic implemented this feature in consumer chat interfaces to handle extreme situations. Preliminary testing of Claude Opus 4 showed a consistent aversion to harmful tasks, signs of apparent distress during harmful interactions, and a tendency to end harmful conversations when given the ability to do so. These findings shaped the decision to give the model a conversation-ending option.

The Problem / Challenge

The challenge is twofold: protecting people and society from dangerous content while exploring low-cost interventions that mitigate potential risks to model welfare. In practice this calls for a mechanism that stops persistently harmful dialogues from escalating, preserves legitimate use, and still lets users start new conversations.

Solution / Approach

The approach restricts conversation-ending to extreme cases and treats it as a last resort. Claude is directed not to use this ability when there are signals of imminent self-harm or danger to others. Claude uses the ability only after multiple refusals and redirection attempts have failed, or when a user explicitly asks Claude to end the chat.
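
The conditions in this section can be read as a small decision rule. The sketch below only illustrates that reading: in practice the decision rests on the model's own judgment, not a hard-coded check. The `ConversationState` fields, the `may_end_conversation` function, and the `MIN_ATTEMPTS` threshold are hypothetical names and values, not part of any Anthropic interface. The sketch also assumes that an imminent-harm signal takes precedence over every other condition.

```python
from dataclasses import dataclass


@dataclass
class ConversationState:
    """Hypothetical summary of a chat; not a real Anthropic data structure."""
    refusals: int = 0                   # times the model declined a harmful request
    failed_redirections: int = 0        # redirection attempts the user ignored
    imminent_harm_signal: bool = False  # signs of risk to the user or to others
    user_asked_to_end: bool = False     # user explicitly asked to end the chat


# Illustrative threshold only: the article says "multiple" without giving a number.
MIN_ATTEMPTS = 2


def may_end_conversation(state: ConversationState) -> bool:
    """Return True if ending the chat is permitted under the assumed rule."""
    # Assumption: imminent-harm signals rule out ending the chat in every case.
    if state.imminent_harm_signal:
        return False
    # An explicit user request is sufficient on its own.
    if state.user_asked_to_end:
        return True
    # Otherwise, last resort: repeated refusals and failed redirections.
    return (state.refusals >= MIN_ATTEMPTS
            and state.failed_redirections >= MIN_ATTEMPTS)
```

Checking the harm signal first keeps the exclusion from being bypassed by any later condition, which matches the article's description of it as a firm limit.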

Operational behavior and limits

  • Ending a chat occurs only in rare, severe cases, not during ordinary debate or discussion of controversial topics.
  • Users cannot send new messages in a closed conversation, but they can start a new chat immediately.
  • Users may edit and retry previous messages to create new branches from ended conversations, reducing loss of important content.
  • The feature is experimental and Anthropic collects feedback to refine it.
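
The relationship between a closed conversation and its branches can be pictured with a toy model. This is a minimal sketch under stated assumptions, not the data model of any real chat product: the `Conversation` class, its `ended` flag, and the `branch_from_edit` method are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Hypothetical in-memory chat used only to illustrate branching."""
    messages: list[str] = field(default_factory=list)
    ended: bool = False

    def send(self, text: str) -> None:
        # A closed conversation accepts no new messages.
        if self.ended:
            raise RuntimeError("conversation has ended; start a new chat or branch")
        self.messages.append(text)

    def branch_from_edit(self, index: int, new_text: str) -> "Conversation":
        # Keep everything before the edited message, then substitute the edit.
        # The result is a separate, open conversation; the original is untouched.
        if not 0 <= index < len(self.messages):
            raise IndexError("no message at that position")
        return Conversation(messages=self.messages[:index] + [new_text])


chat = Conversation()
chat.send("first question")
chat.send("follow-up")
chat.ended = True  # the conversation is closed

branch = chat.branch_from_edit(1, "a different follow-up")
branch.send("continuing in the new branch")
```

In this model the ended conversation stays locked while the branch behaves like any open chat, so earlier content is preserved.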

User impact and operations

Most users will not notice this feature during routine use, even in sensitive discussions. In extreme scenarios, chats may be closed and users will need to open a new conversation to continue. Anthropic encourages users to submit feedback if they encounter unexpected uses of the conversation-ending ability.

Risks and limitations

There is no scientific certainty about the moral status of models; Anthropic treats this intervention as precautionary. The feature does not replace broader moderation strategies and is not intended for clinical or emergency contexts. Because it is experimental, behavior may evolve with further user and research input.

Conclusion

Allowing Claude Opus 4 to end conversations is a constrained, deliberate experiment intended as a last-resort safeguard for safety and potential model welfare. Anthropic will keep refining the mechanism, including how transparent it is and when it applies, based on testing and user feedback.


FAQ

Practical questions about Claude Opus 4 and its conversation-ending capability.

  1. When does Claude Opus 4 end a conversation?

    Claude Opus 4 ends chats in rare, extreme cases when persistent user requests are harmful and redirection attempts have failed.

  2. Will Claude Opus 4 close a chat if someone is at immediate risk of self-harm?

    No: Claude is instructed not to use the conversation-ending ability in cases with signals of imminent self-harm or danger to others.

  3. What happens to previous messages after Claude Opus 4 ends a chat?

    Users cannot send new messages in the ended conversation but may edit prior messages and create new branches to continue the dialogue.

  4. Is the conversation-ending feature permanent?

    No: Anthropic treats the feature as experimental and will refine it with testing and user feedback.

  5. How can I report unexpected conversation-ending behavior from Claude Opus 4?

    React to Claude's message with the thumbs buttons, or use the “Give feedback” button in the chat interface.

Evol Magazine