Preserve thinking/redacted_thinking blocks through forward + reverse transforms #28

Open
danaimone wants to merge 1 commit into zacdcook:master from danaimone:fix/preserve-thinking-blocks

@danaimone

Summary

When a request has extended thinking enabled and the conversation contains any prior assistant turns with thinking / redacted_thinking content blocks, the next turn fails with:

messages.N.content.M: thinking or redacted_thinking blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response.

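Anthropic enforces byte-identical echo of thinking blocks on the latest assistant message. The proxy's string transformation pipeline mutates them in two places: the forward pass over the request body and the reverse pass over the response (streaming and non-streaming).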

Root cause

Forward pass (processBody) — Layer 2 replacements, Layer 3 tool renames, and Layer 6 property renames all run as split/join across the full request body. If a prior assistant thinking block contains any rewritten substring (openclaw, HEARTBEAT, prometheus, or any "quoted" tool/property name), it gets rewritten alongside everything else. Anthropic then rejects.
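To illustrate the forward-pass failure, here is a minimal sketch of a body-wide split/join. The replacement pair (openclaw to ocplatform), the naiveForward name, and the sample message are assumptions for illustration only; the proxy's real replacement tables and processBody are not reproduced here.

```typescript
// Hypothetical forward replacement table; the real Layer 2 list is not shown in this PR.
const forwardPairs: Array<[string, string]> = [["openclaw", "ocplatform"]];

// Body-wide split/join, in the spirit of the Layer 2/3/6 transforms described above.
function naiveForward(body: string): string {
  let out = body;
  for (const [from, to] of forwardPairs) {
    out = out.split(from).join(to);
  }
  return out;
}

// A request whose history contains a prior assistant turn with a thinking block.
const requestBody: string = JSON.stringify({
  messages: [
    {
      role: "assistant",
      content: [
        { type: "thinking", thinking: "The user is asking about openclaw.", signature: "sig-abc" },
        { type: "text", text: "Here is how openclaw works." },
      ],
    },
  ],
});

const rewrittenBody: string = naiveForward(requestBody);
const thinkingAfter: string = JSON.parse(rewrittenBody).messages[0].content[0].thinking;

// The thinking text no longer matches what the model originally produced.
console.log(thinkingAfter); // "The user is asking about ocplatform."
```

The text block is rewritten as intended, but the thinking block is rewritten along with it, which is exactly the mutation Anthropic rejects.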

SSE reverse pass (reverseMap in the streaming handler) — the more common failure mode. When Anthropic streams a response containing thinking_delta events, the tail-buffered reverseMap mutates them on the way out. The client stores the mutated bytes and echoes them back on the next turn. Anthropic compares against what it originally sent and rejects. So even if the forward pass were clean, the reverse pass alone corrupts history and breaks the next turn.

Non-streaming JSON response — same issue, same fix.

Repro

  1. Send a request with thinking enabled whose response contains a thinking block mentioning any reverse-mapped token (very common — the model often reasons about project/tool names).
  2. The SSE reverseMap rewrites e.g. ocplatform → openclaw inside a thinking_delta.
  3. Client stores the mutated thinking block in its history.
  4. Next turn: proxy forwards the stored bytes to Anthropic.
  5. Anthropic rejects with the error above, and every retry from that conversation fails the same way.

Fix

Forward pass — add maskThinkingBlocks / unmaskThinkingBlocks helpers that scan for {"type":"thinking"...} and {"type":"redacted_thinking"...} content blocks with string-aware bracket matching, replace each with a unique placeholder (__OBP_THINK_MASK_<n>__) before transforms run, and restore after. The placeholder sigil is chosen so no existing replacement / tool rename / property rename can match it.
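The masking approach can be sketched roughly as follows. This is not the code from the diff: the helper bodies, findObjectEnd, and the sample data are assumptions, and the sketch assumes compactly serialized JSON in which "type" is the first key of each content block.

```typescript
const MASK_PREFIX = "__OBP_THINK_MASK_";
const OPENERS: string[] = ['{"type":"thinking"', '{"type":"redacted_thinking"'];

// Returns the index just past the '}' closing the object that opens at `start`,
// or -1 if unbalanced. Braces inside JSON strings are ignored.
function findObjectEnd(s: string, start: number): number {
  let depth = 0;
  let inString = false;
  for (let i = start; i < s.length; i++) {
    const ch = s.charAt(i);
    if (inString) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}") {
      depth--;
      if (depth === 0) return i + 1;
    }
  }
  return -1;
}

function maskThinkingBlocks(body: string): { masked: string; blocks: string[] } {
  const blocks: string[] = [];
  let masked = "";
  let pos = 0;
  while (true) {
    // Find the earliest thinking/redacted_thinking opener at or after pos.
    let next = -1;
    for (const opener of OPENERS) {
      const idx = body.indexOf(opener, pos);
      if (idx !== -1 && (next === -1 || idx < next)) next = idx;
    }
    if (next === -1) break;
    const end = findObjectEnd(body, next);
    if (end === -1) break; // unbalanced input: leave the remainder untouched
    masked += body.slice(pos, next) + MASK_PREFIX + blocks.length + "__";
    blocks.push(body.slice(next, end));
    pos = end;
  }
  return { masked: masked + body.slice(pos), blocks };
}

function unmaskThinkingBlocks(text: string, blocks: string[]): string {
  // A replacer function avoids "$" in the stored bytes being read as a pattern.
  return text.replace(/__OBP_THINK_MASK_(\d+)__/g, (match: string, n: string): string => {
    const original = blocks[Number(n)];
    return original === undefined ? match : original;
  });
}

// Usage: mask, run a (hypothetical) transform, then restore.
const sampleBody: string = JSON.stringify({
  messages: [{ role: "assistant", content: [
    { type: "thinking", thinking: 'openclaw, a brace } and a quote "', signature: "s" },
    { type: "text", text: "openclaw" },
  ] }],
});
const maskedResult = maskThinkingBlocks(sampleBody);
const transformed: string = maskedResult.masked.split("openclaw").join("ocplatform");
const restoredBody: string = unmaskThinkingBlocks(transformed, maskedResult.blocks);
```

Because the thinking block is absent from the string while the transforms run, its bytes come back exactly as stored, while the neighboring text block is still rewritten.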

SSE reverse pass — switch from tail-buffered chunk flushing to SSE-event-aware buffering (split on \n\n), track the current content block type across events via a state machine (content_block_start → set, content_block_stop → clear), and pass content_block_* events unchanged while the current block type is thinking or redacted_thinking. Reverse-map everything else as before. Bonus: event-complete buffering also subsumes the cross-chunk pattern fix from #11 since SSE events are self-contained, so patterns can't span event boundaries.
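A rough sketch of the event-aware reverse pass is below. The createSseFilter name, the demo reverseMap (ocplatform to openclaw), and the sample events are assumptions for illustration; the event shapes follow Anthropic's streaming format (content_block_start, content_block_delta, content_block_stop), and the sketch assumes events are separated by \n\n as described above.

```typescript
function createSseFilter(reverseMap: (s: string) => string): {
  push: (chunk: string) => string;
  flush: () => string;
} {
  let buffer = "";
  let currentBlockType: string | null = null;

  function handleEvent(rawEvent: string): string {
    let payloadType = "";
    for (const line of rawEvent.split("\n")) {
      if (!line.startsWith("data:")) continue;
      try {
        const payload = JSON.parse(line.slice(5));
        payloadType = typeof payload.type === "string" ? payload.type : "";
        if (payloadType === "content_block_start" && payload.content_block) {
          currentBlockType = String(payload.content_block.type); // state: set
        }
      } catch {
        // Not JSON: treat as an ordinary event and reverse-map it below.
      }
    }
    const isBlockEvent = payloadType.startsWith("content_block_");
    const isProtected =
      currentBlockType === "thinking" || currentBlockType === "redacted_thinking";
    const out = isBlockEvent && isProtected ? rawEvent : reverseMap(rawEvent);
    if (payloadType === "content_block_stop") currentBlockType = null; // state: clear
    return out;
  }

  return {
    // Buffers until a full event (terminated by a blank line) is available.
    push: (chunk: string): string => {
      buffer += chunk;
      let out = "";
      let sep = buffer.indexOf("\n\n");
      while (sep !== -1) {
        out += handleEvent(buffer.slice(0, sep)) + "\n\n";
        buffer = buffer.slice(sep + 2);
        sep = buffer.indexOf("\n\n");
      }
      return out;
    },
    // Emits whatever is left when the upstream closes.
    flush: (): string => {
      const rest = buffer;
      buffer = "";
      return rest ? handleEvent(rest) : "";
    },
  };
}

// Usage with a hypothetical reverse mapping and arbitrary 17-byte chunks.
const demoReverseMap = (s: string): string => s.split("ocplatform").join("openclaw");
const sseEvents: string[] = [
  'event: content_block_start\ndata: {"type":"content_block_start","index":0,"content_block":{"type":"thinking","thinking":""}}',
  'event: content_block_delta\ndata: {"type":"content_block_delta","index":0,"delta":{"type":"thinking_delta","thinking":"about ocplatform"}}',
  'event: content_block_stop\ndata: {"type":"content_block_stop","index":0}',
  'event: content_block_start\ndata: {"type":"content_block_start","index":1,"content_block":{"type":"text","text":""}}',
  'event: content_block_delta\ndata: {"type":"content_block_delta","index":1,"delta":{"type":"text_delta","text":"use ocplatform"}}',
  'event: content_block_stop\ndata: {"type":"content_block_stop","index":1}',
];
const sseStream: string = sseEvents.map((e: string): string => e + "\n\n").join("");
const sseFilter = createSseFilter(demoReverseMap);
let clientBytes = "";
for (let i = 0; i < sseStream.length; i += 17) {
  clientBytes += sseFilter.push(sseStream.slice(i, i + 17));
}
clientBytes += sseFilter.flush();
```

Chunk boundaries fall mid-event here, yet the thinking_delta reaches the client untouched and the text_delta is still reverse-mapped, since each event is only processed once it is complete.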

Non-streaming JSON response — wrap reverseMap in the same mask/unmask as the forward pass.

Notes

  • Tested locally against the current master: extended-thinking conversations that previously failed on every second turn now succeed indefinitely.
  • The reverse-mapping of thinking content was arguably always semantically wrong — thinking blocks are the model's internal reasoning and shouldn't be rewritten in either direction. The byte-equality requirement just turns a latent bug into a hard failure.
  • Existing conversations whose history already contains mutated thinking bytes can't be salvaged (the stored bytes don't match what Anthropic has on file); they need to be truncated to before the affected turn or reset. New conversations work correctly after this patch.
  • Zero dependencies, matches existing code style, no version bump (leaving that to your release flow).

Test plan

  • Run against an extended-thinking conversation that previously failed on turn 2 and verify it now completes turns 2+ without the "thinking blocks cannot be modified" error
  • Verify non-thinking conversations still work unchanged
  • Verify the representative-claim header is still five_hour (billing classification unchanged — this PR only affects body/stream handling, not the billing layer)
  • Verify SSE streaming still reverses non-thinking content correctly (tool names, property names, OC strings)

…erse transforms

Anthropic requires thinking/redacted_thinking content blocks to be echoed back
byte-identical to what the model originally produced. Any mutation triggers
"thinking or redacted_thinking blocks in the latest assistant message cannot
be modified. These blocks must remain as they were in the original response."

The proxy's string transformation pipeline was mutating these blocks in two
places:

  1. Forward pass (processBody): Layer 2/3/6 split/join runs across the full
     request body and rewrites any occurrence inside prior assistant thinking
     blocks in history.

  2. Reverse pass (reverseMap, SSE path): the tail-buffered flush rewrites
     thinking_delta bytes on the way out. The client stores the mutated bytes
     and echoes them on the next turn, and Anthropic rejects the comparison.

Fix:
  - Add maskThinkingBlocks / unmaskThinkingBlocks helpers that scan for
    {"type":"thinking"...} and {"type":"redacted_thinking"...} content blocks
    with string-aware bracket matching, replace each with a unique placeholder
    before transforms run, restore after.
  - Wrap processBody with mask/unmask so Layer 2/3/6 never see assistant
    thinking history.
  - Replace the SSE tail-buffer flush with event-complete buffering (split on
    \n\n) and track the current content block type across events. Pass
    thinking/redacted_thinking events through unchanged; reverse-map
    everything else. This also subsumes the cross-chunk pattern fix from zacdcook#11
    since SSE events are self-contained.
  - Wrap the non-streaming JSON response in the same mask/unmask around
    reverseMap.

The reverse-mapping of thinking content was arguably always incorrect —
thinking blocks are the model's internal reasoning and shouldn't be rewritten
in either direction. The byte-equality requirement just turns a latent bug
into a hard failure.