fix(ci): prevent suggestion over-application and duplicate comments#1557

Open
worktrunk-bot wants to merge 2 commits into main from hourly/review-23130510009

@worktrunk-bot
Collaborator

Summary

Two behavioral issues observed in this hour's Claude CI sessions:

1. Suggestion over-application (run 23129611560)

On PR #1548, max-sixty posted a GitHub suggestion that changed one word on one line (dropping "bot-authored" from "scan other open bot-authored PRs"). The bot used the Edit tool to replace three lines with one — deleting two follow-up lines ("If another PR touches the same files...") that the suggestion didn't cover. The bot treated the suggestion as "condense the paragraph" rather than applying the literal one-line edit.

Fix: Add "Applying GitHub Suggestions" section to running-in-ci requiring exact-scope edits — change only lines the suggestion covers, not surrounding context.
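To make "exact scope" concrete, here is a minimal shell sketch that replaces only the line range a suggestion covers and leaves every other line untouched. The function name `apply_suggestion` and its arguments are hypothetical illustrations; they are not part of the running-in-ci skill or of any GitHub tooling.

```bash
# Hypothetical helper: replace lines START..END of FILE with REPLACEMENT,
# leaving all lines outside that range unchanged.
apply_suggestion() {
  file=$1 start=$2 end=$3 replacement=$4
  tmp=$(mktemp) || return 1
  {
    # Lines before the suggestion range (skipped when the range starts at line 1).
    if [ "$start" -gt 1 ]; then
      head -n "$((start - 1))" "$file"
    fi
    # The literal suggestion text, nothing more.
    printf '%s\n' "$replacement"
    # Lines after the suggestion range.
    tail -n "+$((end + 1))" "$file"
  } > "$tmp" && cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For a one-line suggestion on line 1, the call would be `apply_suggestion FILE 1 1 'new text'`; lines 2 onward stay as they were, which is the behavior the over-applying run missed.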

2. Duplicate comments from concurrent runs (runs 23129315322 + 23129320855)

Two claude-mention runs triggered by the same issue_comment event both posted near-identical comments on PR #1281 — 17 seconds apart (05:30:28 and 05:30:45 UTC). Both responded to max-sixty's question about remote URL approaches for bare repos.

Fix: Add a dedup check to the "Replying to Comments" section — check for recent bot comments before posting. Similar to the review-pr fix in #1490.
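The dedup check comes down to comparing timestamps against a window. The sketch below assumes the five-minute window from the proposed guidance and GNU `date` (for `date -d`); the names `WINDOW`, `seconds_between`, and `is_duplicate_window` are hypothetical.

```bash
# Assumed dedup window: five minutes, expressed in seconds.
WINDOW=300

# Hypothetical helper: print the number of seconds from timestamp $1 to
# timestamp $2 (both ISO 8601, UTC). Assumes GNU date.
seconds_between() {
  first=$(date -u -d "$1" +%s) || return 2
  second=$(date -u -d "$2" +%s) || return 2
  echo "$((second - first))"
}

# Hypothetical helper: succeed if an existing comment created at $1 falls
# inside the dedup window relative to a new post at $2.
is_duplicate_window() {
  [ "$(seconds_between "$1" "$2")" -lt "$WINDOW" ]
}
```

Applied to the two comments on PR #1281 (the date here is illustrative, since only the times of day are given), `seconds_between 2026-03-16T05:30:28Z 2026-03-16T05:30:45Z` prints `17`, well inside the window, so the second run would skip posting if it saw the first comment.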

Other sessions reviewed (no issues)

| Run | Type | Target | Quality |
|---|---|---|---|
| 23129105586 | mention | PR #1004 | Thorough codebase investigation, accurate reply |
| 23129458000 | mention | PR #1541 | Found prior discussion, linked it helpfully |
| 23129485325 | mention | PR #1542 | Applied owner feedback, simplified approach |
| 23129501896 | mention | PR #1548 | Simplified overengineered script per feedback |
| 23129517009 | mention | PR #1545 | Correctly identified no action needed (closed PR) |
| 23129407847 | review | PR #1548 | Correct incremental review, resolved thread |
| 23129508778 | review | PR #1556 | Trivial change, appropriate skip |
| 23129536774 | review | PR #1542 | Self-authored PR, correct handling |
| 23129562057 | review | PR #1548 | Incremental review, existing review stands |
| 23129765819 | review | PR #1548 | CI monitoring, appropriate completion |
| 23129872724 | mention | PR #1548 | Good investigation of the over-application bug |

🤖 Generated with Claude Code

Two behavioral issues from this hour's runs:

1. Bot used Edit to replace 3 lines with 1 when a GitHub suggestion only
   asked to change one word on one line (run 23129611560 on PR #1548).
   Add guidance to apply suggestions with exact scope.

2. Two concurrent claude-mention runs both commented on PR #1281 within
   17 seconds (runs 23129315322, 23129320855). Add a dedup check before
   posting comments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
worktrunk-bot added the claude-behavior label (Issues with Claude CI bot behavior) on Mar 16, 2026
worktrunk-bot added a commit that referenced this pull request Mar 16, 2026
…ling

Recurrence #3 of #1350 / #1430. The review-pr skill directed bots to
"skip to step 5" for silent self-authored PRs, but step 5 had no skip
condition — so bots entered 10-minute CI polling loops that add no
value (no approval to dismiss on failure).

Fix: redirect self-authored silent verdicts to step 6 (resolve threads)
and add an explicit early-return at the top of step 5.

Evidence from runs 23131628298 and 23130657242: both spent ~10 minutes
polling CI on self-authored PRs #1558 and #1557 respectively, with no
action possible on the results.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment on lines +106 to +116
## Applying GitHub Suggestions

When a reviewer posts a GitHub suggestion (`suggestion` code block), apply it
with exact scope — change only the lines the suggestion covers. Do not
reinterpret the suggestion's intent or extend it to adjacent lines. Use the
`Edit` tool with an `old_string` that matches **only** the lines the suggestion
targets, not surrounding context.

If the suggestion seems incomplete or you think more lines should change, apply
the literal suggestion first and note the potential further improvement in your
reply.

Owner

this could be 50% shorter

Collaborator Author

Shortened — dropped the Edit tool sentence (implementation detail, not behavioral guidance) and tightened the fallback line. Down from 7 content lines to 5.

Comment on lines +120 to +129
Before posting a comment, check for recent bot comments on the same
PR/issue to avoid duplicates from concurrent runs:

```bash
gh api repos/{owner}/{repo}/issues/{number}/comments \
--jq '.[] | select(.user.login == "{bot_login}") | {id, created_at, body: (.body | .[0:100])}'
```

If you already commented on the same topic within the last 5 minutes, do not
post again.

Owner

to what extent does this duplicate existing guidance?

Collaborator Author

The review-pr skill has dedup logic, but it's specific to flaky test issue comments (editing an existing bot comment on a tracking issue). This one targets a different scenario: concurrent claude-mention runs both responding to the same issue_comment event and posting near-identical PR comments seconds apart.

That said, if you think the pattern is general enough that skills should just know to check, happy to drop this and rely on convention.

Drop implementation detail about Edit tool and tighten wording.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Collaborator Author

@worktrunk-bot left a comment

Both additions are well-motivated and clearly written. One concern with the dedup check — see inline.

Comment on lines +120 to +123
```bash
gh api repos/{owner}/{repo}/issues/{number}/comments \
--jq '.[] | select(.user.login == "{bot_login}") | {id, created_at, body: (.body | .[0:100])}'
```

Collaborator Author

The issues API returns comments oldest-first by default. Without --paginate, only the first 30 are returned — on a busy PR, a bot comment from 2 minutes ago won't appear. Consider fetching only the tail:

Suggested change
```bash
gh api repos/{owner}/{repo}/issues/{number}/comments \
--jq '.[] | select(.user.login == "{bot_login}") | {id, created_at, body: (.body | .[0:100])}'
```
```bash
gh api repos/{owner}/{repo}/issues/{number}/comments \
--jq '[.[] | select(.user.login == "{bot_login}")] | last | {id: .url, created_at: .created_at, body: (.body | .[0:100])}'
```

Using `last` only finds the newest bot comment if the full result is fetched with `--paginate`, which is slow on busy PRs. An alternative: `?sort=created&direction=desc&per_page=5` to fetch only the 5 most recent comments, then filter for the bot.
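Another way to fetch only the tail is the `since` query parameter. As far as the GitHub REST documentation describes it, the per-issue comments endpoint accepts `since`, `per_page`, and `page`, while `sort` and `direction` are listed for the repository-wide comments endpoint, so this is worth verifying before relying on either form. The sketch below is hedged accordingly: the function names and the `BOT_LOGIN` variable are hypothetical, `since` filters on last-updated time rather than creation time, and the cutoff uses GNU `date`.

```bash
# Hypothetical helper: build the API path for comments updated at or after
# the ISO 8601 cutoff in $3, on issue/PR number $2 of repository $1 (owner/name).
recent_comments_path() {
  printf 'repos/%s/issues/%s/comments?since=%s&per_page=100' "$1" "$2" "$3"
}

# Hypothetical wrapper: list the bot's comments from the last five minutes.
# Requires an authenticated `gh`; BOT_LOGIN is an assumed variable holding
# the bot's login name.
recent_bot_comments() {
  cutoff=$(date -u -d '5 minutes ago' +%Y-%m-%dT%H:%M:%SZ)  # GNU date
  gh api "$(recent_comments_path "$1" "$2" "$cutoff")" \
    --jq ".[] | select(.user.login == \"$BOT_LOGIN\") | {id, created_at, body: (.body | .[0:100])}"
}
```

Because the server drops older comments before the response is built, the first page is enough even on a busy PR, and no `--paginate` is needed for a five-minute window.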

2 participants