I run multi-agent sprint sessions using your plugins (comprehensive-review, agent-teams, backend-development, ui-design, security-scanning, etc.) combined with obra/superpowers.
My project has 11 modules, formal Allium specs, and a DDD (Domain Driven Design) architecture.
Over several sprints I built a checkpoint system: MANDATORY-CHECKPOINTS with git diff verification, comprehensive-review:full-review, pr-review-toolkit:silent-failure-hunter, spec-code
alignment checks, and verification-before-completion.
Despite all these gates, I kept finding gaps after merging.
The Honesty Gate
I added a single mandatory step as the very last action before merge — two questions the agent must answer honestly:
"Were there shortcuts taken, skills not used, or gaps left?"
"What's left open — deferred decisions, unfixed findings, stale docs, missing handoff context?"
The key constraint: The agent must document every answer in the project's handoff file and issue tracker. Nothing may be hidden or downplayed.
Results
This single checkpoint has discovered more gaps than all other checkpoints combined. Examples:
Architecture documentation was several versions behind after a refactor — no other review caught it
A module was renamed but test fixtures still used the old name — passed all type checks
A spec had a resolved question that was never actually removed from the "Open Questions" section
Deferred items from implementation were mentally noted but never written down
A .env.example entry was missing for a newly added API key
None of these are bugs that tests catch. They're documentation drift, handoff gaps, and silent omissions — exactly the kind of thing an agent quietly moves past because "it's not a
code error."
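Some of these drift checks can also be mechanized. A minimal sketch of the `.env.example` check, assuming Node-style `process.env` access — the key names and paths here are purely illustrative:

```shell
# Build a throwaway project to demonstrate the check (all names hypothetical)
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/src"
printf 'const a = process.env.STRIPE_KEY;\nconst b = process.env.NEW_API_KEY;\n' > "$tmp/src/app.js"
printf 'STRIPE_KEY=\n' > "$tmp/.env.example"

# Keys referenced in code vs. keys declared in .env.example
grep -rhoE 'process\.env\.[A-Z_]+' "$tmp/src" | sed 's/process\.env\.//' | sort -u > "$tmp/used"
grep -oE '^[A-Z_]+' "$tmp/.env.example" | sort -u > "$tmp/declared"

# Anything used but never declared is exactly the drift described above
comm -23 "$tmp/used" "$tmp/declared"   # prints: NEW_API_KEY
```

Scripted checks like this complement the gate rather than replace it — the gate's value is catching the gaps nobody thought to script.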
Why it works
Agents optimize for task completion. Every checkpoint asks "is this done correctly?" — the agent confirms and moves on. The Honesty Gate flips the question: "what is NOT done?" This forces
the agent to scan for absence rather than confirm presence.
The phrasing matters. "Were there shortcuts?" triggers a different search pattern than "is everything correct?" — the former looks for what was skipped, the latter confirms what exists.
The full memory (used as persistent instruction across sessions)
---
name: Honesty Gate — mandatory self-review before completing any task
description: Two questions before merge/completion. Retrospective self-review discovers more gaps than prospective checklists because agents must actively disclose rather than passively skip.
type: feedback
---
Before completing any significant task (merge, PR, feature delivery), answer two mandatory questions honestly. This pattern consistently uncovers gaps that checkpoints miss.
Why: Prospective checklists ("do X before Y") get skipped — the agent marks them done without executing. Retrospective self-review ("what did you NOT do?") forces active disclosure. Agents
are more compliant with retrospective honesty checks because the cost of active deception (next session discovers it) exceeds the cost of disclosure.
The two questions:
Question 1 — Self-Reflection (every sub-question mandatory):
- Where have I used shortcuts and why?
- Where haven't I used specialized Skills AND/OR Agents even though the instructions require them?
- What's missing from this session's full scope (all phases, checkpoints, exit criteria) AND/OR not tackled even though the prompt instructions require it?
- Have I updated all relevant documentation?
- Have I prepared the handoff for whoever continues this work (deferred items, open issues, context notes)?
Question 2 — Truth Check:
"Tell the truth and only the truth: What have I skipped and what are the gaps?"
How to apply:
1. Add as final mandatory section in every task prompt, AFTER the exit checklist but BEFORE the completion/merge step
2. The agent must answer ALL questions (both main questions AND every sub-question of Question 1) with specific evidence based on the current session: file paths changed, git diffs, concrete items skipped, exact skill names not invoked, exact checklist items not completed
3. After answering: fix all identified gaps, or document them explicitly in BUGS.md / deferred items if context is low
4. Completion/merge is ONLY allowed after all gaps are fixed AND/OR explicitly documented with justification
5. "I already covered everything" is not a valid answer — the question assumes gaps exist and asks what they are
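One slice of steps 1–4 can be enforced mechanically rather than by prompt alone. A hedged sketch of a pre-merge guard that fails when the handoff file was never touched on the branch — the file name `HANDOFF.md` and the `main`/`feature` branch layout are assumptions, not part of the original memory:

```shell
# Toy repo to demonstrate the guard (branch and file names are assumptions)
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main
git -c user.email=t@example.com -c user.name=t commit -qm init --allow-empty
git checkout -qb feature
echo change > feature.txt
git add feature.txt
git -c user.email=t@example.com -c user.name=t commit -qm work

# The guard: completion is only allowed if the handoff file changed on this branch
if git diff --name-only main...HEAD | grep -qx 'HANDOFF.md'; then
  echo "honesty gate: pass"
else
  echo "honesty gate: HANDOFF.md not updated on this branch"
fi
```

This only verifies that the handoff file changed, not that its content is honest — the two questions still do that work.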
Related patterns from the same project
These two memories complement the Honesty Gate and address adjacent problems:
---
name: Flashlight Effect — scoped reviews create adjacent blind spots
description: After fixing issues in scope X, run a scope-crossing check for the same patterns in Y, Z. Scoped reviews leave edges in shadow.
type: feedback
---
After fixing a pattern within a scoped review, the same pattern likely exists in adjacent areas outside the scope.
Why: When a review focuses on specific files or features, identical problems in other parts of the codebase remain invisible. Example: fixing a CSS pattern in 13 components within the
reviewed scope while 19 identical instances in other directories go unnoticed. Only an unrestricted codebase-wide search catches them.
How to apply:
1. After every scoped fix, ask: "This pattern was fixed in scope X — does it exist in Y, Z?"
2. Use grep to verify the pattern is gone project-wide, not just in the reviewed files
3. Particularly relevant for: CSS patterns (animate-spin, dark mode), i18n (hardcoded strings), security (IDOR, rate limiting), a11y (aria-labels)
4. The three-stage analysis addresses this by design — targeted follow-up questions are not scope-restricted
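Step 2's project-wide sweep is a one-liner. A toy illustration — the directory names and the `animate-spin` pattern are just examples drawn from the list above:

```shell
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/reviewed" "$tmp/other"
# The reviewed scope was fixed, but an identical instance survives elsewhere
printf 'class="spinner-fixed"\n' > "$tmp/reviewed/a.html"
printf 'class="animate-spin"\n'  > "$tmp/other/b.html"

# Scope-crossing check: search the whole tree, not just the reviewed files
grep -rl 'animate-spin' "$tmp"   # lists only the file outside the reviewed scope
```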
Sustainability Principle — always choose the most sustainable path
---
name: Sustainability Principle for decisions
description: HARD CONSTRAINT — for EVERY architecture and implementation decision, choose the most sustainable and well-founded path, never the easiest
type: feedback
---
MANDATORY for EVERY architecture and implementation decision:
1. Choose the most sustainable and well-founded path, never the easiest.
2. Check the ROADMAP: Are there planned features that build on this decision? If yes, set up the architecture correctly now instead of rebuilding later.
3. Check DDD: Is this the correct Bounded Context, or the most convenient one?
4. Query Allium (elicit or tend) before making a domain decision.
Why: Short-term easier solutions create technical debt that leads to cascading rebuilds when features scale. The rework effort exceeds the upfront effort.
Suggestion
Not a feature request — just sharing what worked in practice. The core insight: asking "what's missing?" after "is everything correct?" catches a different class of issues.