feat(mcp): leadbay_scan_portfolio_signals — bulk portfolio signal scan + stale_at honesty fix (v0.19.1, product#3704)#94
Conversation
… scan + stale_at honesty fix Closes leadbay/product#3704. Two root causes, both fixed: 1. Capability gap — no way to scan a portfolio for a signal in bulk. JM built a 497-lead Monitor portfolio, asked which had an M&A signal since 2025, and the agent fell back to one research_lead_by_id call per lead (~60 calls) before abandoning the task. New composite leadbay_scan_portfolio_signals bulk-reads CACHED web_fetch signals across the portfolio (read-only GET fan-out, no quota burn), filters entries by a diacritic/case-folded query + optional `since` date, and returns the matched cohort campaign-ready. 2. stale_at hallucination — the agent inferred signal presence/absence from freshness fields and reported confident-but-wrong results. New snippets/gates/signal-honesty.md guardrail, included in pull-followups, research-lead-by-id, and the followup_check_in prompt: freshness markers are not signal indicators; route portfolio-wide signal questions to the bulk tool; unresearched leads are reported (not_researched), never fabricated. The tool separates "no matching signal" (researched, no match) from "not yet researched" (no data to search) via a not_researched[] bucket — the structural antidote to the original hallucination. Verified live against the US test account end-to-end through the MCP server: one scan_portfolio_signals call over 169 leads correctly surfaced genuine post-2025 acquirers, with the agent reading the signal text to discard false-positive senses of "acquisition" and reporting 0 unresearched. - Extracts web_fetch reshaping into shared _web-fetch-helpers.ts (reused by research-lead-by-id). - Registered in compositeReadTools, _composite-file-names, TOOLS_WITH_ROUTING, output-schema conformance, WORKFLOWS.md. - Unit tests (new file) cover match/since/diacritics/not_researched/429/cap. - Two eval scenarios authored (underdeliver + honesty); runner glue not yet on this branch, so they are fixture-ready, not wired. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
3ab7be1 to
c5df963
Compare
…b.com-leadbay-product-issues-3704
Pre-landing review surfaced three places where partial coverage was silently reported as complete, all on the #3704 honesty axis: - a non-quota web_fetch read failure (404/500/network) was dropped from both matched[] and not_researched[] while still counted in scanned_count — restructured the fan-out to catch per-lead so failed reads land in not_researched with the lead name. - a 429 while paging /monitor never set quota_exceeded, so a quota wall during portfolio enumeration returned an empty result as if 'no matches' — now flagged. - a swallowed POST /monitor/filter still sent filtered=true, scanning a stale server-side cohort — now falls back to an unfiltered scan. Also dropped a dead sinceMs binding and a redundant double-Boolean. Adds 5 regression tests (404 read, Monitor pagination path, 429 while paging, filter-POST failure fallback, ambiguous_locations).
leadbay_scan_portfolio_signals (bulk portfolio signal scan) + the signal-honesty guardrail across pull_followups / research_lead_by_id / followup_check_in. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
milstan
left a comment
There was a problem hiding this comment.
This is a suggestion, not a requirement. See if you want to follow it - and feel free to merge it once you decide what to do (I agree on everything else, and this comment being optional...)
| filters them for you. Do NOT loop \`leadbay_research_lead_by_id\` one lead at a | ||
| time, and do NOT guess from list-level freshness flags. | ||
|
|
||
| If a lead has no cached signal content, say so honestly — "not yet researched, |
There was a problem hiding this comment.
You are here giving instructiosn to the user's agent which presumably has access to internet and can maybe also be leverdged to do some extra web serachers and complete the signal Leadbay already has.
You may consider amake it more complex here - telling it to identify where such extra websearches are needed, do them, and refine. Like an extra step of the prompt.
There was a problem hiding this comment.
Good call — done in 0918f85. PHASE 3b now turns the coverage gap into a refinement loop instead of just reporting it:
- Names the gap precisely — "N matched; K have no cached signal and J more have only a thin/undated mention".
- Targeted live pass — if the agent has web-search tools, it researches only the
not_researched/ thin-signal leads for the exact signal asked about (<Company> acquisition 2025), not the whole portfolio. - Folds findings back in, clearly labelled — live results are shown as agent-sourced (not Leadbay-verified), in a section separate from the campaign-ready cached cohort, with source URL/date cited. They never silently merge into the verified cohort.
- Offers the durable path —
leadbay_bulk_qualify_leadsfor gap leads worth persisting, so Leadbay runs its ownweb_fetchand the signal lands in cachedsignals[]on the next scan.
The labelling is deliberate: it keeps the #3704 honesty guarantee intact (Leadbay's cached signals[] stay the source of truth) while still answering the question for leads Leadbay hasn't researched yet.
Note: your comment landed on prompts.generated.ts, which is emitted by promptforge — the real edit is in packages/promptforge/prompts/leadbay_followup_check_in.md.tmpl and regenerated from there.
Addresses Milan's PR review: the agent has web tools and can complete the signal Leadbay hasn't cached yet. PHASE 3b now tells the agent to name the coverage gap precisely, run a TARGETED live web pass on only the not_researched / thin-signal leads for the exact signal asked about, and fold findings back in clearly labelled as agent-sourced (not Leadbay-verified) — separate from the campaign-ready cached cohort, with source citations. Also offers leadbay_bulk_qualify_leads as the durable path that writes the signal into Leadbay's cached signals[]. Keeps the #3704 honesty guarantee intact: Leadbay's cached signals[] remain the source of truth; live findings never silently merge into the verified cohort.
…b.com-leadbay-product-issues-3704 # Conflicts: # WORKFLOWS.md # packages/mcp/CHANGELOG.md # packages/mcp/package.json # packages/mcp/server.json
Resolve conflicts after main advanced to 0.19.0 (#93 contacts, #95 server-json): - WORKFLOWS.md: keep main's contact rows (15-23) + append new scan_portfolio_signals row (24); renumber. - research_lead_by_id description: trim 50 chars of redundant prose to absorb main's new add_contact anti-trigger (auto-emitted into WHEN TO USE), restoring the 17000-char budget (now 16973). - Regenerate tool-descriptions.generated.ts. Version stays 0.19.1 (main 0.19.0 + patch). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
[Claude]: Resolved merge conflicts with main (now 0.19.0; #93 contacts + #95 server-json) and pushed to this branch.
Full gate green on the merged state: core 380, promptforge 16, mcp 376; typecheck + build clean. Squash-merging once CI is green; that fires auto-tag → release.yml for |
Fixes the two root causes in leadbay/product#3704 — JM couldn't scan his 497-lead Monitor portfolio for an M&A signal without one full-profile API call per lead, and the agent then hallucinated coverage off
stale_at.Summary
leadbay_scan_portfolio_signals— scans a Monitor portfolio (or explicitleadIds) and bulk-reads cachedweb_fetchsignals: aGET-only fan-out (noPOST, no AI-qualification quota burn), capped at 5 concurrent by the client semaphore. Filters signal entries by a case- and accent-foldedquery(OR terms) + optionalsincedate. Returns the matched cohort campaign-ready (lead_id,name,location, quotedmatched_signals), feeding straight intoadd_leads_to_campaign. Separatesnot_researched[](no cached content) from "no match" — the structural antidote to the hallucination.stale_athonesty guardrail — newsnippets/gates/signal-honesty.md, included inpull_followups,research_lead_by_id, and thefollowup_check_inprompt: freshness fields are not signal indicators; portfolio-wide signal questions route to the bulk tool; unresearched leads are reported honestly._web-fetch-helpers.ts—splitEmojiSection/reshapeWebFetchContent/SECTION_PRIORITYextracted fromresearch-lead-by-id.ts, now shared by both consumers (verbatim move, no divergent copy).index.ts,_composite-file-names.ts,TOOLS_WITH_ROUTING, output-schema conformance,WORKFLOWS.md. Bumps@leadbay/mcp0.18.2 → 0.18.3.Pre-Landing Review
Ran the full
/reviewpipeline (critical pass + testing & maintainability specialists + Claude adversarial + Codex adversarial). Three critical findings fixed, all on the #3704 coverage-honesty axis — each was a swallowed error path that reported partial coverage as complete:matched[]andnot_researched[]while still counted inscanned_count, violating the documentedscanned_count = matched + non-matching + not_researchedinvariant. Now restructured to catch per-lead so failed reads land innot_researchedwith the lead name./monitornever setquota_exceeded— a quota wall during portfolio enumeration returned an empty result as if "no matches". Now flagged.POST /monitor/filterstill sentfiltered=true— scanning against a stale server-side cohort. Now falls back to an honest unfiltered scan.Plus two mechanical cleanups (dead
sinceMsbinding, redundant double-Boolean). No security / SQL / injection findings (read-only GET fan-out, no user-controlled SQL).Test Coverage
New unit suite at
packages/core/test/unit/composite/scan-portfolio-signals.test.ts— 13 tests: happy path,not_researchedseparation,sincefilter, diacritic/case folding, no-match, empty query,max_leadscap, 429 mid-read, 404 read → not_researched, Monitor pagination path (geo resolve → filter → paginate → bulk-read), 429 while paging → quota_exceeded, filter-POST failure → unfiltered fallback, ambiguous_locations early-return. The 5 bolded tests were added during review.Verification
pnpm prompts:build,pnpm -r typecheck,pnpm -r testall green on the merged-with-main state: core 357, promptforge 16, mcp 376.server.jsonversion-alignment audit passes (0.18.3 acrosspackage.json+ bothserver.jsonfields;@leadbay/mcp@0.18npx pins unchanged on the 0.18 line).scan_portfolio_signalscall over 169 leads surfaced genuine post-2025 acquirers (e.g. QUEST DRAPE, LLC — acquired Drape Kings, March 2025), reading signal text to discard false-positive senses of "acquisition", reporting0unresearched. No per-lead research loop.Adversarial Review
Codex (gpt-5.5, high reasoning) found the
quota_exceeded-on-pagination and stale-filter issues independently; both fixed above. Remaining Codex notes (query-term substring over-match on very short terms,max_leadsNaN/negative) are agent-controlled inputs constrained by the input schema — informational, not blocking.Plan Completion
All #3704 deliverables shipped: bulk scan tool, honesty separation of not_researched, freshness-vs-signal guardrail across the three prompts. Eval scenarios are fixture-ready (underdeliver + honesty) but not wired to run — the eval scenario-execution glue does not exist on this branch yet (documented in the scenario folder README).
closes https://github.com/leadbay/product/issues/3704