Update DeepResearcherAgent report handling and citation verification by AjayThorve · Pull Request #128 · NVIDIA-AI-Blueprints/aiq

AjayThorve · 2026-03-06T23:16:23Z

- Introduced a new method `_extract_report_content` to streamline report content extraction from messages, accommodating cases where the last message is an AIMessage with a write_file tool call.
- Updated `_is_report_complete` to utilize the new extraction method for improved report completeness checks.
- Enhanced citation verification in `SourceRegistry` to handle ambiguous URL matches more robustly, including new strategies for resolving URLs based on query parameters and child-path matches.
- Added unit tests to validate the new report extraction logic and citation resolution strategies, ensuring reliability in various scenarios.

These changes improve the agent's ability to generate and verify reports accurately, enhancing overall functionality.

greptile-apps · 2026-03-06T23:22:10Z

Greptile Summary

This PR improves the DeepResearcherAgent in two main areas: (1) a new _extract_report_content helper that falls back to write_file tool-call arguments when the agent writes its report via a tool rather than returning text, and (2) a significantly expanded resolve_url in SourceRegistry that adds four new fuzzy-matching strategies (truncation, normalized prefix, child-path, and query-subset) to tolerate common LLM citation distortions. On the frontend, token-based WebSocket auth is removed in favour of a separate auth channel, and PDF hyphenation is disabled to keep URLs intact.

Key changes:

- `_extract_report_content`: Extracts report text from the last message, falling back to `write_file` tool-call content when the text is shorter than `_MIN_REPORT_LENGTH` (1500 chars). The shared constant replaces the previously duplicated magic number.
- `resolve_url` refactor: Applies 5 ordered matching strategies (exact match plus the four new fuzzy ones) and adds a `_ParsedURL` cache; deduplication via `{e.url: e}.values()` in Strategies 2–3 correctly handles the raw and normalized keys that point to the same entry.
- Frontend auth cleanup: the `authToken` option, the `updateAuthToken` method, and the data-source filter that gated authenticated sources behind a valid `idToken` are all removed consistently across `websocket-client.ts` and `use-websocket-chat.ts`.

One logic concern: the `write_file` fallback in `_extract_report_content` selects the longest content from all `write_file` calls in the final message without checking `file_path`. If the agent writes a long intermediate artifact and a shorter `/report.md` in the same turn, the wrong content will be extracted (see inline comment).
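
To make the concern concrete, here is a minimal sketch of the fallback as the summary describes it, plus one possible guard. It is not the code from the PR: `FakeAIMessage` is a hypothetical stand-in for an AIMessage, the `file_path` and `content` argument keys are assumptions about the `write_file` tool schema, and the `prefer_path` parameter is invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any

_MIN_REPORT_LENGTH = 1500   # threshold named in the summary
REPORT_PATH = "/report.md"  # example path taken from the summary


@dataclass
class FakeAIMessage:
    """Hypothetical stand-in for an AIMessage; not the real LangChain class."""
    content: str = ""
    tool_calls: list[dict[str, Any]] = field(default_factory=list)


def extract_report_content(message: FakeAIMessage, prefer_path: str | None = None) -> str:
    """Return the message text, falling back to write_file tool-call content."""
    content = message.content or ""
    if len(content) >= _MIN_REPORT_LENGTH:
        return content

    # Collect the arguments of every write_file call in this message.
    writes = [
        call.get("args", {})
        for call in message.tool_calls
        if call.get("name") == "write_file"
    ]

    # Possible guard (assumption): if any call targets the report path,
    # consider only those calls instead of every write_file call.
    if prefer_path is not None:
        targeted = [args for args in writes if args.get("file_path") == prefer_path]
        if targeted:
            writes = targeted

    # Behaviour described in the review: the longest content wins.
    for args in writes:
        candidate = args.get("content", "")
        if isinstance(candidate, str) and len(candidate) > len(content):
            content = candidate
    return content
```

With `prefer_path=None` the sketch reproduces the reported edge case: a 3000-character `/notes.md` written alongside a 2000-character `/report.md` wins the selection. Passing `prefer_path=REPORT_PATH` returns the report content instead.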

Confidence Score: 4/5

- Safe to merge after addressing the `write_file` content-selection bug in `_extract_report_content`.
- The citation-verification refactor is well-tested and the deduplication fixes are correct. The frontend auth removal is internally consistent. The one logic issue (the `write_file` fallback picking the longest content regardless of file path) is low-severity in practice, since multiple `write_file` calls in a single final turn are unusual, but it could silently cause incorrect report extraction in edge cases.
- File to review: `src/aiq_agent/agents/deep_researcher/agent.py`, specifically the `_extract_report_content` `write_file` content-selection heuristic.

Important Files Changed

| Filename | Overview |
| --- | --- |
| `src/aiq_agent/agents/deep_researcher/agent.py` | Introduces `_extract_report_content` and a feedback-retry exception-handling block. The write_file content-selection heuristic (longest content wins) may pick up the wrong file when multiple write_file tool calls exist in the final turn. |
| `src/aiq_agent/common/citation_verification.py` | Significant refactor of `resolve_url` with 5 matching strategies and a new `_ParsedURL` cache. Deduplication in Strategies 2–3 is correct. Strategy 4 correctly uses segment-boundary checks. `_parsed_urls` uses normalized-only keys, avoiding the raw+normalized duplicate-ambiguity problem in `_urls`. |
| `tests/aiq_agent/agents/deep_researcher/test_agent.py` | Adds a test for the write_file tool-call extraction path. Only covers the single-write_file case; multi-write_file disambiguation (the bug described above) is not tested. |
| `tests/aiq_agent/common/test_citation_verification.py` | Adds thorough tests for the new resolution strategies. `test_resolve_url_query_subset_reordered_params` correctly isolates Strategy 5 (previous tests hit Strategy 2 first). Child-path depth requirement and domain-isolation tests are well-constructed. |
| `frontends/ui/src/adapters/api/websocket-client.ts` | Removes token-based WebSocket auth (query-param injection and `updateAuthToken` method). Change appears intentional per prior review discussion; remaining auth mechanism is now outside this client's scope. |
| `frontends/ui/src/features/chat/hooks/use-websocket-chat.ts` | Removes `idToken` usage and the data-source filter that excluded authenticated sources when no token was present. Consistent with the removal of token-based auth in the client. The `WEB_SEARCH_SOURCE_ID` import is cleanly removed alongside the dead filter. |
| `frontends/ui/src/lib/pdf/ReactPdfDocument.tsx` | Disables react-pdf's default English hyphenation callback so that long words (especially URLs) are never broken with hyphens in the exported PDF. Simple, well-commented change. |
| `frontends/ui/src/features/chat/hooks/use-websocket-chat.spec.ts` | Removes the `updateAuthToken` mock from the test fixture to match the removal of the method in the client. Straightforward cleanup. |
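
The raw+normalized dual-key point is easier to see with a small example. The sketch below assumes a registry dict that stores the same entry under both its raw and its normalized URL. `SourceEntry` and the key layout are hypothetical, chosen only to show why a prefix scan over such a dict needs the `{e.url: e}.values()` deduplication step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceEntry:
    """Hypothetical registry entry; only the url field matters here."""
    url: str


entry = SourceEntry(url="https://example.com/paper/")

# Assumed layout: one entry reachable through its raw and its normalized key
# (normalization here is taken to strip the trailing slash).
urls = {
    "https://example.com/paper/": entry,
    "https://example.com/paper": entry,
}

report_url = "https://example.com/pap"  # a citation truncated by the LLM

# A plain prefix scan sees the same entry twice, once per key.
matches = [e for key, e in urls.items() if key.startswith(report_url)]

# Keying by the entry's own URL collapses the duplicates, so a uniqueness
# check no longer mistakes one source for an ambiguous pair.
unique = list({e.url: e for e in matches}.values())
```

Without the deduplication, a "return only when exactly one match" rule would reject this citation even though both hits are the same source.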

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Last Message] --> B{content length\n>= 1500?}
    B -- Yes --> C[Return text content]
    B -- No --> D{Is AIMessage\nwith tool_calls?}
    D -- No --> E[Return short content]
    D -- Yes --> F[Iterate write_file calls]
    F --> G{file_content longer\nthan current?}
    G -- Yes --> H[Update content\n⚠️ no file_path check]
    G -- No --> F
    H --> F
    F -- done --> I[Return content]

    subgraph resolve_url [SourceRegistry.resolve_url]
        S1{1. Exact match\nraw or normalized?}
        S1 -- hit --> R1[Return entry URL]
        S1 -- miss --> S2{2. Truncation:\nregistry URL\nstarts with report URL?}
        S2 -- 1 match --> R2[Return entry URL]
        S2 -- ambiguous/none --> S3{3. Prefix:\nnorm registry\nstarts with norm report?}
        S3 -- 1 match --> R3[Return entry URL]
        S3 -- ambiguous/none --> S4{4. Child-path:\nreport path under\nregistry path?}
        S4 -- 1 match --> R4[Return entry URL]
        S4 -- ambiguous/none --> S5{5. Query-subset:\nsame host+path,\nreport params ⊆ registry?}
        S5 -- 1 match --> R5[Return entry URL]
        S5 -- ambiguous/none --> RN[Return None]
    end
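
The `resolve_url` cascade in the flowchart above can be approximated in a few lines. This is a sketch under stated assumptions, not the `SourceRegistry` implementation: the registry is a plain list of URLs, and the normalization rules (ignore scheme, `www.`, fragment, and trailing slash) as well as the non-empty-path requirement for child-path matches are guesses made for illustration.

```python
from __future__ import annotations

from urllib.parse import parse_qsl, urlsplit


def _parts(url: str) -> tuple[str, str, frozenset]:
    """Split a URL into (host, path, query params) under the assumed normalization."""
    p = urlsplit(url.strip())
    host = p.netloc.lower().removeprefix("www.")
    return host, p.path.rstrip("/"), frozenset(parse_qsl(p.query))


def normalize(url: str) -> str:
    """Assumed normalization: ignore scheme, 'www.', fragment, and trailing slash."""
    query = urlsplit(url.strip()).query
    host, path, _ = _parts(url)
    return f"{host}{path}" + (f"?{query}" if query else "")


def _unique(matches: list[str]) -> str | None:
    """Return the match only when exactly one distinct URL qualifies."""
    distinct = list(dict.fromkeys(matches))
    return distinct[0] if len(distinct) == 1 else None


def resolve_url(report_url: str, registry: list[str]) -> str | None:
    if not report_url:
        return None
    norm = normalize(report_url)
    host, path, params = _parts(report_url)

    # 1. Exact match on the raw or normalized form.
    for url in registry:
        if url == report_url or normalize(url) == norm:
            return url

    strategies = [
        # 2. Truncation: the raw registry URL starts with the raw report URL.
        lambda u: u.startswith(report_url),
        # 3. Prefix: the same test on normalized forms.
        lambda u: normalize(u).startswith(norm),
        # 4. Child-path: same host, report path below a non-empty registry
        #    path, compared on a "/" segment boundary.
        lambda u: _parts(u)[0] == host
        and _parts(u)[1] != ""
        and path.startswith(_parts(u)[1] + "/"),
        # 5. Query-subset: same host and path, report params within registry params.
        lambda u: _parts(u)[:2] == (host, path) and params <= _parts(u)[2],
    ]
    for strategy in strategies:
        hit = _unique([u for u in registry if strategy(u)])
        if hit is not None:
            return hit  # exactly one candidate; ambiguous or empty falls through
    return None
```

As in the flowchart, an ambiguous strategy does not abort the lookup; it simply falls through to the next one. The segment-boundary check in step 4 is what keeps `/docsify/x` from being treated as a child of `/docs`.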

Last reviewed commit: 6254507
