fix(guardrails): prevent bypass of blocked Tool Result Policy when context is sensitive by wengkit218-pixel · Pull Request #4250 · archestra-ai/archestra

wengkit218-pixel · 2026-05-01T06:44:46Z

Summary

When an agent has "Treat context as sensitive from the start of chat" enabled, blocked Tool Result Policies were bypassed: the raw tool result was sent to the model instead of being replaced with the blocked message.

Root Cause

In `platform/backend/src/guardrails/trusted-data.ts`, `evaluateIfContextIsTrusted()` returned early (around line 68) when `considerContextUntrusted` was `true`. It therefore skipped the actual Tool Result Policy evaluation (around line 125) and returned empty `toolResultUpdates`.
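The following is a minimal sketch of the pre-fix control flow, not the actual code in `trusted-data.ts`. The types, the `evaluateBeforeFix` name, the policy map, and the rule that a blocked result marks the context untrusted are all assumptions made for illustration; only `considerContextUntrusted`, `contextIsTrusted`, `toolResultUpdates`, and the blocked message come from this PR.

```typescript
// Hypothetical, simplified types; the real ones in trusted-data.ts may differ.
type ResultPolicy = "allow" | "blocked";

interface ToolResult {
  toolCallId: string;
  toolName: string;
  content: string;
}

interface Evaluation {
  contextIsTrusted: boolean;
  // Replacement content keyed by tool call id.
  toolResultUpdates: Record<string, string>;
}

const BLOCKED_MESSAGE = "[Content blocked by policy]";

// Sketch of the buggy shape: the sensitive-context flag short-circuits the
// function before any Tool Result Policy is consulted.
function evaluateBeforeFix(
  results: ToolResult[],
  policies: Record<string, ResultPolicy>,
  considerContextUntrusted: boolean,
): Evaluation {
  if (considerContextUntrusted) {
    // Early return: no per-result evaluation, so nothing gets replaced.
    return { contextIsTrusted: false, toolResultUpdates: {} };
  }

  const toolResultUpdates: Record<string, string> = {};
  let contextIsTrusted = true;
  for (const result of results) {
    if (policies[result.toolName] === "blocked") {
      toolResultUpdates[result.toolCallId] = BLOCKED_MESSAGE;
      // Assumption for this sketch: a blocked result also makes the context untrusted.
      contextIsTrusted = false;
    }
  }
  return { contextIsTrusted, toolResultUpdates };
}

const results: ToolResult[] = [
  { toolCallId: "call_1", toolName: "read_issue", content: "raw issue body" },
];
const policies: Record<string, ResultPolicy> = { read_issue: "blocked" };

// Flag off: the blocked policy produces a replacement for call_1.
const normal = evaluateBeforeFix(results, policies, false);
// Flag on: the updates come back empty, so the raw result would be left in place.
const sensitive = evaluateBeforeFix(results, policies, true);
console.log(normal.toolResultUpdates, sensitive.toolResultUpdates);
```

In this sketch the sensitive-context path reports the context as untrusted but never records a replacement, which matches the symptom described above.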

Fix

Instead of returning early when `considerContextUntrusted` is `true`, we now:

Set `contextIsTrusted = false`
Set `unsafeContextBoundary` to mark the context as untrusted
Continue to evaluate Tool Result Policies for all tool calls

This ensures that blocked tool results are replaced with `[Content blocked by policy]` before being sent to the model.
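The three steps can be sketched roughly as follows. This is an illustration under assumptions rather than the code in this PR: the `evaluateAfterFix` name, the simplified types, the policy map, and the shape of `unsafeContextBoundary` (here an object with a `reason` field) are hypothetical.

```typescript
// Hypothetical, simplified types; the real ones in trusted-data.ts may differ.
type Policy = "allow" | "blocked";

interface ToolResultItem {
  toolCallId: string;
  toolName: string;
  content: string;
}

interface Outcome {
  contextIsTrusted: boolean;
  // Assumed shape: the real unsafeContextBoundary may carry different fields.
  unsafeContextBoundary: { reason: string } | null;
  toolResultUpdates: Record<string, string>;
}

const BLOCKED = "[Content blocked by policy]";

function evaluateAfterFix(
  items: ToolResultItem[],
  policyByTool: Record<string, Policy>,
  considerContextUntrusted: boolean,
): Outcome {
  let contextIsTrusted = true;
  let unsafeContextBoundary: { reason: string } | null = null;

  if (considerContextUntrusted) {
    // Steps 1 and 2: record the untrusted state, but do not return here.
    contextIsTrusted = false;
    unsafeContextBoundary = { reason: "sensitive_from_start" };
  }

  // Step 3: Tool Result Policies are evaluated for every tool call,
  // regardless of the sensitive-context flag.
  const toolResultUpdates: Record<string, string> = {};
  for (const item of items) {
    if (policyByTool[item.toolName] === "blocked") {
      toolResultUpdates[item.toolCallId] = BLOCKED;
    }
  }

  return { contextIsTrusted, unsafeContextBoundary, toolResultUpdates };
}

const outcome = evaluateAfterFix(
  [{ toolCallId: "call_1", toolName: "read_issue", content: "raw issue body" }],
  { read_issue: "blocked" },
  true,
);
console.log(outcome);
```

The point of the sketch is that the flag only changes the trust state; it no longer decides whether policies run.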

Test Plan

To reproduce the bug:

Configure a tool like `read_issue`
Set Tool Call Policy so the call is allowed
Set Tool Result Policy to "Blocked"
Enable "Treat context as sensitive from the start of chat"
Ask the agent to read an issue

Before fix: The raw tool result is sent in the model-facing LLM request.

After fix: The tool result is replaced with `[Content blocked by policy]` before being sent to the model.
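The before/after expectation could be checked automatically along these lines. This is a hedged sketch: `ChatMessage`, `applyToolResultUpdates`, and the message shape are hypothetical stand-ins for however the backend builds the model-facing request, and the `updates` value is written by hand to represent what the fixed evaluation is expected to return.

```typescript
// Hypothetical message shape for a model-facing request.
interface ChatMessage {
  role: "user" | "assistant" | "tool";
  content: string;
  toolCallId?: string;
}

// Replace the content of any tool message that has an entry in `updates`.
function applyToolResultUpdates(
  messages: ChatMessage[],
  updates: Record<string, string>,
): ChatMessage[] {
  return messages.map((message) => {
    if (message.role !== "tool" || message.toolCallId === undefined) {
      return message;
    }
    const replacement = updates[message.toolCallId];
    return replacement === undefined
      ? message
      : { ...message, content: replacement };
  });
}

const rawIssueBody = "SECRET: raw issue body";
const conversation: ChatMessage[] = [
  { role: "user", content: "Please read the issue." },
  { role: "tool", toolCallId: "call_1", content: rawIssueBody },
];

// Hand-written stand-in for the updates the fixed evaluation should produce.
const updates: Record<string, string> = {
  call_1: "[Content blocked by policy]",
};

const modelFacing = applyToolResultUpdates(conversation, updates);
const serialized = JSON.stringify(modelFacing);

// The raw tool result must not appear anywhere in the request sent to the model.
const leaked = serialized.indexOf(rawIssueBody) !== -1;
console.log(leaked ? "raw result leaked" : "raw result blocked");
```

With empty `updates`, as returned before the fix, the same check would report a leak, which is the failure this test plan is meant to catch.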

/claim #4225

…ntext is sensitive

When an agent has 'Treat context as sensitive from the start of chat' enabled, blocked Tool Result Policies were being bypassed. The function returned early when considerContextUntrusted was true, skipping the Tool Result Policy evaluation entirely.

Root cause: In trusted-data.ts, evaluateIfContextIsTrusted() returned early around line 68 when considerContextUntrusted was true, so it skipped the real Tool Result Policy evaluation around line 125 and returned empty toolResultUpdates.

Fix: Instead of returning early, we now set contextIsTrusted=false and unsafeContextBoundary, then continue to evaluate Tool Result Policies. This ensures blocked tool results are properly replaced with the blocked message before being sent to the model.

Fixes archestra-ai#4225
/claim archestra-ai#4225

CLAassistant · 2026-05-01T06:44:57Z

All committers have signed the CLA.

algora-pbc Bot added the 🙋 Bounty claim label May 1, 2026

algora-pbc Bot mentioned this pull request May 1, 2026

Blocked Tool Result Policy bypass when agent starts in sensitive context #4225

Open

Merge branch 'main' into fix/blocked-tool-result-policy-bypass-4225

50ac6f8

github-actions Bot requested a review from iskhakov May 5, 2026 15:40

wengkit218-pixel wants to merge 2 commits into archestra-ai:main from wengkit218-pixel:fix/blocked-tool-result-policy-bypass-4225

2 participants
