Skip to content

fix: PR #151 review findings — thread-safety, fail-closed availability, translator regressions#152

Open
madhakish wants to merge 1 commit into
large-yolo-refactorfrom
pr-151-review-fixes
Open

fix: PR #151 review findings — thread-safety, fail-closed availability, translator regressions#152
madhakish wants to merge 1 commit into
large-yolo-refactorfrom
pr-151-review-fixes

Conversation

@madhakish

Copy link
Copy Markdown

Stacked on #151. Addresses the merge-blocking subset of review findings (15 critical + 9 important + test order-dependence + doc accuracy). 33 files, +397/−175.

Verification

  • bundle exec rubocop: 0 offenses (490 files)
  • bundle exec rspec: 3247 examples, 1 failure, 0 pending
    • Net vs base: +10 examples, −1 pending (un-skipped executor_spec.rb:576), 0 regressions
    • The 1 failure (ask_spec.rb:413try_defer not reached via chat_direct_governed) is pre-existing on large-yolo-refactor (verified via stash) and touches files outside this diff

Critical (C1–C15)

# Fix Files
C1 Availability.filter_resolutions / Candidates.select_candidates return [result, reasons/trace] instead of unsynchronized module ivars; threaded via chain.rejection_reasons router/{availability,candidates}.rb, router.rb, executor/escalation.rb
C2 HealthTracker.known_instances wrapped in Monitor router/health_tracker.rb
C3 rejection_reason fail-closed → :availability_check_error on exception router/availability.rb
C4 :success health report ungated from @resolved_offering_id (half-open circuits can close) executor/escalation.rb
C5 internal_error? guard — daemon NoMethodError/ArgumentError no longer trips provider circuits executor/escalation.rb
C6 Remove bare /ValidationException/ from payload patterns; bedrock_availability_error? → deny/trip executor.rb, executor/escalation.rb
C7 routing[:model] ||= body[:model] — SDK clients' model param honored client_translators/{anthropic_messages,openai_chat,openai_responses}.rb
C8 OpenAI Chat streaming tool_calls[].index = 0-based ordinal, not global block_index client_translators/openai_chat.rb
C9 OpenAI Responses streaming: track server_tool; only client tools → requires_action client_translators/openai_responses.rb
C10 Anthropic: preserve thinking_blocks on assistant parse for same-provider replay client_translators/anthropic_messages.rb
C11 Anthropic: preserve image/document as content_blocks instead of Hash#to_s client_translators/anthropic_messages.rb
C12 parse_arguments: log non-Hash JSON + truncated raw on parse failure call/dispatch.rb
C13 safe_call: narrow rescue ExceptionNoMethodError/StandardError + RSpec guard api/stream_assembler.rb
C14 step_tier_assignment: re-raise (fail-closed for privacy/security tier constraints) executor/routing.rb
C15 merge_defaults: strip-and-warn deprecated :capability from settings default_intent (still raises on caller intent) router.rb

Important (I1–I3, I6, I7, I12, I13, I16–I18)

  • I1 Capabilities.normalize: canonical-replace aliases ([:tools] now satisfies required: [:function_calling])
  • I2 RoutingUnavailable/TooEarly retryable? = true; FailedDependency explicit false; all carry reasons:
  • I3 routing_empty_chain_error(reasons): maps actual Availability symbols (:discovery_unavailable → 425, :circuit_open/etc → 424)
  • I6 resolve_chain forwards estimated_tokens/required_capabilities to chain_from_defaults
  • I7 normalize_model_for_rules copies :loaded so loaded_model_bonus actually fires
  • I12 handle_text_delta closes open thinking block before opening text
  • I13 Puma::ConnectionError added to client-disconnect rescues (assembler + 3 routes)
  • I16 0.90 context headroom unified → Availability.context_headroom (3 sites)
  • I17 handle_exception on bare rescues (tools/dispatcher, tools/special, debug_formats)
  • I18 Capabilities gets extend Legion::Logging::Helper

Tests

  • Registry teardown in executor_escalation_circuit_spec; FakeProvider.reset! clears :fake_server_tool_round
  • execution_proxy_matrix_spec: singleton_class.prependwrap_original (no permanent mutation)
  • executor_spec.rb:576 un-skipped (root cause was Registry leak, fixed above)
  • Spec updates for return-tuple signatures, canonical capability aliases, retryable contracts, C7/C15 behavior

Docs

  • candidates.rb/routing.rb/escalation.rb: drop false 'verbatim/no behavior change' claims; note 0.12.20/0.12.30/0.12.35 extensions
  • StreamAssembler + CLAUDE.md Invariant 5: keep-alive is single-fire on tool-open, not periodic
  • Remove dangling refs (P4b/P5/G23/A7-fixture); candidates.rb:235 orphan fragment; CLAUDE.md example-count/relative-date

Deferred to follow-up (needs @Esity input)

# Item Why deferred
I4/I5 skip_same_tier! vs max_attempts cap; chain.size vs effective attempts Changes escalation iteration semantics — author should decide
I8 Dispatch instance-fallback attribution Behavior decision: fail vs fallback-with-correct-attribution
I9/I10/I11 Responses streaming reasoning shape; instructionssystem; unresolved server-tool leak G24 contract decisions
I14 Restore **_opts on Router.resolve Intentional API tightening?
I15 Six false→true setting flips Confirm intentional or split out
StreamAssembler (706L, 1 example) + ContextWindow unit suites Test debt, separate PR
Fail-open → fail-closed sweep on remaining availability predicates Policy decision
ask_spec.rb:413 pre-existing failure Outside this diff

Merge order

Either: (a) merge this into large-yolo-refactor then merge #151, or (b) merge #151 first then retarget this to main (same base commit 4aa4240).

Note: #151 is currently CODEOWNERS-deadlocked — * @Esity is the effective last-match rule and Esity is the author. Needs admin merge or CODEOWNERS fix.

…y, translator regressions

Critical fixes (C1–C15):
- Router::Availability/Candidates: return [result, reasons/trace] instead of unsynchronized module ivars (C1)
- HealthTracker.known_instances: wrap hash key reads in Monitor (C2)
- Availability.rejection_reason: fail-closed → :availability_check_error on exception (C3)
- Escalation: ungate :success health report from @resolved_offering_id so half-open circuits can close (C4)
- Escalation: internal_error? guard — daemon NoMethodError/ArgumentError no longer trips provider circuits (C5)
- Escalation: narrow ValidationException; bedrock_availability_error? routes model-not-enabled to deny/trip (C6)
- Translators: routing[:model] ||= body[:model] — SDK clients' model param honored again (C7)
- OpenAI Chat Events: 0-based per-stream tool ordinal, not global block_index (C8)
- OpenAI Responses Events: track server_tool; only client tools yield requires_action (C9)
- Anthropic: preserve thinking_blocks on assistant parse for same-provider replay (C10)
- Anthropic: preserve image/document content_blocks instead of Hash#to_s (C11)
- Dispatch.parse_arguments: log non-Hash JSON + raw args on parse failure (C12)
- StreamAssembler.safe_call: narrow rescue Exception → NoMethodError/StandardError + RSpec guard (C13)
- Executor::Routing.step_tier_assignment: re-raise (fail-closed for privacy constraints) (C14)
- Router.merge_defaults: strip-and-warn deprecated :capability from settings default_intent (C15)

Important fixes (I1–I3, I6, I7, I12, I13, I16–I18):
- Capabilities.normalize: canonical-replace aliases (fixes :tools ⊅ :function_calling false negative)
- RoutingUnavailable/TooEarly retryable? = true; FailedDependency explicit false; reasons: kwarg
- routing_empty_chain_error: map actual Availability reason symbols; thread reasons via chain.rejection_reasons
- resolve_chain: forward estimated_tokens/required_capabilities to chain_from_defaults
- Discovery.normalize_model_for_rules: copy :loaded so loaded_model_bonus actually fires
- StreamAssembler.handle_text_delta: close open thinking block before opening text
- Puma::ConnectionError added to client-disconnect rescues (assembler + 3 routes)
- Unify 0.90 context headroom → Availability.context_headroom (3 sites)
- handle_exception on bare rescues in tools/{dispatcher,special} + debug_formats
- Capabilities: extend Legion::Logging::Helper

Tests:
- Registry teardown in escalation specs; FakeProvider.reset! clears :fake_server_tool_round
- execution_proxy_matrix: prepend → wrap_original (no permanent singleton mutation)
- executor_spec:576 un-skipped (root cause was Registry leak)
- availability/capabilities/errors/routing_headers/determinism specs updated for signature changes

Docs:
- candidates.rb / routing.rb / escalation.rb: drop false 'verbatim/no behavior change' claims
- StreamAssembler + CLAUDE.md Invariant 5: keep-alive is single-fire, not periodic
- Drop dangling refs to non-existent docs (P4b/P5/G23/A7-fixture)
- candidates.rb:235 orphaned comment fragment removed

Deferred (needs author input — see PR #151 review):
- I4/I5 escalation chain-window semantics (skip_same_tier vs max_attempts cap)
- I8 instance-fallback attribution, I9/I10/I11 OpenAI Responses streaming reasoning shape
- I14 **_opts restore, I15 settings default flips
- StreamAssembler/ContextWindow unit test suites

Pre-existing failure on base (not introduced here, not in diff):
  spec/legion/llm/ask_spec.rb:413 — try_defer not reached via chat_direct_governed path
@madhakish madhakish requested a review from a team as a code owner June 17, 2026 20:30
Copilot AI review requested due to automatic review settings June 17, 2026 20:30

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR applies the merge-blocking review fixes stacked on #151, focusing on thread-safety in routing/availability, fail-closed availability behavior, and execution-proxy translator streaming regressions across OpenAI/Anthropic client dialects.

Changes:

  • Remove thread-unsafe module ivars by threading candidate/availability traces and rejection reasons through routing chains, and add locking to health tracking.
  • Harden routing/executor behavior (fail-closed tier assignment, typed routing errors with retryable contracts, refined circuit/deny/trip classification).
  • Fix/adjust client translators + stream assembly behavior (model fallback, tool streaming semantics, thinking/modality preservation, disconnect handling) with corresponding spec and doc updates.

Reviewed changes

Copilot reviewed 33 out of 33 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
spec/support/fake_provider.rb Reset additional thread-local fake-provider state to avoid spec leakage.
spec/legion/llm/router/multi_instance_spec.rb Update stubs for new filter_resolutions return tuple.
spec/legion/llm/router/determinism_spec.rb Update behavior/expectations for deprecated :capability default_intent handling and tuple returns.
spec/legion/llm/router/availability_spec.rb Adapt specs to [filtered, reasons] return and add fail-closed availability-check error coverage.
spec/legion/llm/inference/executor_spec.rb Re-enable previously skipped test and ensure provider teardown.
spec/legion/llm/inference/executor_escalation_circuit_spec.rb Add registry teardown and refine empty-chain typed error expectations.
spec/legion/llm/errors_spec.rb Add coverage for retryable and reasons:-carrying routing errors.
spec/legion/llm/capabilities_spec.rb Update capability alias normalization semantics and add alias/canonical matching test.
spec/legion/llm/api/matrix/execution_proxy_matrix_spec.rb Replace singleton prepend mutation with and_wrap_original to prevent order dependence.
spec/legion/llm/api/client_translators/routing_headers_spec.rb Align routing/model semantics: body model captured and used as routing fallback when header absent.
lib/legion/llm/tools/special.rb Add handle_exception on settings access rescue to avoid silent failures.
lib/legion/llm/tools/dispatcher.rb Same as above for tool dispatcher settings access.
lib/legion/llm/router/health_tracker.rb Synchronize known_instances access to circuit/latency keys.
lib/legion/llm/router/candidates.rb Return [final, trace] instead of using shared ivars; unify context headroom constant source.
lib/legion/llm/router/availability.rb Return [filtered, reasons], fail-closed on unexpected exceptions, adjust logging.
lib/legion/llm/router.rb Thread rejection reasons through escalation chains; strip deprecated :capability from defaults with warning; forward caps/tokens to defaults chain-building.
lib/legion/llm/inference/route_attempts.rb Use shared context headroom value for final context budget enforcement.
lib/legion/llm/inference/executor/routing.rb Fail-closed tier assignment by re-raising on assigner errors.
lib/legion/llm/inference/executor/escalation.rb Typed routing error selection using rejection reasons; refine internal/bedrock-availability classification; ensure success health reports emit.
lib/legion/llm/inference/executor.rb Remove overly-broad ValidationException payload classification.
lib/legion/llm/errors.rb Add reasons: plumbing and retryable semantics to routing errors.
lib/legion/llm/discovery.rb Preserve loaded metadata for rule scoring.
lib/legion/llm/capabilities.rb Canonicalize capability aliases (replace rather than add) and add logging helper.
lib/legion/llm/call/dispatch.rb Improve tool-argument parsing diagnostics and truncated raw logging on parse failure.
lib/legion/llm/api/stream_assembler.rb Add disconnect rescue for Puma, close thinking before text, safer chunk probing, updated keep-alive semantics.
lib/legion/llm/api/namespaces/openai/responses.rb Treat Puma disconnect as client cancellation in SSE route.
lib/legion/llm/api/namespaces/openai/chat/completions.rb Treat Puma disconnect as client cancellation in SSE route.
lib/legion/llm/api/namespaces/anthropic/messages.rb Treat Puma disconnect as client cancellation in SSE route.
lib/legion/llm/api/debug_formats.rb Handle write failures via handle_exception for echo-request SSE emission.
lib/legion/llm/api/client_translators/openai_responses.rb Route model fallback from body when header absent; track server_tool and only surface actionable tools as requires_action.
lib/legion/llm/api/client_translators/openai_chat.rb Route model fallback from body when header absent; adjust streaming tool-call indexing logic.
lib/legion/llm/api/client_translators/anthropic_messages.rb Route model fallback from body when header absent; preserve thinking blocks and structured content blocks for replay.
CLAUDE.md Update doc accuracy around matrix harness and keep-alive semantics.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 279 to 287
def on_tool_call_open(block_index:, tool_call:, server_tool:)
_ = server_tool
# Send the initial tool_call with name. Index is 0-based across
# this streaming response; chat.completion.chunk references it
# by index in the delta.
# Send the initial tool_call with name. Index is the 0-based
# tool-call ordinal across this streaming response (NOT the
# assembler's global block index); chat.completion.chunk
# references it by index in the delta.
idx = (@tool_ordinal += 1)
@tool_ordinals[block_index] = idx
emit_chunk(delta_envelope({
Comment on lines 300 to 309
def on_tool_call_delta(block_index:, partial_arguments_json:)
return if partial_arguments_json.to_s.empty?

idx = @tool_ordinals[block_index]
emit_chunk(delta_envelope({
tool_calls: [{
index: block_index,
index: idx,
function: { arguments: partial_arguments_json }
}]
}))
@Esity Esity changed the base branch from large-yolo-refactor to main June 17, 2026 20:36
@Esity Esity self-assigned this Jun 17, 2026
@Esity Esity self-requested a review June 17, 2026 20:37

@Esity Esity left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • fix the merge conflicts
  • you need to bump the version.rb to 0.13.1
  • Add a v0.13.1 CHANGELOG.md entry to match

@Esity Esity added bug Something isn't working area:extensions Extension system / LEX area:llm LLM integration / AI providers labels Jun 17, 2026
@Esity Esity changed the base branch from main to large-yolo-refactor June 18, 2026 16:10
Esity added a commit that referenced this pull request Jun 19, 2026
…hausted/InvalidHeader

Adds the stateless Router.request_lane(**routing_payload) alongside existing
Router.resolve/resolve_chain. Pure function: hard filters → drop tried_lanes
and lane_weight ≤ 0 → group_by(weight) → max bucket → .sample(rng).

Capability normalization via Inventory::Capabilities.normalize on both sides
so :tools/:function_calling/:tool_use are aliases (PR #152 I1 / H-C).

M1 fast path: single provider/instance narrows to Inventory.lanes_for index.
B-E guard: group_by { |lane| lane[:lane_weight] }, never &:lane_weight (Hash).

Adds to errors.rb:
  - Legion::LLM::Errors::NoLaneAvailable (filters:, retryable? = false)
  - Legion::LLM::Errors::EscalationExhausted (attempts:, tried_lanes:, last_error:)
  - Legion::LLM::Errors::InvalidHeader (header:, got:, valid:)

.rubocop.yml: CountKeywordArgs: false for Metrics/ParameterLists so the all-
kwargs method signature is idiomatic; removes redundant disable directives.
Esity added a commit that referenced this pull request Jun 19, 2026
Fills spec/legion/llm/router/request_lane_spec.rb with 40 examples covering:
  - Hard filters: type, tiers, providers, instances, models, capabilities,
    thinking (:require/:forbid/:any), privacy: :strict, estimated_context
  - Capability normalization: :tools ↔ :function_calling ↔ :tool_use (PR #152 I1)
    Fixed: canonicalize_capabilities() collapses all aliases to canonical form
    for set-difference; bidirectional matching now works correctly.
  - tried_lanes exclusion: by id, fallback to lower-weight bucket
  - lane_weight ≤ 0 filtering: open-circuit, denied, half_open (still eligible)
  - Bucket selection: max-weight bucket wins, seeded RNG determinism, spread
  - Exhaustion: nil when filtered, nil when all tried, nil when empty
  - No hail-mary (G24): nil even when default lane exists + last_resort_default
    setting has no effect
  - Bucket-G25 determinism: same seed = same lane, different seeds spread
  - M1 optimization: lanes_for used for single-provider filter

Bug fix: lane_passes_hard_filters? now uses canonicalize_capabilities() which
collapses all capability aliases to their canonical form before comparing.
Previously normalize([:tool_use]) produced [:tool_use, :tools] — the non-
canonical :tool_use was never in the lane's normalized set, causing false
rejections for request :tool_use → lane :tools.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:extensions Extension system / LEX area:llm LLM integration / AI providers bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants