fix: PR #151 review findings — thread-safety, fail-closed availability, translator regressions#152
Open
madhakish wants to merge 1 commit into
Open
fix: PR #151 review findings — thread-safety, fail-closed availability, translator regressions#152madhakish wants to merge 1 commit into
madhakish wants to merge 1 commit into
Conversation
…y, translator regressions
Critical fixes (C1–C15):
- Router::Availability/Candidates: return [result, reasons/trace] instead of unsynchronized module ivars (C1)
- HealthTracker.known_instances: wrap hash key reads in Monitor (C2)
- Availability.rejection_reason: fail-closed → :availability_check_error on exception (C3)
- Escalation: ungate :success health report from @resolved_offering_id so half-open circuits can close (C4)
- Escalation: internal_error? guard — daemon NoMethodError/ArgumentError no longer trips provider circuits (C5)
- Escalation: narrow ValidationException; bedrock_availability_error? routes model-not-enabled to deny/trip (C6)
- Translators: routing[:model] ||= body[:model] — SDK clients' model param honored again (C7)
- OpenAI Chat Events: 0-based per-stream tool ordinal, not global block_index (C8)
- OpenAI Responses Events: track server_tool; only client tools yield requires_action (C9)
- Anthropic: preserve thinking_blocks on assistant parse for same-provider replay (C10)
- Anthropic: preserve image/document content_blocks instead of Hash#to_s (C11)
- Dispatch.parse_arguments: log non-Hash JSON + raw args on parse failure (C12)
- StreamAssembler.safe_call: narrow rescue Exception → NoMethodError/StandardError + RSpec guard (C13)
- Executor::Routing.step_tier_assignment: re-raise (fail-closed for privacy constraints) (C14)
- Router.merge_defaults: strip-and-warn deprecated :capability from settings default_intent (C15)
Important fixes (I1–I3, I6, I7, I12, I13, I16–I18):
- Capabilities.normalize: canonical-replace aliases (fixes :tools ⊅ :function_calling false negative)
- RoutingUnavailable/TooEarly retryable? = true; FailedDependency explicit false; reasons: kwarg
- routing_empty_chain_error: map actual Availability reason symbols; thread reasons via chain.rejection_reasons
- resolve_chain: forward estimated_tokens/required_capabilities to chain_from_defaults
- Discovery.normalize_model_for_rules: copy :loaded so loaded_model_bonus actually fires
- StreamAssembler.handle_text_delta: close open thinking block before opening text
- Puma::ConnectionError added to client-disconnect rescues (assembler + 3 routes)
- Unify 0.90 context headroom → Availability.context_headroom (3 sites)
- handle_exception on bare rescues in tools/{dispatcher,special} + debug_formats
- Capabilities: extend Legion::Logging::Helper
Tests:
- Registry teardown in escalation specs; FakeProvider.reset! clears :fake_server_tool_round
- execution_proxy_matrix: prepend → wrap_original (no permanent singleton mutation)
- executor_spec:576 un-skipped (root cause was Registry leak)
- availability/capabilities/errors/routing_headers/determinism specs updated for signature changes
Docs:
- candidates.rb / routing.rb / escalation.rb: drop false 'verbatim/no behavior change' claims
- StreamAssembler + CLAUDE.md Invariant 5: keep-alive is single-fire, not periodic
- Drop dangling refs to non-existent docs (P4b/P5/G23/A7-fixture)
- candidates.rb:235 orphaned comment fragment removed
Deferred (needs author input — see PR #151 review):
- I4/I5 escalation chain-window semantics (skip_same_tier vs max_attempts cap)
- I8 instance-fallback attribution, I9/I10/I11 OpenAI Responses streaming reasoning shape
- I14 **_opts restore, I15 settings default flips
- StreamAssembler/ContextWindow unit test suites
Pre-existing failure on base (not introduced here, not in diff):
spec/legion/llm/ask_spec.rb:413 — try_defer not reached via chat_direct_governed path
Contributor
There was a problem hiding this comment.
Pull request overview
This PR applies the merge-blocking review fixes stacked on #151, focusing on thread-safety in routing/availability, fail-closed availability behavior, and execution-proxy translator streaming regressions across OpenAI/Anthropic client dialects.
Changes:
- Remove thread-unsafe module ivars by threading candidate/availability traces and rejection reasons through routing chains, and add locking to health tracking.
- Harden routing/executor behavior (fail-closed tier assignment, typed routing errors with retryable contracts, refined circuit/deny/trip classification).
- Fix/adjust client translators + stream assembly behavior (model fallback, tool streaming semantics, thinking/modality preservation, disconnect handling) with corresponding spec and doc updates.
Reviewed changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| spec/support/fake_provider.rb | Reset additional thread-local fake-provider state to avoid spec leakage. |
| spec/legion/llm/router/multi_instance_spec.rb | Update stubs for new filter_resolutions return tuple. |
| spec/legion/llm/router/determinism_spec.rb | Update behavior/expectations for deprecated :capability default_intent handling and tuple returns. |
| spec/legion/llm/router/availability_spec.rb | Adapt specs to [filtered, reasons] return and add fail-closed availability-check error coverage. |
| spec/legion/llm/inference/executor_spec.rb | Re-enable previously skipped test and ensure provider teardown. |
| spec/legion/llm/inference/executor_escalation_circuit_spec.rb | Add registry teardown and refine empty-chain typed error expectations. |
| spec/legion/llm/errors_spec.rb | Add coverage for retryable and reasons:-carrying routing errors. |
| spec/legion/llm/capabilities_spec.rb | Update capability alias normalization semantics and add alias/canonical matching test. |
| spec/legion/llm/api/matrix/execution_proxy_matrix_spec.rb | Replace singleton prepend mutation with and_wrap_original to prevent order dependence. |
| spec/legion/llm/api/client_translators/routing_headers_spec.rb | Align routing/model semantics: body model captured and used as routing fallback when header absent. |
| lib/legion/llm/tools/special.rb | Add handle_exception on settings access rescue to avoid silent failures. |
| lib/legion/llm/tools/dispatcher.rb | Same as above for tool dispatcher settings access. |
| lib/legion/llm/router/health_tracker.rb | Synchronize known_instances access to circuit/latency keys. |
| lib/legion/llm/router/candidates.rb | Return [final, trace] instead of using shared ivars; unify context headroom constant source. |
| lib/legion/llm/router/availability.rb | Return [filtered, reasons], fail-closed on unexpected exceptions, adjust logging. |
| lib/legion/llm/router.rb | Thread rejection reasons through escalation chains; strip deprecated :capability from defaults with warning; forward caps/tokens to defaults chain-building. |
| lib/legion/llm/inference/route_attempts.rb | Use shared context headroom value for final context budget enforcement. |
| lib/legion/llm/inference/executor/routing.rb | Fail-closed tier assignment by re-raising on assigner errors. |
| lib/legion/llm/inference/executor/escalation.rb | Typed routing error selection using rejection reasons; refine internal/bedrock-availability classification; ensure success health reports emit. |
| lib/legion/llm/inference/executor.rb | Remove overly-broad ValidationException payload classification. |
| lib/legion/llm/errors.rb | Add reasons: plumbing and retryable semantics to routing errors. |
| lib/legion/llm/discovery.rb | Preserve loaded metadata for rule scoring. |
| lib/legion/llm/capabilities.rb | Canonicalize capability aliases (replace rather than add) and add logging helper. |
| lib/legion/llm/call/dispatch.rb | Improve tool-argument parsing diagnostics and truncated raw logging on parse failure. |
| lib/legion/llm/api/stream_assembler.rb | Add disconnect rescue for Puma, close thinking before text, safer chunk probing, updated keep-alive semantics. |
| lib/legion/llm/api/namespaces/openai/responses.rb | Treat Puma disconnect as client cancellation in SSE route. |
| lib/legion/llm/api/namespaces/openai/chat/completions.rb | Treat Puma disconnect as client cancellation in SSE route. |
| lib/legion/llm/api/namespaces/anthropic/messages.rb | Treat Puma disconnect as client cancellation in SSE route. |
| lib/legion/llm/api/debug_formats.rb | Handle write failures via handle_exception for echo-request SSE emission. |
| lib/legion/llm/api/client_translators/openai_responses.rb | Route model fallback from body when header absent; track server_tool and only surface actionable tools as requires_action. |
| lib/legion/llm/api/client_translators/openai_chat.rb | Route model fallback from body when header absent; adjust streaming tool-call indexing logic. |
| lib/legion/llm/api/client_translators/anthropic_messages.rb | Route model fallback from body when header absent; preserve thinking blocks and structured content blocks for replay. |
| CLAUDE.md | Update doc accuracy around matrix harness and keep-alive semantics. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
279
to
287
| def on_tool_call_open(block_index:, tool_call:, server_tool:) | ||
| _ = server_tool | ||
| # Send the initial tool_call with name. Index is 0-based across | ||
| # this streaming response; chat.completion.chunk references it | ||
| # by index in the delta. | ||
| # Send the initial tool_call with name. Index is the 0-based | ||
| # tool-call ordinal across this streaming response (NOT the | ||
| # assembler's global block index); chat.completion.chunk | ||
| # references it by index in the delta. | ||
| idx = (@tool_ordinal += 1) | ||
| @tool_ordinals[block_index] = idx | ||
| emit_chunk(delta_envelope({ |
Comment on lines
300
to
309
| def on_tool_call_delta(block_index:, partial_arguments_json:) | ||
| return if partial_arguments_json.to_s.empty? | ||
|
|
||
| idx = @tool_ordinals[block_index] | ||
| emit_chunk(delta_envelope({ | ||
| tool_calls: [{ | ||
| index: block_index, | ||
| index: idx, | ||
| function: { arguments: partial_arguments_json } | ||
| }] | ||
| })) |
Esity
requested changes
Jun 17, 2026
Esity
left a comment
Contributor
There was a problem hiding this comment.
- fix the merge conflicts
- you need to bump the version.rb to 0.13.1
- Add a v0.13.1 CHANGELOG.md entry to match
Esity
added a commit
that referenced
this pull request
Jun 19, 2026
…hausted/InvalidHeader Adds the stateless Router.request_lane(**routing_payload) alongside existing Router.resolve/resolve_chain. Pure function: hard filters → drop tried_lanes and lane_weight ≤ 0 → group_by(weight) → max bucket → .sample(rng). Capability normalization via Inventory::Capabilities.normalize on both sides so :tools/:function_calling/:tool_use are aliases (PR #152 I1 / H-C). M1 fast path: single provider/instance narrows to Inventory.lanes_for index. B-E guard: group_by { |lane| lane[:lane_weight] }, never &:lane_weight (Hash). Adds to errors.rb: - Legion::LLM::Errors::NoLaneAvailable (filters:, retryable? = false) - Legion::LLM::Errors::EscalationExhausted (attempts:, tried_lanes:, last_error:) - Legion::LLM::Errors::InvalidHeader (header:, got:, valid:) .rubocop.yml: CountKeywordArgs: false for Metrics/ParameterLists so the all- kwargs method signature is idiomatic; removes redundant disable directives.
Esity
added a commit
that referenced
this pull request
Jun 19, 2026
Fills spec/legion/llm/router/request_lane_spec.rb with 40 examples covering:
- Hard filters: type, tiers, providers, instances, models, capabilities,
thinking (:require/:forbid/:any), privacy: :strict, estimated_context
- Capability normalization: :tools ↔ :function_calling ↔ :tool_use (PR #152 I1)
Fixed: canonicalize_capabilities() collapses all aliases to canonical form
for set-difference; bidirectional matching now works correctly.
- tried_lanes exclusion: by id, fallback to lower-weight bucket
- lane_weight ≤ 0 filtering: open-circuit, denied, half_open (still eligible)
- Bucket selection: max-weight bucket wins, seeded RNG determinism, spread
- Exhaustion: nil when filtered, nil when all tried, nil when empty
- No hail-mary (G24): nil even when default lane exists + last_resort_default
setting has no effect
- Bucket-G25 determinism: same seed = same lane, different seeds spread
- M1 optimization: lanes_for used for single-provider filter
Bug fix: lane_passes_hard_filters? now uses canonicalize_capabilities() which
collapses all capability aliases to their canonical form before comparing.
Previously normalize([:tool_use]) produced [:tool_use, :tools] — the non-
canonical :tool_use was never in the lane's normalized set, causing false
rejections for request :tool_use → lane :tools.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Stacked on #151. Addresses the merge-blocking subset of review findings (15 critical + 9 important + test order-dependence + doc accuracy). 33 files, +397/−175.
Verification
bundle exec rubocop: 0 offenses (490 files)bundle exec rspec: 3247 examples, 1 failure, 0 pendingexecutor_spec.rb:576), 0 regressionsask_spec.rb:413—try_defernot reached viachat_direct_governed) is pre-existing onlarge-yolo-refactor(verified via stash) and touches files outside this diffCritical (C1–C15)
Availability.filter_resolutions/Candidates.select_candidatesreturn[result, reasons/trace]instead of unsynchronized module ivars; threaded viachain.rejection_reasonsHealthTracker.known_instanceswrapped in Monitorrejection_reasonfail-closed →:availability_check_erroron exception:successhealth report ungated from@resolved_offering_id(half-open circuits can close)internal_error?guard — daemonNoMethodError/ArgumentErrorno longer trips provider circuits/ValidationException/from payload patterns;bedrock_availability_error?→ deny/triprouting[:model] ||= body[:model]— SDK clients' model param honoredtool_calls[].index= 0-based ordinal, not global block_indexserver_tool; only client tools →requires_actionthinking_blockson assistant parse for same-provider replayimage/documentascontent_blocksinstead ofHash#to_sparse_arguments: log non-Hash JSON + truncated raw on parse failuresafe_call: narrowrescue Exception→NoMethodError/StandardError+ RSpec guardstep_tier_assignment: re-raise (fail-closed for privacy/security tier constraints)merge_defaults: strip-and-warn deprecated:capabilityfrom settingsdefault_intent(still raises on caller intent)Important (I1–I3, I6, I7, I12, I13, I16–I18)
Capabilities.normalize: canonical-replace aliases ([:tools]now satisfiesrequired: [:function_calling])RoutingUnavailable/TooEarlyretryable? = true;FailedDependencyexplicitfalse; all carryreasons:routing_empty_chain_error(reasons): maps actual Availability symbols (:discovery_unavailable→ 425,:circuit_open/etc → 424)resolve_chainforwardsestimated_tokens/required_capabilitiestochain_from_defaultsnormalize_model_for_rulescopies:loadedsoloaded_model_bonusactually fireshandle_text_deltacloses open thinking block before opening textPuma::ConnectionErroradded to client-disconnect rescues (assembler + 3 routes)0.90context headroom unified →Availability.context_headroom(3 sites)handle_exceptionon bare rescues (tools/dispatcher, tools/special, debug_formats)Capabilitiesgetsextend Legion::Logging::HelperTests
executor_escalation_circuit_spec;FakeProvider.reset!clears:fake_server_tool_roundexecution_proxy_matrix_spec:singleton_class.prepend→wrap_original(no permanent mutation)executor_spec.rb:576un-skipped (root cause was Registry leak, fixed above)Docs
candidates.rb/routing.rb/escalation.rb: drop false 'verbatim/no behavior change' claims; note 0.12.20/0.12.30/0.12.35 extensionscandidates.rb:235orphan fragment; CLAUDE.md example-count/relative-dateDeferred to follow-up (needs @Esity input)
skip_same_tier!vsmax_attemptscap;chain.sizevs effective attemptsDispatchinstance-fallback attributioninstructions→system; unresolved server-tool leak**_optsonRouter.resolvefalse→truesetting flipsStreamAssembler(706L, 1 example) +ContextWindowunit suitesask_spec.rb:413pre-existing failureMerge order
Either: (a) merge this into
large-yolo-refactorthen merge #151, or (b) merge #151 first then retarget this tomain(same base commit4aa4240).Note: #151 is currently CODEOWNERS-deadlocked —
* @Esityis the effective last-match rule and Esity is the author. Needs admin merge or CODEOWNERS fix.