Engine: warn on prose-shaped condition strings + accept procedure-pathway regimens #588
Merged
Algorithm decision trees contain 376 of 443 `condition:` strings (85%)
written as English prose ("ECOG PS 0-2", "BRCA1 or BRCA2 pathogenic").
`_eval_clause` only resolves flat finding keys, so these silently return
False — and in ~27% of audited algorithms, step-1 is entirely prose and
the tree falls through to `default_indication` on every patient.
Routing semantics unchanged. Added a one-time per-unique-string WARNING
when a `condition:` looks like prose AND the lookup missed. Flat ID-shape
keys (BIO-HER2, hcv_status, ECOG_PS) are NOT flagged, so existing
finding-key clauses keep working silently.
Detector heuristic: comparison operators, ` or ` / ` and ` connectives,
parens/commas, "space + lowercase" or "ALLCAPS space ALLCAPS" word
boundaries. Per-string dedup via module-level set with a test-only
reset hook.
Audit doc `docs/reviews/openonco-state-audit-2026-05-17.md` documents
scope, methodology, roadmap staleness (3 items already done on master),
and the orthogonal "structured condition AST" path forward (out of scope
here — Big-P3 workstream gated by clinical co-lead review per CHARTER
§6.1).
Tests: 6 new in test_prose_condition_warning.py — operator clause warns,
boolean-connective clause warns, ALL-CAPS flat key does NOT warn, real
finding lookup does NOT warn, dedup fires once per unique string,
structured threshold/value clauses unaffected. 909 adjacent engine tests
still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Initial figure was 8/30 (27%), sampled from the first 30 algorithm files. Full sweep over all 152 algorithms with a decision_tree shows 45/152 (30%) where step-1 evaluates entirely prose clauses and the tree falls through to default_indication on every patient.

Also flagged the warning's scope: condition: clauses only; finding: with a prose value is deliberate-author shape and is not flagged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`test_no_regression_all_244_legacy_yamls_load` failed on `reg_allohct_jmml.yaml`: 3 explicit phases, all with `components: []`, top-level `components: []`. Existing invariant required >=1 phase drug component on every explicit-phase regimen.

This is the procedure-pathway shape: the "treatment" is the stem-cell product (or surgical procedure), documented via phase purpose strings, not DRUG-* refs. The other 4 allohct regimens currently hack around the invariant by parking a placeholder drug (e.g. DRUG-CYTARABINE) inside conditioning, which is not what the regimen actually is.

Relaxed: when explicit phases have 0 drug components, accept iff top-level components is also 0. If top-level has drugs but phases don't, that's still the "author moved structure but left drugs at top level" bug and the test still fails for it.

Engine downstream doesn't depend on phase_drugs > 0 (grep `knowledge_base/engine/*.phases`: the render.py reference is to MonitoringSchedule.phases, a different concept). No clinical content edits. 14/14 phase tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
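
The relaxed invariant can be sketched as a small predicate. This is an illustration under assumptions: the function name `phase_shape_ok` and the dict layout (a `phases` list and a top-level `components` list, each phase with its own `components`) are hypothetical stand-ins for the loaded YAML, and the sketch counts every component as a drug component.

```python
def phase_shape_ok(regimen):
    """Return True when a regimen's phase/component shape is acceptable.

    Assumed layout (hypothetical): regimen["phases"] is a list of dicts, each
    with a "components" list; regimen["components"] is the top-level list.
    """
    phases = regimen.get("phases") or []
    if not phases:
        # No explicit phases: the explicit-phase invariant does not apply here.
        return True

    phase_drugs = sum(len(p.get("components") or []) for p in phases)
    top_level_drugs = len(regimen.get("components") or [])

    if phase_drugs > 0:
        # Original invariant: at least one drug component inside the phases.
        return True

    # Relaxed case: phases carry no drugs. Accept only the procedure-pathway
    # shape, where the top level carries no drugs either. Drugs left at the top
    # level with empty phases is still treated as an authoring bug.
    return top_level_drugs == 0


procedure_pathway = {
    "components": [],
    "phases": [{"components": []}, {"components": []}, {"components": []}],
}
drugs_left_at_top = {
    "components": [{"drug": "DRUG-CYTARABINE"}],
    "phases": [{"components": []}],
}

print(phase_shape_ok(procedure_pathway))  # accepted
print(phase_shape_ok(drugs_left_at_top))  # rejected
```

The second example is the "author moved structure but left drugs at top level" case, which the relaxed check still rejects.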
This was referenced May 18, 2026
Summary
Three small, orthogonal changes from an analysis-and-improvement session on `claude/angry-darwin-31b5d0`. Each commit stands alone and ships behind tests; routing semantics are unchanged.

1. feat(engine): warn on prose-shaped `condition:` strings (806361ef)

Algorithm decision trees contain 376 of 443 `condition:` strings (85%) written as English prose (`"ECOG PS 0-2"`, `"BRCA1 or BRCA2 pathogenic"`). `_eval_clause` only resolves flat finding keys, so these silently return False, and in 45 of 152 algorithms (30%) step-1 is entirely prose and the tree falls through to `default_indication` on every patient.

This PR adds a one-time-per-unique-string `logging.WARNING` when a `condition:` looks like prose AND the lookup missed. Flat ID-shape keys (`BIO-HER2`, `hcv_status`, `ECOG_PS`) are NOT flagged, so existing finding-key clauses keep working silently.

Detector heuristic: comparison operators, ` or ` / ` and ` connectives, parens/commas, "space + lowercase" or "ALLCAPS space ALLCAPS" word boundaries. Per-string dedup via module-level set with a test-only reset hook.

6 new tests in `tests/test_prose_condition_warning.py`: operator clause warns, boolean-connective clause warns, ALL-CAPS flat key does NOT warn, real finding lookup does NOT warn, dedup fires once per unique string, structured threshold/value clauses unaffected.

2. docs(reviews): state-audit doc + tighten fallthrough count (in 806361e + 8cf8a1b)

`docs/reviews/openonco-state-audit-2026-05-17.md` covers scope, methodology, and roadmap staleness (3 items already done on master).

The "structured condition AST" path (recommendation 5) is explicitly out of scope here: it is a Big-P3 workstream gated by clinical co-lead review per CHARTER §6.1.

3. test(regimens): accept procedure-pathway shape (2c9003ad)

`test_no_regression_all_244_legacy_yamls_load` was failing on `reg_allohct_jmml.yaml`: 3 explicit phases, all `components: []`, top-level `components: []`. Existing invariant required `phase_drugs > 0` on every explicit-phase regimen.

This is the procedure-pathway shape: the "treatment" is the stem-cell product (or surgical procedure), documented via phase purpose strings, not `DRUG-*` refs. The other 4 allohct regimens currently hack around the invariant by parking a placeholder drug inside conditioning, which is not what the regimen actually is.

Relaxed: when explicit phases have 0 drug components, accept iff top-level `components` is also 0. If top-level has drugs but phases don't, that's still the "author moved structure but left drugs at top level" bug and the test still fails for it.

Engine downstream doesn't depend on `phase_drugs > 0` (the `render.py:1694` reference is to `MonitoringSchedule.phases`, a different concept). No clinical content edits.

What's NOT in this PR (deliberately, per CLAUDE.md scope rules)
Test plan
- `pytest tests/test_prose_condition_warning.py`: 6/6 pass
- `pytest tests/test_regimen_phases.py`: 14/14 pass (previously 13/14)
- Adjacent engine tests (`test_engine`, `test_redflag_*`, `test_actionability_*`, `test_algorithm_regimen_routing_contracts`, `test_bcc_engine`, `test_burkitt_engine`): all pass
- No edits to `knowledge_base/hosted/content/` (clinical content)
- No `git add -A` / `--no-verify`: explicit pathspecs only

🤖 Generated with Claude Code