Bug Description
In skills/skill-comply/scripts/grader.py, the _check_temporal_order function checks after_step constraints by looking only in the resolved dict (steps that have already been matched), but not in the classified dict (all LLM-classified events).
Root Cause
def _check_temporal_order(step, event, resolved, classified):
    if step.detector.after_step is not None:
        after_events = resolved.get(step.detector.after_step, [])
        if not after_events:
            return f"after_step '{step.detector.after_step}' not yet detected"
The resolved dict is populated sequentially as each step in spec.steps is processed. If an after_step reference points to a step that appears later in the spec ordering, resolved will not contain it yet, causing the check to always return a failure — even when the events in the trace are correctly ordered.
By contrast, the before_step check correctly falls back to the classified dict:
    if step.detector.before_step is not None:
        before_events = resolved.get(step.detector.before_step)
        if before_events is None:
            before_events = classified.get(step.detector.before_step, [])  # ← correct fallback
Impact
Any compliance spec where an after_step constraint references a step that is listed after the constrained step in spec.steps will produce incorrect grading results — steps will be marked as failing even when the trace events satisfy the temporal ordering.
Fix
Apply the same classified fallback used by before_step to after_step:
    if step.detector.after_step is not None:
        after_events = resolved.get(step.detector.after_step)
        if after_events is None:
            after_events = classified.get(step.detector.after_step, [])
        if not after_events:
            return f"after_step '{step.detector.after_step}' not yet detected"
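The difference between the two lookups can be reproduced in isolation. The sketch below is not the code from grader.py: `Detector`, `Step`, the step ids `deploy` and `run_tests`, and the event dict shape are hypothetical stand-ins, and only the `after_step` lookup is modelled (any later comparison of event positions is omitted).

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-ins for the grader's types; the real classes in
# grader.py may carry more fields.
@dataclass
class Detector:
    after_step: Optional[str] = None


@dataclass
class Step:
    id: str
    detector: Detector


def after_check_buggy(step, resolved, classified):
    """Lookup as described under Root Cause: consults only `resolved`."""
    if step.detector.after_step is not None:
        after_events = resolved.get(step.detector.after_step, [])
        if not after_events:
            return f"after_step '{step.detector.after_step}' not yet detected"
    return None


def after_check_fixed(step, resolved, classified):
    """Lookup with the proposed fallback to `classified`."""
    if step.detector.after_step is not None:
        after_events = resolved.get(step.detector.after_step)
        if after_events is None:
            after_events = classified.get(step.detector.after_step, [])
        if not after_events:
            return f"after_step '{step.detector.after_step}' not yet detected"
    return None


# "deploy" is listed first in the spec but must occur after "run_tests",
# which is listed later, so `resolved` is still empty when it is checked.
deploy = Step(id="deploy", detector=Detector(after_step="run_tests"))
resolved = {}
# Assumed event shape: the trace does contain a "run_tests" event.
classified = {"run_tests": [{"index": 0}], "deploy": [{"index": 1}]}

buggy_result = after_check_buggy(deploy, resolved, classified)
fixed_result = after_check_fixed(deploy, resolved, classified)
print(buggy_result)  # the "not yet detected" failure message
print(fixed_result)  # None: the lookup finds the classified events
```

A case like this could serve as a regression test for the fix, with the spec ordering deliberately reversed relative to the trace ordering.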
Files Affected
skills/skill-comply/scripts/grader.py