fix(grader): after_step constraint always fails when referencing a later spec step #1338

@kuishou68

Bug Description

In skills/skill-comply/scripts/grader.py, the _check_temporal_order function checks after_step constraints by looking only in the resolved dict (steps that have already been matched), but not in the classified dict (all LLM-classified events).

Root Cause

def _check_temporal_order(step, event, resolved, classified):
    if step.detector.after_step is not None:
        after_events = resolved.get(step.detector.after_step, [])
        if not after_events:
            return f"after_step '{step.detector.after_step}' not yet detected"

The resolved dict is populated sequentially as each step in spec.steps is processed. If after_step references a step that appears later in spec.steps, resolved does not contain it yet, so the lookup yields an empty list and the check always returns a failure, even when the events in the trace are correctly ordered.

By contrast, the before_step check correctly falls back to the classified dict:

if step.detector.before_step is not None:
    before_events = resolved.get(step.detector.before_step)
    if before_events is None:
        before_events = classified.get(step.detector.before_step, [])   # ← correct fallback
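The asymmetry between the two lookups can be reproduced in isolation. The following is a minimal sketch, not the grader's actual code: the helper names `lookup_after_current` and `lookup_before_current`, the step ids, and the use of integer event indices are all assumptions made for illustration. Only the `resolved`/`classified` lookup pattern is taken from the excerpts above.

```python
from typing import Dict, List

# Hypothetical stand-in: step ids mapped to event indices in the trace.
Events = Dict[str, List[int]]


def lookup_after_current(step_id: str, resolved: Events, classified: Events) -> List[int]:
    """Mirrors the after_step lookup quoted above: consults resolved only."""
    return resolved.get(step_id, [])


def lookup_before_current(step_id: str, resolved: Events, classified: Events) -> List[int]:
    """Mirrors the before_step lookup quoted above: resolved first, then classified."""
    events = resolved.get(step_id)
    if events is None:
        events = classified.get(step_id, [])
    return events


# "run_tests" has been classified, but it is listed later in spec.steps,
# so it has not been added to resolved when the current step is checked.
classified: Events = {"write_code": [0], "run_tests": [1]}
resolved: Events = {}

after_hits = lookup_after_current("run_tests", resolved, classified)
before_hits = lookup_before_current("run_tests", resolved, classified)

print(after_hits)   # [] -> reported as "not yet detected"
print(before_hits)  # [1]
```

With identical inputs, the resolved-only lookup finds nothing while the lookup with the fallback finds the classified event.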

Impact

Any compliance spec in which an after_step constraint references a step listed after the constrained step in spec.steps produces incorrect grading results: the constrained step is marked as failing even when the trace events satisfy the temporal ordering.

Fix

Apply the same classified fallback used by before_step to after_step:

if step.detector.after_step is not None:
    after_events = resolved.get(step.detector.after_step)
    if after_events is None:
        after_events = classified.get(step.detector.after_step, [])
    if not after_events:
        return f"after_step '{step.detector.after_step}' not yet detected"
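The patched lookup can be exercised on its own with a small sketch. The `Detector` and `Step` dataclasses and the function name `check_after_step` are hypothetical stand-ins that carry only the fields used here; they are not the real types in grader.py. The timestamp comparison that presumably follows in `_check_temporal_order` is omitted because the issue does not quote it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Detector:
    # Hypothetical stand-in: only the field this check reads.
    after_step: Optional[str] = None


@dataclass
class Step:
    # Hypothetical stand-in for a spec step.
    id: str
    detector: Detector


def check_after_step(
    step: Step,
    resolved: Dict[str, List[str]],
    classified: Dict[str, List[str]],
) -> Optional[str]:
    """Return an error string if the referenced step has no events, else None."""
    if step.detector.after_step is None:
        return None
    after_events = resolved.get(step.detector.after_step)
    if after_events is None:
        # Referenced step not resolved yet: fall back to the classified events.
        after_events = classified.get(step.detector.after_step, [])
    if not after_events:
        return f"after_step '{step.detector.after_step}' not yet detected"
    return None


step = Step(id="deploy", detector=Detector(after_step="run_tests"))

# Forward reference: "run_tests" is classified but not yet in resolved.
forward_ref = check_after_step(step, resolved={}, classified={"run_tests": ["event-1"]})

# Genuinely absent: the referenced step appears in neither dict.
missing = check_after_step(step, resolved={}, classified={})
```

Here `forward_ref` is `None` (no error), while `missing` still carries the "not yet detected" message. As in the before_step branch, the fallback triggers only when the key is absent from resolved; a key that is present with an empty list is still treated as not detected.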

Files Affected

  • skills/skill-comply/scripts/grader.py
