test(e2e): run onboarding assertions in scenario runner by cv · Pull Request #4657 · NVIDIA/NemoClaw

cv · 2026-06-02T07:35:29Z

Summary

Runs declared YAML onboarding assertions from the shell scenario runner after setup/onboarding and before expected-state validation. Dry-run mode traces and reports the assertions without executing live assertion scripts.

Related Issue

Refs #3588.

Changes

Resolve onboarding_assertions from plan.json and nemoclaw_scenarios/scenarios.yaml inside runtime/run-scenario.sh.
Execute positive and negative preflight onboarding assertions after setup/onboarding and before expected-state validation.
Add dry-run tracing and regression expectations for the baseline OpenClaw scenario.

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
npm run docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Additional verification run:

bash test/e2e-scenario/runtime/run-scenario.sh ubuntu-repo-cloud-openclaw --dry-run
E2E_CONTEXT_DIR=$(mktemp -d) bash test/e2e-scenario/runtime/run-scenario.sh ubuntu-no-docker-preflight-negative --dry-run
npx vitest run --project e2e-scenario-framework test/e2e-scenario/framework-tests/e2e-scenario-first-migration.test.ts test/e2e-scenario/framework-tests/e2e-scenario-resolver.test.ts --silent=false --reporter=default
npx prek run --files test/e2e-scenario/runtime/run-scenario.sh test/e2e-scenario/framework-tests/e2e-scenario-first-migration.test.ts

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

copy-pr-bot · 2026-06-02T07:35:33Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

coderabbitai · 2026-06-02T07:35:36Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5ac1f7ab-9b24-4068-bed5-761f78bdb168

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

github-actions · 2026-06-02T07:39:35Z

E2E Advisor Recommendation

Required E2E: e2e-scenarios/ubuntu-repo-cloud-openclaw, e2e-scenarios/ubuntu-no-docker-preflight-negative, e2e-scenarios/ubuntu-invalid-nvidia-key-negative
Optional E2E: e2e-scenarios/ubuntu-gateway-port-conflict-negative, e2e-scenarios/ubuntu-repo-cloud-hermes

Dispatch hint: ubuntu-repo-cloud-openclaw,ubuntu-no-docker-preflight-negative,ubuntu-invalid-nvidia-key-negative

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/ci/e2e-scenario-dry-run-summary
Head: HEAD
Confidence: medium

Required E2E

e2e-scenarios/ubuntu-repo-cloud-openclaw (medium): Primary canonical Ubuntu repo + Docker + cloud OpenClaw scenario. It is the scenario explicitly updated in the framework test and should verify onboarding assertion ordering around install, onboarding, gateway, sandbox, and dry-run expected-state validation.
e2e-scenarios/ubuntu-no-docker-preflight-negative (medium): Covers the preflight expected-failure path where run_onboarding_assertions is now called after forced Docker/preflight failure and before failure matching exits.
e2e-scenarios/ubuntu-invalid-nvidia-key-negative (medium): Covers an onboarding expected-failure scenario using the expected_failure branch now affected by onboarding assertion execution and side-effect checks.

Optional E2E

e2e-scenarios/ubuntu-gateway-port-conflict-negative (medium): Additional negative onboarding confidence for a different failure class and side-effect boundary around gateway startup/port conflicts.
e2e-scenarios/ubuntu-repo-cloud-hermes (medium): Useful adjacent coverage that the shared onboarding assertion hook does not regress the Hermes cloud onboarding scenario family.

New E2E recommendations

shell-scenario-runner-coverage (high): The existing scenario workflow dispatches test/e2e-scenario/scenarios/run.ts --dry-run, while this PR directly changes test/e2e-scenario/runtime/run-scenario.sh. Add a workflow-dispatched E2E job for the shell runner so future changes to run-scenario.sh are covered outside Vitest framework tests.
- Suggested test: Add an E2E workflow/job that runs: bash test/e2e-scenario/runtime/run-scenario.sh ubuntu-repo-cloud-openclaw --dry-run and bash test/e2e-scenario/runtime/run-scenario.sh ubuntu-no-docker-preflight-negative --dry-run, uploading E2E_CONTEXT_DIR artifacts on failure.
live-onboarding-assertions (medium): Dry-run validates ordering and trace output, but live assertion scripts can fail due to PATH, credentials, sandbox, or preflight state differences. A small live smoke would catch integration bugs in assertion script invocation.
- Suggested test: Add a live Ubuntu repo cloud OpenClaw shell-runner smoke that executes onboarding assertions before suite dispatch, gated on NVIDIA_API_KEY and using existing cleanup/artifact conventions.

Dispatch hint

Workflow: .github/workflows/e2e-scenarios.yaml
jobs input: ubuntu-repo-cloud-openclaw,ubuntu-no-docker-preflight-negative,ubuntu-invalid-nvidia-key-negative

github-actions · 2026-06-02T07:39:36Z

E2E Scenario Advisor Recommendation

Required scenario E2E: e2e-scenarios-all
Optional scenario E2E: None

Dispatch required scenario E2E:

gh workflow run e2e-scenarios-all.yaml --ref <pr-head-ref>

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/ci/e2e-scenario-dry-run-summary
Head: HEAD
Confidence: high

Required scenario E2E

e2e-scenarios-all: Scenario runtime runner code changed in test/e2e-scenario/runtime/run-scenario.sh, adding onboarding assertion execution across normal and negative scenario paths. Runtime/runner changes can affect every scenario, so the full scenario fan-out is required.
- Dispatch: gh workflow run e2e-scenarios-all.yaml --ref <pr-head-ref>

Optional scenario E2E

None.

Relevant changed files

test/e2e-scenario/framework-tests/e2e-scenario-first-migration.test.ts
test/e2e-scenario/runtime/run-scenario.sh

github-actions · 2026-06-02T07:39:50Z

PR Review Advisor

Findings: 0 needs attention, 4 worth checking, 0 nice ideas
Top item: Constrain onboarding assertion script execution to trusted paths

Review findings

🛠️ Needs attention

None.

🔎 Worth checking

Source-of-truth review needed: Onboarding assertion script resolution in run-scenario.sh: The advisor marked localized patch analysis as needs_followup.
- Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
- Evidence: `run_onboarding_assertions()` parses `plan.json` for IDs, parses `scenarios.yaml` for scripts, and then executes the resulting path.
Constrain metadata-driven assertion script execution (test/e2e-scenario/runtime/run-scenario.sh:275): The new runner resolves an assertion script from repository YAML and executes `bash "${full_path}"` after prefixing it with `E2E_ROOT`. The value is quoted, so this is not direct shell metacharacter injection, but the execution point does not normalize the path or enforce that it remains under an expected directory such as `test/e2e-scenario/onboarding_assertions/`. It also re-reads `scenarios.yaml` instead of executing already-resolved assertion metadata from `plan.json`, leaving the runner with a second source of truth for script paths.
- Recommendation: Before executing, validate assertion IDs and script paths against a narrow allowlist: normalize/realpath the path, reject absolute paths and `..`, require the resolved file to be under the onboarding assertions directory, and consider having the resolver emit `{id, script, assertion_id}` into `plan.json` so the runner executes the resolved plan rather than re-parsing YAML.
- Evidence: `run_onboarding_assertions()` loads `scenarios?.onboarding_assertions?.[id]`, reads `assertion.script`, builds `full_path="${E2E_ROOT}/${script}"`, checks only `-f`, and then runs `bash "${full_path}"`.
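The containment check recommended above could look roughly like the following. This is a minimal sketch, not code from the PR: `resolve_assertion_script` and its three parameters are hypothetical, the allowed directory is assumed to be passed relative to `E2E_ROOT`, and `realpath -e` assumes GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): resolve an assertion script path
# and refuse anything that does not stay inside the allowed directory.
# Usage: resolve_assertion_script <root> <allowed_dir> <script>
# Prints the canonical path and returns 0 on success; returns 1 otherwise.
resolve_assertion_script() {
  local root="$1" allowed_dir="$2" script="$3"
  local base resolved

  # Reject absolute paths before touching the filesystem.
  case "${script}" in
    /*) return 1 ;;
  esac

  # Reject any ".." path component.
  case "/${script}/" in
    */../*) return 1 ;;
  esac

  # Canonicalize both sides so a symlink cannot point outside the directory.
  # realpath -e fails when the path does not exist (GNU coreutils assumed).
  base="$(realpath -e -- "${root}/${allowed_dir}" 2>/dev/null)" || return 1
  resolved="$(realpath -e -- "${root}/${script}" 2>/dev/null)" || return 1

  # Require the canonical script path to sit under the canonical base.
  case "${resolved}" in
    "${base}"/*) ;;
    *) return 1 ;;
  esac

  # Only regular files are acceptable.
  [ -f "${resolved}" ] || return 1
  printf '%s\n' "${resolved}"
}
```

A caller in the runner might then use something like `script_path="$(resolve_assertion_script "${E2E_ROOT}" onboarding_assertions "${script}")" || exit 1` before `bash "${script_path}"`, so that a rejected path fails closed instead of being executed.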
Add negative and live-mode coverage for onboarding assertion execution (test/e2e-scenario/framework-tests/e2e-scenario-first-migration.test.ts:85): The committed test update verifies only the positive dry-run path where assertion scripts are intentionally skipped. The new behavior also has important live and failure semantics: unknown assertion IDs, missing or non-contained scripts, assertion script failures, and negative-preflight assertion execution should fail closed at the intended point.
- Recommendation: Add framework tests that exercise `run_onboarding_assertions()` through `run-scenario.sh` with a temporary or fixture plan/metadata path if needed: successful live execution of a harmless assertion, unknown assertion ID, missing script, rejected path traversal/non-contained script, script failure exit propagation, and the negative-preflight dry-run path claimed in the PR verification.
- Evidence: The added assertions check `== onboarding assertions ==` and dry-run PASS output for `ubuntu-repo-cloud-openclaw`; there is no committed test covering actual `bash "${full_path}"` execution or the error branches in `run_onboarding_assertions()`.
Live preflight assertion may false-fail on benign Docker/container log text (test/e2e-scenario/runtime/run-scenario.sh:475): This PR begins running the existing `preflight-passed` assertion in live positive scenarios. That assertion fails if `onboard.log` contains broad terms such as `docker`, `container`, `daemon`, or `socket`, even if the onboarding completed successfully and the log only contains informational text. The new call site makes that broad grep part of the live scenario gate.
- Recommendation: Tighten the assertion to match explicit failure/error patterns rather than standalone infrastructure terms, or base it on structured onboarding/preflight status where available. Add a regression test with a successful onboarding log that mentions Docker/container informationally.
- Evidence: `run-scenario.sh` now calls `run_onboarding_assertions()` after successful onboarding; `onboarding_assertions/preflight/00-preflight-passed.sh` fails on `grep -Eiq "preflight.*(fail|error)|docker|container|daemon|socket" "${E2E_CONTEXT_DIR}/onboard.log"`.
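The false-fail risk described above can be reproduced with a small sketch. The `broad` pattern is the one quoted in the evidence; the `narrow` pattern, the `log_flags_failure` helper, and the sample log lines are all invented for illustration and are not the project's fix. The narrower pattern assumes a failure word follows the infrastructure term on the same line, which real logs may not guarantee.

```shell
#!/usr/bin/env bash
# Pattern quoted in the finding: any mention of docker/container/daemon/socket matches.
broad='preflight.*(fail|error)|docker|container|daemon|socket'

# Hypothetical narrower pattern (an assumption): an infrastructure term only
# counts when a failure word appears later on the same line.
narrow='preflight.*(fail|error)|(docker|container|daemon|socket).*(fail|error|refused|not running|not found)'

# Returns 0 when any line of the log matches the pattern, case-insensitively.
log_flags_failure() {
  local pattern="$1" log="$2"
  grep -Eiq -- "${pattern}" "${log}"
}

# Invented sample: a successful onboarding log with informational Docker text.
benign_log="$(mktemp)"
printf '%s\n' 'preflight: Docker daemon detected' 'onboarding complete' > "${benign_log}"

if log_flags_failure "${broad}" "${benign_log}"; then
  echo "broad: flagged"
else
  echo "broad: clean"
fi

if log_flags_failure "${narrow}" "${benign_log}"; then
  echo "narrow: flagged"
else
  echo "narrow: clean"
fi

rm -f "${benign_log}"
```

With the invented benign log, the broad pattern flags the run and the narrower one does not. Basing the assertion on structured preflight status, as the recommendation also suggests, would avoid pattern matching on log text altogether.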

🌱 Nice ideas

None.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

wscurran · 2026-06-03T18:11:54Z

✨
Related open issues:

#3588 Implement layered E2E scenario model

test(e2e): run onboarding assertions in scenario runner

30b6d2a

cv self-assigned this Jun 2, 2026

wscurran added labels area: e2e (End-to-end tests, nightly failures, or validation infrastructure) and feature (PR adds or expands user-visible functionality) Jun 3, 2026

cv wants to merge 1 commit into ci/e2e-scenario-dry-run-summary from test/e2e-onboarding-assertions-runner

2 participants
