ci/security/test: verify strategy profile sync after deploy#139
Conversation
🤖 Codex PR ReviewPlease ensure a human reviewer checks this PR before merging. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 27e162e29e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if [ "$actual_count" = "$expected_count" ]; then | ||
| break |
There was a problem hiding this comment.
Verify profile content, not just count
When a push changes strategy-profiles.example.json without changing the number of entries, an old Worker during deployment propagation can still return the same strategy_profiles_count, so this exits the retry loop before the newly deployed Worker has written its bundled profiles. In worker.js:1092-1120 the sync writes DEFAULT_STRATEGY_PROFILES but this workflow only observes the count, and the later live check also compares only length, so the job can succeed while KV remains on stale profile metadata.
Useful? React with 👍 / 👎.
| if [ "$live_count" != "$expected_count" ]; then | ||
| echo "Live strategy profile count $live_count; expected $expected_count." >&2 | ||
| exit 2 |
There was a problem hiding this comment.
Retry the live KV read before failing
When the sync write succeeds but this immediate public read is served from an edge location with a cached strategy_profiles KV value, Cloudflare KV reads can remain eventually consistent for 60s+, so failing on the first live_count mismatch can mark a valid deploy as failed. The sync POST is retried for propagation, but the live /api/strategy-profiles verification is not, which makes this new release gate flaky after real profile-count changes.
Useful? React with 👍 / 👎.
| import sys | ||
|
|
||
| payload = json.load(open(sys.argv[1], encoding="utf-8")) | ||
| print(payload.get("strategy_profiles_count", "")) |
There was a problem hiding this comment.
Fail when the sync endpoint skips KV
If the deployed Worker lacks the STRATEGY_SWITCH_CONFIG binding, syncStrategyProfilesResponse returns HTTP 200 with ok: true, strategy_profiles_sync.skipped: true, and the bundled count (worker.js:1077-1088). Because this script extracts only strategy_profiles_count, it reports Strategy profile KV sync verified even though no KV write occurred and the public endpoint can still pass by serving the bundled fallback.
Useful? React with 👍 / 👎.
Summary
/api/strategy-profilesreturns the expected profile count after sync.Problems found
wrangler deploy, which can hit the previous Worker version during edge propagation.Fixes applied
strategy-profiles.example.json./api/internal/sync-strategy-profilesuntil the response count matches the bundle./api/strategy-profilescount does not match.Security impact
Architecture impact
Tests run
actionlintbash -non the extracted sync step script.venv/bin/python -m unittest python.tests.test_runtime_settings.RuntimeSettingsTest.test_strategy_switch_console_deploy_workflow_syncs_bundled_profiles -v.venv/bin/python -m unittest discover -s python/tests -vuvx ruff check pythonuvx ruff format --check pythongit diff --checkFailed or skipped checks with reasons
mainworkflow.Deployment notes
.github/workflows/deploy-strategy-switch-console.ymland its workflow regression test.Rollback plan
Manual follow-up checklist
Strategy profile KV sync verified with 19 profiles./api/strategy-profilesincludes the current bundle after deployment.