feat(testing): add cfn_handler.testing module with replay() and helpers by igorlg · Pull Request #24 · igorlg/cfn-handler

igorlg · 2026-05-22T06:11:39Z

Issue

No tracked issue — cfn-handler users currently have no first-class
way to unit-test their custom-resource handlers without reaching into
cfn_handler._internal/ (which the README explicitly says is unstable).
The legacy test_mode=True flag introduced during early v1 development
worked around this but accumulated known issues: mutable state on the
resource, an awkward sentinel string for the polling-defer case, and an
inability to test polling re-invocation because the marker keys are
never written to the event in test mode.

Planned in OpenSpec change add-testing-helpers.
Roadmap context: docs/ROADMAP.md (this is item #1
of the v1.x evolution; item #2 is better logging, item #3 is optional
idempotency).

Summary

Adds a public cfn_handler.testing module so users can unit-test
custom-resource handlers in-process: no HTTP, no boto3, no moto
setup. The headline API is CustomResource.replay(event, context=None)
which executes the full dispatch pipeline and returns a structured
Replay value. Polling-aware: a deferred replay mutates the event
with marker keys and returns Replay(status="DEFERRED"); a follow-up
replay() of the mutated event resumes through the poll handler —
without provisioning a real EventBridge rule.

Soft-deprecates the existing test_mode / last_response surface
(continues to work in v1.x; emits DeprecationWarning; removed in v2.0).
The internal test suite is migrated onto the new API in the same PR
to dogfood it.

Changes

src/cfn_handler/testing/                       NEW public module
├── __init__.py                                Re-exports __all__
├── py.typed
├── fixtures.py                                pytest11 entry point
└── _internal/
    ├── replay_result.py                       Replay frozen dataclass
    ├── runner.py                              run_replay() driver
    ├── event_factory.py                       make_event()
    ├── context_factory.py                     make_context()
    └── assertions.py                          assert_success / _failed / _deferred

src/cfn_handler/resource.py
├── + Transport / PollerProvision / PollerTeardown seam kwargs
│     (late-bound defaults; existing patches keep working)
├── + _replay_seams() context manager
├── + replay() public method
└── + DeprecationWarning when test_mode=True

src/cfn_handler/_internal/{response,poller}.py
└── + Transport, PollerProvision, PollerTeardown type aliases

pyproject.toml                                 + [pytest11] entry point
justfile                                       test-cov uses `coverage run`
.github/workflows/ci.yml                       same fix in CI
README.md                                      + Testing section

tests/                                         migrated + new
├── unit/test_resource.py                      ~104 refs migrated to replay()
├── unit/test_backstops.py                     test_mode-specific test moved
├── unit/test_polling_dispatch.py              sentinel test moved
├── unit/test_state_machine.py                 hypothesis tests migrated
├── unit/test_test_mode_deprecation.py         NEW (deprecation tests)
├── unit/test_transport_seam.py                NEW
├── unit/test_poller_seam.py                   NEW
├── unit/testing/                              NEW (4 files, 42 tests)
├── integration/test_replay_parity.py          NEW (production-vs-replay)
└── integration/test_fixture_discovery.py      NEW (pytest11 entry point)
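
The "late-bound defaults; existing patches keep working" note on the seam kwargs can be illustrated with a small sketch. This is not cfn-handler's code: `fake_module`, `default_send`, and `MiniResource` are hypothetical stand-ins, and the only idea borrowed from the PR is that the default transport is looked up at call time rather than captured in the constructor.

```python
import types
from unittest import mock

sent = []


def default_send(url, payload):
    """Hypothetical default transport; records the call instead of doing HTTP."""
    sent.append(("default", url, payload))


# Hypothetical stand-in for the module that owns the default transport.
fake_module = types.SimpleNamespace(send_response=default_send)


class MiniResource:
    def __init__(self, transport=None):
        # Store None, not the default itself, so the default is resolved
        # on each call rather than frozen at construction time.
        self._transport = transport

    def respond(self, url, payload):
        transport = (
            self._transport
            if self._transport is not None
            else fake_module.send_response  # late-bound lookup
        )
        transport(url, payload)


resource = MiniResource()  # built BEFORE any patch is applied

patched_calls = []


def patched(url, payload):
    patched_calls.append((url, payload))


# A patch applied after construction is still honoured.
with mock.patch.object(fake_module, "send_response", patched):
    resource.respond("https://example.invalid", {"Status": "SUCCESS"})

# Outside the patch, the original default is used again.
resource.respond("https://example.invalid", {"Status": "FAILED"})
```

Had the constructor stored `fake_module.send_response` directly, the patch in the `with` block would have had no effect on `resource`.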

Public API surface (`cfn_handler.testing`)

| Name | Kind | Purpose |
| --- | --- | --- |
| `Replay` | frozen dataclass | Outcome of a replay (status / data / reason / no_echo / payload / ...) |
| `make_event(...)` | factory | Canonical CFN custom-resource event with safe defaults |
| `make_context(...)` | factory | Minimal `LambdaContext`-protocol object |
| `assert_success` | helper | Pass when status=SUCCESS, optionally match data / pid / no_echo |
| `assert_failed` | helper | Pass when status=FAILED, optionally match `reason_contains` |
| `assert_deferred` | helper | Pass when status=DEFERRED (replay-only sentinel) |
| `cfn_*_event` | pytest fixture | Auto-discovered via `pytest11` |
| `cfn_lambda_context` | pytest fixture | Auto-discovered via `pytest11` |

The module supports zero-pytest, zero-boto3 environments — the entry
point only loads inside a pytest run, and boto3 is never imported
on the non-polling path.
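
To make the "frozen dataclass plus assertion helper" shape in the table concrete, here is a minimal sketch. `MiniReplay` and `mini_assert_success` are hypothetical names that model only a subset of the fields listed above; the real `Replay` and `assert_success` may differ in fields, signatures, and message wording.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class MiniReplay:
    """Hypothetical, reduced model of a replay outcome."""

    status: str
    data: Dict[str, Any] = field(default_factory=dict)
    reason: Optional[str] = None


def mini_assert_success(replay, data=None):
    """Raise AssertionError with a descriptive message on mismatch."""
    if replay.status != "SUCCESS":
        raise AssertionError(
            f"expected status SUCCESS, got {replay.status!r} "
            f"(reason={replay.reason!r})"
        )
    if data is not None and replay.data != data:
        raise AssertionError(f"expected data {data!r}, got {replay.data!r}")
    return replay


ok = MiniReplay(status="SUCCESS", data={"x": 1})
mini_assert_success(ok, data={"x": 1})

# Frozen: the outcome cannot be mutated after the fact.
try:
    ok.status = "FAILED"
    frozen = False
except FrozenInstanceError:
    frozen = True

# A failing replay surfaces its reason in the assertion message.
try:
    mini_assert_success(MiniReplay(status="FAILED", reason="boom"))
    failure_message = None
except AssertionError as exc:
    failure_message = str(exc)
```

An immutable result avoids the "reset `last_response` between assertions" hazard that the PR attributes to the legacy `test_mode` surface.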

One-line example

from cfn_handler import CustomResource
from cfn_handler.testing import assert_success, make_event

resource = CustomResource()

@resource.create
def on_create(event, ctx):
    return {"Endpoint": "https://x.example"}

assert_success(resource.replay(make_event()), data={"Endpoint": "https://x.example"})

Migration for users on `test_mode`

Mechanical translation:

# Before (still works in v1.x, emits DeprecationWarning)
resource = CustomResource(test_mode=True)
resource(event, ctx)
assert resource.last_response["Status"] == "SUCCESS"
assert resource.last_response["Data"] == {"x": 1}

# After
from cfn_handler.testing import assert_success

resource = CustomResource()
replay = resource.replay(event, ctx)
assert replay.status == "SUCCESS"
assert replay.data == {"x": 1}
# or, more idiomatically:
assert_success(replay, data={"x": 1})

The polling case improves materially: where test_mode only wrote a
{"__cfn_handler_polling__": True} sentinel and short-circuited the
provisioning, replay() now mutates the event with the real marker
keys, so a follow-up replay() correctly resumes through the poll
handler — letting users test both halves of a polled lifecycle without
moto.
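
The defer-then-resume flow described above can be modelled with a toy dispatcher. This is only a sketch of the idea that the event carries the marker: `MARKER`, `mini_replay`, `on_create`, and `on_poll` are hypothetical, and the real marker keys and handler registration in cfn-handler are not shown in this PR description.

```python
MARKER = "_mini_poll_attempt"  # hypothetical marker key, not the library's


def mini_replay(event, on_create, on_poll):
    """Dispatch once; mutate `event` with a marker when the work is deferred."""
    # A marker on the event means a previous dispatch deferred the work.
    handler = on_poll if MARKER in event else on_create
    result = handler(event)
    if result is None:  # handler signals "not finished yet"
        event[MARKER] = event.get(MARKER, 0) + 1
        return "DEFERRED", None
    return "SUCCESS", result


def on_create(event):
    return None  # long-running work started, nothing to report yet


def on_poll(event):
    return {"Endpoint": "https://x.example"}


event = {"RequestType": "Create"}
first = mini_replay(event, on_create, on_poll)   # defers and marks the event
second = mini_replay(event, on_create, on_poll)  # resumes via the poll handler
```

Because the marker lives on the event rather than in a sentinel on the resource, replaying the same event object a second time is enough to reach the poll handler.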

Tests

New tests added for new behaviour (~50 across replay, factories, assertions, fixtures, parity)
Existing tests still pass after migration onto the new API (~100 sites translated)
just test-cov — 155 passing, 98% line+branch coverage (gate: 95%)
just typecheck — mypy strict + pyright strict, 16 source files, no issues
just lint — ruff + ruff format + cfn-lint, all checks passed
just gha-pre-release — all locally-replayable gating jobs green (CodeQL skipped under act due to its post-analysis REST API call against a synthesized run id; validated by real GH Actions on this PR — see commit ac19152)
Manual smoke test in a fresh venv outside the repo: replay(make_event(), make_context()) returns SUCCESS, fixtures auto-discover

Coverage measurement note

Switched test-cov to coverage run -m pytest (from pytest --cov) in
both justfile and ci.yml. The new pytest11 entry point causes
pytest to import cfn_handler.testing.fixtures (and transitively
cfn_handler itself) at plugin-collection time — before pytest-cov
attaches its instrumentation hooks. Module-level code in __init__.py
then runs uninstrumented and the report shows artificial 0% coverage on
those lines, dropping aggregate to ~68%. coverage run -m pytest
attaches the tracer before any imports. Documented in both the recipe
and the workflow.
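
The ordering problem can be reproduced in miniature with `sys.settrace`. This is an illustrative sketch, not how pytest-cov or coverage.py are implemented: the two compiled snippets stand in for module-level code that is imported before and after a tracer is installed, and the filenames `<imported-early>` and `<imported-late>` are made up for the demonstration.

```python
import sys

SOURCE = "x = 1\ny = x + 1\n"
seen = []


def tracer(frame, event, arg):
    # Record every executed line, keyed by the code object's filename.
    if event == "line":
        seen.append((frame.f_code.co_filename, frame.f_lineno))
    return tracer


early = compile(SOURCE, "<imported-early>", "exec")
late = compile(SOURCE, "<imported-late>", "exec")

# "Imported" before the tracer exists: runs, but leaves no record.
exec(early, {})

previous = sys.gettrace()
sys.settrace(tracer)
try:
    # "Imported" after the tracer is attached: its lines are recorded.
    exec(late, {})
finally:
    sys.settrace(previous)

early_lines = [line for name, line in seen if name == "<imported-early>"]
late_lines = [line for name, line in seen if name == "<imported-late>"]
```

The early snippet executed just as fully as the late one, yet a report built from `seen` would show it as never run, which is the same shape as the artificial 0% on `__init__.py` described above.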

Breaking changes?

No. v1.3 ships the new replay() API as purely additive. The
test_mode flag and last_response attribute continue to function
exactly as before — they just emit a DeprecationWarning directing
users to the new API. Removal is scheduled for v2.0 (separate change,
no timeline).

The internal test suite is migrated as part of this PR, but that's
a project-internal concern, not a breaking change for downstream
consumers.

Checklist

Conventional Commits prefix in the PR title (feat(testing):)
CHANGELOG entry will be generated automatically by release-please
Documentation updated (README "Testing" section, docstrings on every public name in cfn_handler.testing, docs/ROADMAP.md updated to reflect "shipped in v1.3.0")
OpenSpec change opened (add-testing-helpers, validates --strict)

Commit-by-commit story (for review)

The squash-merge will collapse these, but each is reviewable on its own
and leaves the suite in a green state at every checkpoint:

chore(openspec): propose v1.3 testing helpers + roadmap — plan only
feat(testing): add cfn_handler.testing module with replay() and helpers — the implementation
feat(resource): deprecate test_mode and last_response in favour of replay() — DeprecationWarning + tests + temporary filterwarnings ignore for the migration window
test: migrate existing test suite from test_mode to replay() — bulk translation, removes the temporary filter
docs(readme): add Testing section showing replay() workflow
ci(justfile): skip CodeQL in gha-pre-release (act/CodeQL incompatibility) — tangential discovery, kept on the same branch since it directly affects validation of this PR

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ac19152e16


chatgpt-codex-connector · 2026-05-22T06:18:34Z

-                polling_interval_minutes=self._polling_interval_minutes,
-            )
+            provision = self._provision_poller if self._provision_poller is not None else setup_polling
+            provision(event, context.function_name, self._polling_interval_minutes, None)


Preserve legacy poller callable arity in provisioning call

The provisioning path now always calls provision(..., None) with four positional arguments. Any existing monkey-patched setup_polling (or custom callable) that still uses the prior 3-argument shape will raise TypeError; this exception is not handled in this block (only CfnHandlerError is), so dispatch falls into the generic backstop and returns an internal-error failure instead of the expected polling behavior. Using keyword args (or omitting the explicit region when not needed) avoids this regression.
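
The arity concern can be reproduced with a small sketch. The functions below are hypothetical: `legacy_setup_polling` stands for any pre-existing 3-argument stub, and the keyword name `region` is an assumption based on the review's wording, not a confirmed parameter name.

```python
def legacy_setup_polling(event, function_name, interval_minutes):
    """Hypothetical stub written against the earlier 3-argument shape."""
    return ("provisioned", function_name, interval_minutes)


def provision_positional(provision, event, function_name, interval, region=None):
    # Mirrors the call shape in the quoted diff: always four positionals.
    return provision(event, function_name, interval, region)


def provision_tolerant(provision, event, function_name, interval, region=None):
    # One option suggested by the review: pass the region by keyword,
    # and only when it is actually set.
    kwargs = {} if region is None else {"region": region}
    return provision(event, function_name, interval, **kwargs)


try:
    provision_positional(legacy_setup_polling, {}, "my-fn", 5)
    positional_error = None
except TypeError as exc:
    positional_error = exc  # the legacy stub rejects the 4th positional

tolerant_result = provision_tolerant(legacy_setup_polling, {}, "my-fn", 5)
```

With the tolerant call shape, a 3-argument stub keeps working as long as no region is supplied, which is the backwards-compatible case the review is asking for.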


Adds the OpenSpec change `add-testing-helpers` documenting:

* proposal.md - rationale, scope (replay(), Replay, factories, assertions, pytest fixtures), explicit non-goals (no integration helpers, no fluent builders, no async support), and the soft-deprecation path for the existing `test_mode` flag.
* design.md - architecture, the dispatch flow diagram, eight key decisions each with at least one rejected alternative, risks, and open questions reserved for the implementation phase.
* specs/testing-helpers/spec.md - 9 ADDED requirements with ~25 scenarios covering the full public contract.
* tasks.md - 13 phases / 43 verifiable tasks in TDD order.

Adds `docs/ROADMAP.md` capturing the wider library-surface evolution discussion: testing helpers (this change), better logging, optional idempotency module, typed events for v2.0, and the cfn-lint plugin parallel-track work. The Decision Log section records explicit non-goals (CDK construct, SFN polling, CFN macros, async handlers) with rationale.

This commit lands the plan only - no implementation. Subsequent commits in this PR implement it, ending with the test-suite migration.

Introduces a public testing surface so users can unit-test custom-resource handlers without HTTP, without boto3, and without reaching into `cfn_handler._internal/`.

Public API (`cfn_handler.testing`):

* `Replay` - frozen dataclass capturing the dispatch outcome (status / data / reason / no_echo / payload / request_type / physical_resource_id). The `status` field is one of "SUCCESS" / "FAILED" / "DEFERRED"; the last is a replay-only sentinel signalling "would have entered polling".
* `CustomResource.replay(event, context=None)` - drives the full dispatch pipeline in-process using internal seams to capture the response payload. No HTTP, no boto3 import. Polling is stubbed: a deferred replay mutates the event with marker keys and returns `Replay(status="DEFERRED")`; a follow-up `replay()` with the mutated event correctly resumes through the poll handler.
* `make_event(...)` / `make_context(...)` - factories with safe defaults (RFC 6761 `example.invalid`, AWS-reserved 111111111111 account ID). `make_event` enforces `physical_resource_id` for Update/Delete events.
* `assert_success` / `assert_failed` / `assert_deferred` - pytest-style assertion helpers with informative AssertionError messages on mismatch.
* pytest fixtures `cfn_create_event`, `cfn_update_event`, `cfn_delete_event`, `cfn_lambda_context` - auto-discovered via a new `pytest11` entry point. No `pytest_plugins` declaration required in user conftest.

Internal seams (private API, used by replay() but also available to power users via the constructor kwargs `transport=`, `provision_poller=`, `teardown_poller=`):

* `Transport` - `Callable[[str, dict], None]` replacing the default urllib PUT (`send_response`). Late-bound default lookup so existing tests that `patch("cfn_handler.resource.send_response")` continue to work.
* `PollerProvision` / `PollerTeardown` - mirror seams for the boto3-using polling provisioning/teardown calls. Late-bound for backwards compatibility with existing patches.
* `CustomResource._replay_seams(...)` - context manager that swaps all three seams atomically and restores them on exit (including exceptional return). Used by the runner; not part of the public API contract.

CI / tooling:

* `pyproject.toml`: registers the `pytest11` entry point.
* `justfile` + `.github/workflows/ci.yml`: switch `test-cov` from `pytest --cov` to `coverage run -m pytest`. The pytest11 entry point causes pytest to import `cfn_handler.testing.fixtures` (and transitively `cfn_handler`) during plugin collection, BEFORE the pytest-cov instrumentation hooks attach. Module-level code in `__init__.py` then runs uninstrumented and the report shows artificial 0% on those lines, dropping aggregate coverage to ~68%. `coverage run` ensures the tracer is active before any imports. Documented in both the recipe and the workflow step.

Tests added:

* `tests/unit/test_transport_seam.py` - the seam intercepts; default behaviour preserved.
* `tests/unit/test_poller_seam.py` - poller stubs work; boto3 is never imported during a stubbed deferral.
* `tests/unit/testing/test_replay.py` - SUCCESS/FAILED replays; `Replay` is frozen; no HTTP I/O; default context fallback.
* `tests/unit/testing/test_replay_polling.py` - DEFERRED status, event mutation, two-step deferral->resume flow.
* `tests/unit/testing/test_factories.py` - make_event / make_context with overrides + validation.
* `tests/unit/testing/test_assertions.py` - all helpers, both pass and fail paths, message contents on failure.
* `tests/integration/test_replay_parity.py` - same handler via `__call__` (moto + fake transport) vs `replay()` produce equivalent payloads.
* `tests/integration/test_fixture_discovery.py` - pytest11 entry point auto-discovers fixtures in a fresh subprocess project, fixture invocations are independent.

No breaking changes. The legacy `test_mode` flag continues to work. The deprecation of `test_mode` and `last_response` lands in the next commit.
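
The "swap all three seams and restore them on exit, including on exceptions" behaviour described for `_replay_seams` can be sketched with `contextlib.contextmanager`. The class and method names below are hypothetical, and the real implementation in `resource.py` is not shown in this PR description; only the save/swap/restore pattern is illustrated.

```python
from contextlib import contextmanager


class MiniResource:
    """Hypothetical resource holding three replaceable seams."""

    def __init__(self):
        self._transport = None
        self._provision_poller = None
        self._teardown_poller = None

    @contextmanager
    def swapped_seams(self, transport, provision_poller, teardown_poller):
        # Save all three first, so restoration never sees a half-swapped state.
        saved = (self._transport, self._provision_poller, self._teardown_poller)
        self._transport = transport
        self._provision_poller = provision_poller
        self._teardown_poller = teardown_poller
        try:
            yield self
        finally:
            # Runs on normal exit and when the body raises.
            (
                self._transport,
                self._provision_poller,
                self._teardown_poller,
            ) = saved


resource = MiniResource()
captured = []

try:
    with resource.swapped_seams(captured.append, lambda *a: None, lambda *a: None):
        resource._transport({"Status": "SUCCESS"})  # goes to the stub
        raise RuntimeError("handler blew up")
except RuntimeError:
    pass

restored = (
    resource._transport,
    resource._provision_poller,
    resource._teardown_poller,
)
```

After the `with` block, `restored` holds the original values even though the body raised.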

…play()

The legacy `test_mode=True` constructor flag and `last_response` capture attribute are superseded by the new `replay()` method shipped in the previous commit. They had known issues:

* Mutable state on the resource (tests must reset `last_response` between assertions or risk false positives).
* Sentinel string `__cfn_handler_polling__` for the polling-defer case lives on the public `last_response` surface with no type or documentation guarantees.
* Polling re-invocation can't be tested: `test_mode` short-circuits `setup_polling` entirely, so the marker keys never get added to the event and a follow-up dispatch can't be simulated.

This commit:

* Emits a `DeprecationWarning` from `CustomResource.__init__` when `test_mode=True`, with a message pointing at `replay()` and the `cfn_handler.testing` module. `stacklevel=2` so the warning surfaces at the user's call site.
* Updates the docstrings on `test_mode` (constructor parameter) and `last_response` (instance attribute) to mark them as deprecated and reference `replay()`.
* Adds `tests/unit/test_test_mode_deprecation.py` covering the warning, that `test_mode=False` does NOT warn, and that the legacy behaviour still functions verbatim (so existing user code keeps working in v1.x).

Removal is scheduled for v2.0 (separate change). The behaviour is unchanged in v1.3 - users see only the warning.

The internal test suite migrates onto `replay()` in the next commit; until that lands the project's own pytest `filterwarnings` would promote the warning to an error in tests that still use `test_mode=True`. To avoid artificial failures during the migration window, that single warning is filtered to `ignore` in pyproject.toml's `[tool.pytest.ini_options] .filterwarnings` block - reverted in the migration commit.
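
A minimal sketch of the deprecation pattern this commit describes, using only the standard `warnings` module. `MiniResource` and the message text are hypothetical; the PR does not quote the actual warning wording.

```python
import warnings


class MiniResource:
    """Hypothetical constructor that soft-deprecates a flag."""

    def __init__(self, test_mode=False):
        if test_mode:
            warnings.warn(
                "test_mode is deprecated; use replay() from cfn_handler.testing",
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's line
            )
        self.test_mode = test_mode


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure nothing is suppressed
    legacy = MiniResource(test_mode=True)   # warns
    modern = MiniResource(test_mode=False)  # does not warn

messages = [
    str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)
]
```

The flag still takes effect (`legacy.test_mode` is `True`), which matches the "continues to work, only warns" contract stated for v1.x.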

Bulk migration of the 100+ pre-existing references to `CustomResource(test_mode=True)` and `resource.last_response` over to the new `replay()` API. Verbatim translation:

    CustomResource(test_mode=True)                 → CustomResource()
    resource(event, ctx)                           → replay = resource.replay(event, ctx)
    resource.last_response["Status"] == "X"        → replay.status == "X"
    resource.last_response["Data"]                 → replay.data
    resource.last_response["Reason"]               → replay.reason
    resource.last_response["PhysicalResourceId"]   → replay.physical_resource_id
    resource.last_response["NoEcho"]               → replay.no_echo

Where the assertion shape matches, the migration uses the helpers `assert_success(replay, data=...)` / `assert_failed(replay, reason_contains=...)` for clearer intent and informative error messages.

Files migrated:

* tests/unit/test_resource.py - decorator registration tests, lifecycle dispatch, exception handling, PhysicalResourceId semantics, init_failure, log_level acceptance.
* tests/unit/test_backstops.py - the lone test_mode-specific test (`test_safe_teardown_skipped_in_test_mode`) was moved to `test_test_mode_deprecation.py` since it tests deprecated behaviour we keep working until v2.0.
* tests/unit/test_polling_dispatch.py - the `__cfn_handler_polling__` sentinel test (also deprecated-only behaviour) was moved to test_test_mode_deprecation.py.
* tests/unit/test_state_machine.py - hypothesis-driven invariant tests, all migrated to inspect `Replay` fields rather than `last_response`.

After this commit `grep -rn 'test_mode\|last_response' tests/` returns hits only from `test_test_mode_deprecation.py` (intentional, testing the deprecated path) and a single docstring mention in `test_resource.py`.

The temporary `filterwarnings` exception added in the previous commit is removed - any new test using `test_mode=True` would now fail the suite (as intended), forcing the test author to use the new API.

Coverage stays at 98%.

Adds a 'Testing your handlers' section between 'Examples' and 'Project status' that:

* Shows a minimal one-screen example: build a CustomResource, register a handler, call `replay(make_event())`, assert via `assert_success`.
* Lists the public testing surface (`replay`, `make_event`, `make_context`, the three assertion helpers, and the four auto-discovered pytest fixtures).
* Calls out the polling-deferral semantics: first `replay()` returns `Replay(status='DEFERRED')` and mutates the event; a second `replay()` resumes through the poll handler. Useful for testing both halves of a polled lifecycle without provisioning EventBridge rules.

The detailed contract lives in docstrings, the spec, and the roadmap doc; the README only carries enough context to nudge a new reader toward unit-testing their handlers from day one.

…ity)

The CodeQL GitHub Action's post-analysis step calls the GH REST API (`/repos/{owner}/{repo}/actions/runs/{run_id}`) for telemetry and status-page reporting. Under `act`, the synthesized GITHUB_RUN_ID doesn't exist on github.com, so the call 404s and the action sets the job status to JOB_STATUS_CONFIGURATION_ERROR even when:

* 174/174 queries loaded successfully
* 42/42 Python files extracted
* SARIF generated and post-processed
* Zero findings in our code

The result is gha-pre-release ALWAYS failing on the codeql step under act, which masks real failures in upstream steps.

Cleanest fix: drop CodeQL from the local replay sequence with a clear note about why, and rely on the real GH Actions run on every PR (which has actual access to the workflow-runs endpoint via GITHUB_TOKEN).

Changes:

* Recipe header documents the skip with rationale and a one-liner showing how to run CodeQL locally on demand for ad-hoc inspection. The user runs `act push` directly and inspects the SARIF; a configuration-error exit with a successfully-generated SARIF means no findings.
* Step counters renumbered from N/7 to N/6 throughout the recipe.
* Step 5 prints an explicit SKIPPED notice explaining the situation so the user isn't left wondering whether CodeQL ran.
* Final "safe to merge" message clarifies CodeQL still gates merge on the actual PR via real GH Actions.

Discovered while running gha-pre-release on the testing-helpers PR - CodeQL kept reporting failure despite analyzing cleanly.

Move `ReplayRequestType` and `ReplayStatus` into the `if TYPE_CHECKING:` block. They were imported at module scope but only referenced in annotations (resolved as strings via `from __future__ import annotations`) and in a string-form `cast("ReplayRequestType", ...)` call (per ruff's TC006 rule, which prefers the string form to keep type-only symbols out of the runtime import graph). CodeQL's py/unused-import sees the runtime import and the string-only references, and reasonably concludes the import is unused. Cleaner resolution: keep the imports type-only, expand the explanatory comment so future readers see why both static analyzers (CodeQL + ruff TC006) end up happy with this shape. The two CodeQL py/cyclic-import alerts on the surrounding `if TYPE_CHECKING:` block remain false positives — CodeQL does not model conditional imports, and the runtime cycle is broken by the lazy import in `CustomResource.replay` (resource.py:354). Both have been dismissed in the GitHub UI with rationale (alerts #15, #16).
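
The import shape this commit settles on can be shown in isolation. The sketch below uses `decimal.Decimal` as a stand-in for the type-only names, and `as_amount` is a hypothetical function. Annotations are quoted explicitly here, which has the same runtime effect as the `from __future__ import annotations` mentioned in the commit message: the name is never evaluated when the module runs.

```python
from typing import TYPE_CHECKING, cast

if TYPE_CHECKING:
    # Visible to mypy/pyright only; this import is skipped at runtime.
    from decimal import Decimal


def as_amount(value: object) -> "Decimal":
    # String-form cast: typing.cast never evaluates its first argument,
    # so the type-only name does not need to exist at runtime.
    return cast("Decimal", value)


result = as_amount("1.50")  # cast is a no-op at runtime

# The type-only name never entered the module's runtime namespace.
decimal_imported = "Decimal" in globals()
```

A tool that models only unconditional imports would flag a module-scope `from decimal import Decimal` here as unused, which is consistent with the CodeQL finding the commit describes.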

PR #24 (commit f0f9507) shipped the implementation: the `cfn_handler.testing` module with `replay()`, `Replay`, `make_event` / `make_context`, the three assertion helpers, and auto-discovered pytest fixtures, plus the soft-deprecation of `test_mode` / `last_response`.

This archive:

* Moves the change to `openspec/changes/archive/2026-05-25-add-testing-helpers/`, preserving proposal / design / tasks / spec for historical reference.
* Promotes the delta spec to a new top-level capability at `openspec/specs/testing-helpers/spec.md` (verbatim copy of the ADDED requirements + a fresh Purpose section).

`openspec validate testing-helpers --type spec --strict` passes. `openspec list` is now empty (no active changes).

github-advanced-security AI found potential problems May 22, 2026


Comment thread src/cfn_handler/resource.py Dismissed

Comment thread src/cfn_handler/testing/_internal/runner.py Fixed

Comment thread src/cfn_handler/testing/_internal/runner.py Fixed

chatgpt-codex-connector Bot reviewed May 22, 2026


github-advanced-security AI found potential problems May 22, 2026


Comment thread src/cfn_handler/testing/_internal/runner.py Dismissed

Comment thread src/cfn_handler/testing/_internal/runner.py Dismissed

igorlg added 7 commits May 22, 2026 16:28

igorlg force-pushed the feat/add-testing-helpers branch 2 times, most recently from a6c3e20 to 0f4cbf4 Compare May 22, 2026 06:34

igorlg merged commit f0f9507 into main May 22, 2026
16 checks passed

cfn-handler-release-bot Bot mentioned this pull request May 22, 2026

chore(main): release 1.3.0 #25

Merged

igorlg deleted the feat/add-testing-helpers branch May 22, 2026 06:36

igorlg mentioned this pull request May 25, 2026

chore(openspec): archive add-testing-helpers #26

Merged

3 tasks
