test: harden assertion shape across admin + auth + security specs#64
Open
awais786 wants to merge 4 commits into
Open
test: harden assertion shape across admin + auth + security specs#64awais786 wants to merge 4 commits into
awais786 wants to merge 4 commits into
Conversation
Whole-suite review (docs/test-review-2026-06-10.md) found no Critical
shape bugs but a cluster of over-loose assertions that would pass on a
real-but-narrow regression. This tightens the seven highest-signal ones
and closes the two genuine per-app coverage soft spots (SurfSense and
Twenty admin distinctions).
- surfsense-admin: owner positive now also asserts the role-change
button is enabled (actionable), not just present — the owner/non-owner
distinction was leaning on a single getByRole("button") count.
- twenty-admin: admin positive no longer accepts a bare 2xx from
/admin-panel-graphql-api as proof of admin. GraphQL answers 200 even
on AdminPanelGuard denial (errors[] + null data); require a real data
payload with no errors.
- layer2-re-establish: record IDP-host hits during the reload and assert
zero. Previously an app that re-established by silently re-running the
full SSO/IDP bounce would land on-host and pass identically; the
contract is re-establishment FROM HEADERS, no IDP bounce.
- session-store-redis-backed: add a post-navigation split-form cookie
check (the genuinely falsifiable signal); the steady-state size-
equality check alone cannot grow an already-populated worker session.
- pm-godmode: pin the CSRF endpoint to exactly 200 instead of < 400 so a
3xx into the auth chain can't pass as a clean bypass.
- per-app-cookie-theft: for apps known to be cookie-stateful (PM,
Outline, Penpot, SurfSense) fail loud when zero session cookies are
found instead of skipping as "vacuously safe" — a renamed cookie was a
silent coverage hole. Twenty (localStorage-token auth) still skips.
No @SPEC tags changed; spec-coverage audit clean (0 missing).
SSO Spec Coverage AuditContract source:
All 88 spec requirements are accounted for. |
Contributor
There was a problem hiding this comment.
Pull request overview
This PR tightens several high-signal Playwright assertions across app/admin/auth/security specs to avoid “passes on narrow regressions,” and adds a written suite review documenting findings and rationale.
Changes:
- Harden per-app cookie theft/attribute specs by failing loudly when known cookie-stateful apps unexpectedly produce zero matching session cookies.
- Strengthen auth lifecycle and session-store checks (IDP-bounce detection on layer-2 re-establish; post-navigation split-cookie detection for Redis-backed session store).
- Tighten app-specific assertions (Twenty admin GraphQL signal requires real
dataand noerrors; SurfSense owner role-change control must be enabled; PM CSRF endpoint pinned to HTTP 200).
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tests/security/per-app-cookie-theft.spec.ts | Adds “known stateful apps” guard to prevent silent skips when session cookie patterns no longer match. |
| tests/auth/session-store-redis-backed.spec.ts | Adds post-navigation split-cookie check and refines the size-stability assertion messaging. |
| tests/auth/layer2-re-establish.spec.ts | Records and asserts zero IDP-host responses during reload to distinguish header-based re-establish from hidden re-login. |
| tests/apps/twenty-admin.spec.ts | Treats /admin-panel-graphql-api as an admin signal only when HTTP 200 includes non-null data and no errors. |
| tests/apps/surfsense-admin.spec.ts | Strengthens owner-positive assertion by requiring a role-change button to be enabled (actionable). |
| tests/apps/pm-godmode.spec.ts | Pins CSRF endpoint response to exactly 200 to prevent redirects passing as bypass. |
| docs/test-review-2026-06-10.md | Adds documented suite review capturing findings, severity, and follow-ups. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+117
to
+119
| // The SSO cookie value must stay byte-identical: a cookie-backed store | ||
| // would re-encode / refresh the claim payload as the per-app | ||
| // middlewares round-trip identity, changing the value length. |
added 3 commits
June 10, 2026 18:46
Whole-suite review (docs/test-review-2026-06-10.md) found no Critical shape bugs but a cluster of over-loose assertions that would pass on a real-but-narrow regression. Tighten the highest-signal ones; all changes validated against the live sandbox (headless). - surfsense-admin: owner positive also asserts the role-change button is enabled (actionable), not just present. - twenty-admin: admin positive no longer accepts a bare 2xx from /admin-panel-graphql-api as proof of admin (GraphQL answers 200 even on AdminPanelGuard denial); require a real data payload with no errors. - layer2-re-establish: record IDP-host hits during the reload and assert zero — an app that "re-establishes" by silently re-running the full SSO bounce would otherwise land on-host and pass identically. - session-store-redis-backed: add a post-navigation split-form cookie check (the genuinely falsifiable signal); make the cross-app nav loop tolerant of Twenty's documented net::ERR_ABORTED via waitUntil:commit. - pm-godmode: pin the CSRF endpoint to exactly 200 (not < 400) so a 3xx into the auth chain can't pass as a clean bypass. - per-app-cookie-theft: warm each app's session cookie into the browser jar (PM/Outline/Penpot set it on an authenticated API call, not the bare page load) so the theft + attribute assertions actually RUN — previously they silently skipped for PM/Outline. For known-stateful apps, fail loud when no session cookie is found. SurfSense + Twenty are excluded: both authenticate off the proxy-injected identity and set no per-app session cookie (verified against the live sandbox). No @SPEC tags changed; spec-coverage audit clean (0 missing).
Closes the "Partially covered" deferred gap on session-lifecycle#layer-2-session-renewal-shall-be-guarded-against-three-regression-paths. The renewal middleware (must NOT re-issue a session cookie on a 4xx) is a cross-app pattern; the test previously pinned only Penpot (auth-token). Parameterize over all three cookie-stateful apps — Penpot (auth-token), Outline (accessToken), Plane (session-id) — each provoking a clean 4xx through the renewal-owning middleware via an unknown RPC command / API method / route. All three pass against the live sandbox. Live-verified details baked in: Plane's cookie is `session-id` (hyphenated) and is set only after an authenticated API call, so its probe warms via /api/users/me/ first. Twenty (localStorage token) and SurfSense (proxy-header auth, no session cookie) are out of scope. Each app's 4xx is asserted as a pre-condition, so a non-4xx response fails loud rather than passing vacuously. Deferred-doc note moved from "Partially covered" to "Covered"; spec-coverage.md regenerated. No @SPEC tags changed; audit clean.
Installs deps + Playwright browsers and seeds .env from .env.example on first run. If npm is missing, installs Node via Homebrew when available, otherwise prints nvm / NPM-override guidance and exits. Idempotent: skips the .env copy when one already exists.
Comment on lines
+66
to
+74
| // Warm the per-app session cookie INTO the browser context's jar. Unlike | ||
| // Penpot (whose SPA fires an authenticated RPC on page load), PM / | ||
| // Outline / SurfSense create their session only on an explicit | ||
| // authenticated API call and set the cookie on THAT response — a bare | ||
| // page load leaves only `_oauth2_proxy`. `context.request` shares the | ||
| // context cookie jar, so the session cookie lands where | ||
| // `context.cookies()` can see it. Without this warm, the theft + | ||
| // attribute assertions below silently skip for those three apps (the | ||
| // exact coverage hole the KNOWN_STATEFUL_APPS guard now fails loud on). |
Comment on lines
+125
to
+127
| // The SSO cookie value must stay byte-identical: a cookie-backed store | ||
| // would re-encode / refresh the claim payload as the per-app | ||
| // middlewares round-trip identity, changing the value length. |
| - `mPass-side session revocation SHALL be honoured on next refresh` — **Cognito-side** — requires admin API call to revoke a refresh token; not in CI scope today. | ||
| - `per-app session TTLs SHALL be uniformly configurable` — **Infra-only**. | ||
| - `Layer-2 session renewal SHALL be guarded against three regression paths` — **Partially covered** — `tests/auth/layer2-renewal-suppressed-on-4xx.spec.ts` pins the Penpot/`auth-token` case (4xx response must carry no fresh session cookie). Extending to Outline (`accessToken`) and Plane (`sessionid`) is the same shape with different per-app cookie names. | ||
| - `Layer-2 session renewal SHALL be guarded against three regression paths` — **Covered** — `tests/auth/layer2-renewal-suppressed-on-4xx.spec.ts` pins the 4xx-must-carry-no-fresh-session-cookie invariant across all three cookie-stateful apps: Penpot (`auth-token`), Outline (`accessToken`), and Plane (`session-id`), via an unknown RPC command / API method / route respectively. Twenty (localStorage token pair, no session cookie) and SurfSense (Django path mirrors Plane) are out of scope. The time-based "stale cookie past renewal threshold" axis remains a TTL-bundle gap. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Whole-suite review (docs/test-review-2026-06-10.md) found no Critical shape bugs but a cluster of over-loose assertions that would pass on a real-but-narrow regression. This tightens the seven highest-signal ones and closes the two genuine per-app coverage soft spots (SurfSense and Twenty admin distinctions).
No @SPEC tags changed; spec-coverage audit clean (0 missing).
Motivation / Background
This Pull Request has been created because:
Detail
This Pull Request changes:
Additional information
TIP: Provide additional information such as screenshots, benchmarks, reference to other repositories or alternative solutions
Checklist
Before submitting the PR make sure the following are checked:
[Fix #issue-number]