Skip to content

test: harden assertion shape across admin + auth + security specs#64

Open
awais786 wants to merge 4 commits into
mainfrom
test-shape-hardening
Open

test: harden assertion shape across admin + auth + security specs#64
awais786 wants to merge 4 commits into
mainfrom
test-shape-hardening

Conversation

@awais786

Copy link
Copy Markdown
Collaborator

Whole-suite review (docs/test-review-2026-06-10.md) found no Critical shape bugs but a cluster of over-loose assertions that would pass on a real-but-narrow regression. This tightens the seven highest-signal ones and closes the two genuine per-app coverage soft spots (SurfSense and Twenty admin distinctions).

  • surfsense-admin: owner positive now also asserts the role-change button is enabled (actionable), not just present — the owner/non-owner distinction was leaning on a single getByRole("button") count.
  • twenty-admin: admin positive no longer accepts a bare 2xx from /admin-panel-graphql-api as proof of admin. GraphQL answers 200 even on AdminPanelGuard denial (errors[] + null data); require a real data payload with no errors.
  • layer2-re-establish: record IDP-host hits during the reload and assert zero. Previously an app that re-established by silently re-running the full SSO/IDP bounce would land on-host and pass identically; the contract is re-establishment FROM HEADERS, no IDP bounce.
  • session-store-redis-backed: add a post-navigation split-form cookie check (the genuinely falsifiable signal); the steady-state size- equality check alone cannot grow an already-populated worker session.
  • pm-godmode: pin the CSRF endpoint to exactly 200 instead of < 400 so a 3xx into the auth chain can't pass as a clean bypass.
  • per-app-cookie-theft: for apps known to be cookie-stateful (PM, Outline, Penpot, SurfSense) fail loud when zero session cookies are found instead of skipping as "vacuously safe" — a renamed cookie was a silent coverage hole. Twenty (localStorage-token auth) still skips.

No @SPEC tags changed; spec-coverage audit clean (0 missing).

Motivation / Background

This Pull Request has been created because:

  • Resolves #issue-id

Detail

This Pull Request changes:

Additional information

TIP: Provide additional information such as screenshots, benchmarks, reference to other repositories or alternative solutions

Checklist

Before submitting the PR make sure the following are checked:

  • This Pull Request is related to one change. Changes that are unrelated should be opened in separate PRs
  • Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
  • Tests are added or updated if you fix a bug or add a feature
  • CHANGELOG files are updated for the behavior change or additional feature (minor bug fixes and documentation changes should not be included)
  • This PR contains API changes and API documentation is updated accordingly (for critical or behavior change, please inform related parties about them).

Whole-suite review (docs/test-review-2026-06-10.md) found no Critical
shape bugs but a cluster of over-loose assertions that would pass on a
real-but-narrow regression. This tightens the seven highest-signal ones
and closes the two genuine per-app coverage soft spots (SurfSense and
Twenty admin distinctions).

- surfsense-admin: owner positive now also asserts the role-change
  button is enabled (actionable), not just present — the owner/non-owner
  distinction was leaning on a single getByRole("button") count.
- twenty-admin: admin positive no longer accepts a bare 2xx from
  /admin-panel-graphql-api as proof of admin. GraphQL answers 200 even
  on AdminPanelGuard denial (errors[] + null data); require a real data
  payload with no errors.
- layer2-re-establish: record IDP-host hits during the reload and assert
  zero. Previously an app that re-established by silently re-running the
  full SSO/IDP bounce would land on-host and pass identically; the
  contract is re-establishment FROM HEADERS, no IDP bounce.
- session-store-redis-backed: add a post-navigation split-form cookie
  check (the genuinely falsifiable signal); the steady-state size-
  equality check alone cannot grow an already-populated worker session.
- pm-godmode: pin the CSRF endpoint to exactly 200 instead of < 400 so a
  3xx into the auth chain can't pass as a clean bypass.
- per-app-cookie-theft: for apps known to be cookie-stateful (PM,
  Outline, Penpot, SurfSense) fail loud when zero session cookies are
  found instead of skipping as "vacuously safe" — a renamed cookie was a
  silent coverage hole. Twenty (localStorage-token auth) still skips.

No @SPEC tags changed; spec-coverage audit clean (0 missing).
@awais786 awais786 requested a review from Copilot June 10, 2026 10:23
@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown

SSO Spec Coverage Audit

Contract source: vendor/openspec/specs/ (snapshot vendored in this repo)

Status Count
✅ Covered (test tag found) 65
⚠️ Deferred (in deferred doc) 23
❌ Missing (no tag, no defer) 0
Total requirements 88

All 88 spec requirements are accounted for.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR tightens several high-signal Playwright assertions across app/admin/auth/security specs to avoid “passes on narrow regressions,” and adds a written suite review documenting findings and rationale.

Changes:

  • Harden per-app cookie theft/attribute specs by failing loudly when known cookie-stateful apps unexpectedly produce zero matching session cookies.
  • Strengthen auth lifecycle and session-store checks (IDP-bounce detection on layer-2 re-establish; post-navigation split-cookie detection for Redis-backed session store).
  • Tighten app-specific assertions (Twenty admin GraphQL signal requires real data and no errors; SurfSense owner role-change control must be enabled; PM CSRF endpoint pinned to HTTP 200).

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
tests/security/per-app-cookie-theft.spec.ts Adds “known stateful apps” guard to prevent silent skips when session cookie patterns no longer match.
tests/auth/session-store-redis-backed.spec.ts Adds post-navigation split-cookie check and refines the size-stability assertion messaging.
tests/auth/layer2-re-establish.spec.ts Records and asserts zero IDP-host responses during reload to distinguish header-based re-establish from hidden re-login.
tests/apps/twenty-admin.spec.ts Treats /admin-panel-graphql-api as an admin signal only when HTTP 200 includes non-null data and no errors.
tests/apps/surfsense-admin.spec.ts Strengthens owner-positive assertion by requiring a role-change button to be enabled (actionable).
tests/apps/pm-godmode.spec.ts Pins CSRF endpoint response to exactly 200 to prevent redirects passing as bypass.
docs/test-review-2026-06-10.md Adds documented suite review capturing findings, severity, and follow-ups.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +117 to +119
// The SSO cookie value must stay byte-identical: a cookie-backed store
// would re-encode / refresh the claim payload as the per-app
// middlewares round-trip identity, changing the value length.
Awais Qureshi added 3 commits June 10, 2026 18:46
Whole-suite review (docs/test-review-2026-06-10.md) found no Critical
shape bugs but a cluster of over-loose assertions that would pass on a
real-but-narrow regression. Tighten the highest-signal ones; all changes
validated against the live sandbox (headless).

- surfsense-admin: owner positive also asserts the role-change button is
  enabled (actionable), not just present.
- twenty-admin: admin positive no longer accepts a bare 2xx from
  /admin-panel-graphql-api as proof of admin (GraphQL answers 200 even on
  AdminPanelGuard denial); require a real data payload with no errors.
- layer2-re-establish: record IDP-host hits during the reload and assert
  zero — an app that "re-establishes" by silently re-running the full SSO
  bounce would otherwise land on-host and pass identically.
- session-store-redis-backed: add a post-navigation split-form cookie
  check (the genuinely falsifiable signal); make the cross-app nav loop
  tolerant of Twenty's documented net::ERR_ABORTED via waitUntil:commit.
- pm-godmode: pin the CSRF endpoint to exactly 200 (not < 400) so a 3xx
  into the auth chain can't pass as a clean bypass.
- per-app-cookie-theft: warm each app's session cookie into the browser
  jar (PM/Outline/Penpot set it on an authenticated API call, not the
  bare page load) so the theft + attribute assertions actually RUN —
  previously they silently skipped for PM/Outline. For known-stateful
  apps, fail loud when no session cookie is found. SurfSense + Twenty are
  excluded: both authenticate off the proxy-injected identity and set no
  per-app session cookie (verified against the live sandbox).

No @SPEC tags changed; spec-coverage audit clean (0 missing).
Closes the "Partially covered" deferred gap on
session-lifecycle#layer-2-session-renewal-shall-be-guarded-against-three-regression-paths.

The renewal middleware (must NOT re-issue a session cookie on a 4xx) is a
cross-app pattern; the test previously pinned only Penpot (auth-token).
Parameterize over all three cookie-stateful apps — Penpot (auth-token),
Outline (accessToken), Plane (session-id) — each provoking a clean 4xx
through the renewal-owning middleware via an unknown RPC command / API
method / route. All three pass against the live sandbox.

Live-verified details baked in: Plane's cookie is `session-id`
(hyphenated) and is set only after an authenticated API call, so its
probe warms via /api/users/me/ first. Twenty (localStorage token) and
SurfSense (proxy-header auth, no session cookie) are out of scope.

Each app's 4xx is asserted as a pre-condition, so a non-4xx response
fails loud rather than passing vacuously. Deferred-doc note moved from
"Partially covered" to "Covered"; spec-coverage.md regenerated. No @SPEC
tags changed; audit clean.
Installs deps + Playwright browsers and seeds .env from .env.example on
first run. If npm is missing, installs Node via Homebrew when available,
otherwise prints nvm / NPM-override guidance and exits. Idempotent: skips
the .env copy when one already exists.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

Comment on lines +66 to +74
// Warm the per-app session cookie INTO the browser context's jar. Unlike
// Penpot (whose SPA fires an authenticated RPC on page load), PM /
// Outline / SurfSense create their session only on an explicit
// authenticated API call and set the cookie on THAT response — a bare
// page load leaves only `_oauth2_proxy`. `context.request` shares the
// context cookie jar, so the session cookie lands where
// `context.cookies()` can see it. Without this warm, the theft +
// attribute assertions below silently skip for those three apps (the
// exact coverage hole the KNOWN_STATEFUL_APPS guard now fails loud on).
Comment on lines +125 to +127
// The SSO cookie value must stay byte-identical: a cookie-backed store
// would re-encode / refresh the claim payload as the per-app
// middlewares round-trip identity, changing the value length.
- `mPass-side session revocation SHALL be honoured on next refresh` — **Cognito-side** — requires admin API call to revoke a refresh token; not in CI scope today.
- `per-app session TTLs SHALL be uniformly configurable` — **Infra-only**.
- `Layer-2 session renewal SHALL be guarded against three regression paths` — **Partially covered** — `tests/auth/layer2-renewal-suppressed-on-4xx.spec.ts` pins the Penpot/`auth-token` case (4xx response must carry no fresh session cookie). Extending to Outline (`accessToken`) and Plane (`sessionid`) is the same shape with different per-app cookie names.
- `Layer-2 session renewal SHALL be guarded against three regression paths` — **Covered** — `tests/auth/layer2-renewal-suppressed-on-4xx.spec.ts` pins the 4xx-must-carry-no-fresh-session-cookie invariant across all three cookie-stateful apps: Penpot (`auth-token`), Outline (`accessToken`), and Plane (`session-id`), via an unknown RPC command / API method / route respectively. Twenty (localStorage token pair, no session cookie) and SurfSense (Django path mirrors Plane) are out of scope. The time-based "stale cookie past renewal threshold" axis remains a TTL-bundle gap.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants