End-to-end tests for the FOSS platform. See
docs/spec-coverage.md for the
requirement-by-requirement audit table. Highlights: SSO chain,
multi-app session sharing, cookie expiry bounds, session lifecycle
(logout / invalidation / replay / deletion), per-app link coverage, the
Plane god-mode admin escape hatch, Outline's admin /settings/* SSO-gating + role split,
Twenty's /settings/admin-panel URL gate, the full login → 5 apps → logout user journey, and the SSO-rule invariants from the vendored openspec (header spoofing, bypass discipline, security-header coverage on every router type, no local-login UI in SSO mode, cross-app identity consistency, HTTP plaintext lockdown).
The suite is environment-agnostic. One env var (FOSS_BASE_URL) drives
the entire host topology. Pointing at sandbox, staging, prod, or a local
devstack is a one-line .env change — no code edits.
Every contract-bearing test in the suite points at a written requirement. The audit is bidirectional and gated by CI:
- `scripts/check-spec-coverage.sh` walks requirements → tests: every `### Requirement:` line in `vendor/openspec/` must be tagged by at least one test, OR documented in `docs/spec-coverage-deferred.md`.
- The structural check in `tests/meta/playwright-practices.spec.ts` walks tests → requirements: every contract-bearing `*.spec.ts` must carry at least one `// @spec module#slug` tag pointing at a vendored requirement (small documented allowlist for shells / scan drivers).
Adding a test without a requirement → CI fails. Adding a requirement without a test → CI fails. Drift requires explicit human acceptance in both directions.
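The requirements → tests direction can be pictured as a tag scan. The following is a minimal sketch, assuming tags have the `// @spec module#slug` form described above; the function names `extractSpecTags` and `findUntaggedRequirements` and the in-memory inputs are hypothetical and do not reproduce the actual logic of `scripts/check-spec-coverage.sh`.

```typescript
// Hypothetical sketch: collect `// @spec module#slug` tags from test sources
// and report requirements that no test references.

/** Returns every `module#slug` tag found in one spec file's source text. */
function extractSpecTags(source: string): string[] {
  const tagPattern = /\/\/\s*@spec\s+([\w-]+#[\w-]+)/g;
  const tags: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(source)) !== null) {
    tags.push(match[1]);
  }
  return tags;
}

/** Requirements that have neither a tagging test nor a deferred-list entry. */
function findUntaggedRequirements(
  requirements: string[],
  testSources: string[],
  deferred: string[] = [],
): string[] {
  const tagged: { [tag: string]: boolean } = {};
  for (const source of testSources) {
    for (const tag of extractSpecTags(source)) {
      tagged[tag] = true;
    }
  }
  return requirements.filter(
    (req) => !tagged[req] && deferred.indexOf(req) === -1,
  );
}
```

A non-empty result from `findUntaggedRequirements` corresponds to the "requirement without a test" failure case. The reverse direction would check that `extractSpecTags` returns at least one tag per spec file.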
The audit gates presence, not shape. Whether a test's assertion
actually fails when the contract breaks is a human judgement call. Use
the Spec & Test Shape Review Checklist
before opening any PR that edits a ### Requirement: block or touches
an @spec-tagged test.
Today: 88 requirements, 0 missing — 52 from the SSO chain openspec,
36 from per-app admin + workspace-isolation + security-hardening skills.
Run make audit for live counts.
When CI is red, TRIAGE.md is the 2-minute "failure
pattern → cause → action" runbook.
All hosts derive from FOSS_BASE_URL.
Default topology is nested (FOSS_HOST_TOPOLOGY unset):
| Component | Host pattern | Sandbox value |
|---|---|---|
| Main portal | `foss.<domain>` | `foss.arbisoft.com` |
| Outline (Docs) | `docs.foss.<domain>` | `docs.foss.arbisoft.com` |
| Plane (PM) | `pm.foss.<domain>` | `pm.foss.arbisoft.com` |
| Penpot (Design) | `design.foss.<domain>` | `design.foss.arbisoft.com` |
| SurfSense (Research) | `research.foss.<domain>` | `research.foss.arbisoft.com` |
| Twenty (CRM) | `twenty.foss.<domain>` | `twenty.foss.arbisoft.com` |
| ForwardAuth proxy | `auth.foss.<domain>` | `auth.foss.arbisoft.com` |
In peer topology (FOSS_HOST_TOPOLOGY=peer), app/auth hosts derive as
<app>.<smb-domain> (for example docs.platform.askii.ai) while
FOSS_BASE_URL stays https://foss.<smb-domain>. Cookie scope is the
SMB domain (<smb-domain>), so one cookie still covers all app hosts.
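The two derivation modes can be illustrated with a short sketch. This is not the suite's `constants.ts`; `deriveHosts`, `HostMap`, and `APP_LABELS` are hypothetical names, and the sketch assumes the portal host always starts with the `foss.` label, as the patterns above suggest. The nested-mode cookie domain is also an assumption, since the text only states the cookie scope for peer topology.

```typescript
type Topology = "nested" | "peer";

// Subdomain labels taken from the host table above.
const APP_LABELS = ["docs", "pm", "design", "research", "twenty", "auth"];

interface HostMap {
  portal: string;
  cookieDomain: string;
  apps: { [label: string]: string };
}

function deriveHosts(baseUrl: string, topology: Topology = "nested"): HostMap {
  const portal = new URL(baseUrl).hostname; // e.g. foss.arbisoft.com
  // nested: apps hang under the portal host (docs.foss.<domain>).
  // peer:   apps are siblings of the portal (docs.<smb-domain>), so the
  //         leading "foss" label is dropped to get the parent domain.
  const parent =
    topology === "peer" ? portal.split(".").slice(1).join(".") : portal;
  const apps: { [label: string]: string } = {};
  for (const label of APP_LABELS) {
    apps[label] = `${label}.${parent}`;
  }
  // Assumption: the shared cookie is scoped to the parent of the app hosts.
  return { portal, cookieDomain: parent, apps };
}
```

For example, `deriveHosts("https://foss.platform.askii.ai", "peer").apps["docs"]` evaluates to `docs.platform.askii.ai`, matching the peer example above.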
npm install
npm run install:browsers
cp .env.example .env # then fill in FOSS_USER / FOSS_PASS
npm test

Required env (in .env):
FOSS_BASE_URL=https://foss.arbisoft.com
FOSS_USER=...
FOSS_PASS=...
Optional:
- `NORMAL_USER` / `NORMAL_PASS`: separate SSO identity ("User B" per sso-rules/admin.md) that has NOT been promoted to admin on any app. Required for the non-admin half of the role-split tests (`outline-admin`, `twenty-admin`); those blocks self-skip otherwise.
- `PLANE_ADMIN_USER` / `PLANE_ADMIN_PASS`: enables the god-mode admin sign-in + wrong-password tests (otherwise those self-skip). These are local Plane credentials, NOT SSO; Plane's god-mode bypasses oauth2-proxy entirely.
- `TWENTY_ADMIN_USER` / `TWENTY_ADMIN_PASS`: optional override for Twenty's positive admin-panel test. By default that test uses `FOSS_USER` / `FOSS_PASS`; set this pair only when Twenty admin rights live on a different identity in a given deployment.
Outline / Penpot / SurfSense / Twenty admin tests use the worker
fixture's identity (FOSS_USER) by default. Plane is intentionally
different: shared-workspace role checks in tests/apps/pm-admin.spec.ts
pin Member-only behavior unless that identity is explicitly promoted in
the target deployment.
- `BROWSERS=all`: chromium + firefox + webkit (default: chromium only)
- `FOSS_COGNITO_DOMAIN` / `FOSS_MPASS_DOMAIN`: IDP overrides (don't derive from base URL)
- `FOSS_HOST_TOPOLOGY`: host derivation mode, `nested` (default) or `peer`
See .env.example for everything.
npm test # all tests, chromium
npm run test:auth # tests/auth/ — SSO, sharing, lifecycle, identity
npm run test:apps # tests/apps/ — per-app + god-mode
npm run test:flows # tests/flows/ — login → 5 apps → logout
npm run test:security # tests/security/ — headers, plaintext, spoof, bypass
npm run test:all-browsers # full suite × chromium + firefox + webkit
npm run report # open last HTML report

Makefile shortcuts:
make help # list all targets
# Bulk runs (headless, same as CI)
make test # full suite, chromium
make test-auth # tests/auth/
make test-apps # tests/apps/
make test-flows # tests/flows/
make test-security # tests/security/
make test-all-browsers # chromium + firefox + webkit

Running one test locally: these three convenience targets default
to visible Chrome (headed) so you can watch the browser; pass
HEADED=0 to force headless. CI auto-detects $CI and runs headless
without needing the override.
| Target | Variable | When to use |
|---|---|---|
| `test-spec` | `SPEC=` (full path) | You're pasting a file path |
| `test-one` | `NAME=` (filename substring) | "Run the outline-admin tests" |
| `test-name` | `NAME=` (test name) | Paste the text after › from a CI failure line |
# Run a whole file (39 outline-admin tests)
make test-spec SPEC=tests/apps/outline-admin.spec.ts
make test-one NAME=outline-admin
# Run ONE test (by name — works across the whole suite)
make test-name NAME="admin reaches /settings/integrations"
# Narrow inside a file with --grep
make test-one NAME=outline-admin GREP="/settings/integrations"
# Force headless (rare locally; default in CI)
make test-name NAME="..." HEADED=0

Every test.skip(...) / raw.skip(...) in this suite is intentional,
not a TODO or broken test. Four categories:
| Reason | Where | Why |
|---|---|---|
| Missing env credentials | `tests/apps/*-admin.spec.ts`, `tests/apps/pm-godmode.spec.ts`, `tests/flows/identity-switch-after-relogin.spec.ts`, `tests/bugs/bug_dc998ba0.spec.ts` | Tests that need a second identity (`NORMAL_USER`) or local Plane creds (`PLANE_ADMIN_USER`) self-skip when unset, so the suite is portable to deployments without those identities provisioned |
| Vacuously-satisfied invariant | `tests/auth/proxy-short-circuit.spec.ts`, `tests/auth/logout-invariants.spec.ts` | The deployment doesn't expose the surface the test checks (e.g. no JS-readable session cookie to rotate), so the contract is satisfied trivially |
| App-shape opt-out | `tests/lib/link-coverage.ts` | Apps with no `<a href>` nav (button-only SPAs) skip link-crawl assertions via `requireLinks=false` |
| Browser-matrix dedup | `tests/meta/playwright-practices.spec.ts` | The static hygiene grep is browser-independent: runs once on Chromium, dedups on Firefox/WebKit |
The full invariant contract lives in skills.md (universal rules) and
the vendored openspec
(per-module spec.md files: forwardauth-traefik, oauth2-proxy-gateway,
proxy-auth-middleware, session-lifecycle, logout-flow,
cognito-claim-mapping, workspace-auto-join). Highlights:
- SSO chain: `_oauth2_proxy` cookie shape (Secure / HttpOnly / SameSite=Lax), shared across all 5 subdomains, present after login, scoped to the platform domain.
- Cookie expiry: `_oauth2_proxy` and per-app session cookies must expire within the configured SSO TTL bound (`FOSS_MAX_SESSION_TTL_SECONDS`, default 92 days). Browser-session cookies (`expires=-1`) and CSRF / locale cookies are excluded.
- Session lifecycle: UI logout clears the cookie; logout from one app invalidates all; pre-logout cookies cannot be replayed; deleting the SSO cookie locks every app behind the IDP.
- Layer-2 re-establish across apps: valid SSO cookie + cleared app-local session/storage + reload stays on the app host (no IDP bounce).
- Cross-app identity consistency: every backend's `/me`-shape endpoint (Plane `/api/users/me/`, Outline `/api/auth.info`, Penpot RPC `get-profile`, SurfSense `/users/me`) resolves the same logged-in user to the same synthesized email. Catches `DEFAULT_EMAIL_DOMAIN` drift across containers.
- Per-app link coverage: every internal `<a href>` on the start page loads with a status below 400, stays on the app's host, doesn't bounce to the auth wall, has no 404 in the title, and is clickable. Adapts to SPAs whose nav streams in / mutates by route (Twenty).
- god-mode (Plane admin): `/god-mode/` and `/auth/get-csrf-token` bypass ForwardAuth; the page renders Plane's own admin form (not the SSO IDP); admin login works; wrong password is rejected; admin login does not issue the platform `_oauth2_proxy` cookie (separate session universe).
- Outline admin (`/settings/*`): inverse invariant of god-mode. Every admin URL sits fully behind SSO (cold context bounces through ForwardAuth), and Outline enforces the admin/non-admin role split server-side. Under a non-admin SSO user the 5 common-settings pages load, while the 8 admin-only pages (details, security, authentication, features, integrations, applications, import, export) are gated (Not Found, chunk-load failure, or never resolve past the SPA shell).
- Twenty admin (`/settings/admin-panel`): Twenty has a real admin URL gated server-side by `AdminPanelGuard` checking `User.canAccessFullAdminPanel`. Three invariants: cold context bounces through SSO (no bypass), FOSS_USER (non-admin) lands on Twenty but the page renders no admin-panel UI markers, and (when `TWENTY_ADMIN_USER` is set) a pre-bootstrapped admin user sees the admin UI markers (Health Status / Feature Flags / Config Variables / AI Models). Twenty has NO first-user-auto-admin; `canAccessFullAdminPanel` is only flippable via the `workspace:bootstrap-sso-admin` CLI command, so the admin account must be pre-bootstrapped on the deployment.
- End-to-end flow: fresh login → all 5 apps load authed; per-app `/oauth2/sign_out`; main portal "Log out of all apps" → all 5 apps bounce back to the IDP.
- HTTP plaintext lockdown: every host on port 80 redirects to https or refuses the connection. No 2xx ever served over plain HTTP.
- Security headers on every browser-facing router: HSTS (≥180d + includeSubDomains), `X-Content-Type-Options: nosniff`, `X-Frame-Options: DENY|SAMEORIGIN`, `Referrer-Policy`, `Permissions-Policy` denying camera / mic / geolocation. Covers `*-secure`, `*-bypass`, `oauth2-proxy-secure`, and `oauth2-apps` routers (headers are per-response, not host-cached, so each router type is verified separately).
- Header spoofing rejection: sending `X-Auth-Request-*` headers without a cookie must bounce to auth or 4xx; the `strip-auth-headers` middleware must scrub inbound identity headers before the backend can trust them. Verified on both `*-secure` and `*-bypass` routers (defense-in-depth on bypass routers added in `foss-server-bundle#30`).
- Bypass discipline: static assets (`/favicon.ico`, `/robots.txt`) reachable without auth; the catch-all `/` still gated. Catches both over-protection of public assets and under-protection of the secure catch-all (Electric `/v1/shape` exfiltration pattern).
- AUTH_TYPE=SSO gate: local login / register / forgot-password UI must be hidden in SSO mode. Every app's known local-auth routes (Plane `/sign-in`, Outline `/auth/email`, Penpot `/#/auth/*`, SurfSense `/login`, Twenty `/sign-in`, etc.) must have no reachable `<input type="password">`.
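The security-header invariant in the list above can be expressed as a pure check over one response's headers. The sketch below is illustrative only: `checkSecurityHeaders` and `HeaderMap` are hypothetical names, the violation messages are invented, and it assumes "denying" a Permissions-Policy feature means the empty allowlist form `feature=()`. It is not the code in `tests/security/headers.spec.ts`.

```typescript
type HeaderMap = { [name: string]: string };

const MIN_HSTS_SECONDS = 180 * 24 * 60 * 60; // 180 days

/** Returns human-readable violations; an empty array means compliant. */
function checkSecurityHeaders(raw: HeaderMap): string[] {
  // Header names are case-insensitive, so normalise before lookup.
  const headers: HeaderMap = {};
  for (const name of Object.keys(raw)) {
    headers[name.toLowerCase()] = raw[name];
  }
  const problems: string[] = [];

  const hsts = headers["strict-transport-security"] || "";
  const maxAge = /max-age=(\d+)/i.exec(hsts);
  if (!maxAge || parseInt(maxAge[1], 10) < MIN_HSTS_SECONDS) {
    problems.push("HSTS max-age below 180 days");
  }
  if (!/includeSubDomains/i.test(hsts)) {
    problems.push("HSTS missing includeSubDomains");
  }
  if ((headers["x-content-type-options"] || "").toLowerCase() !== "nosniff") {
    problems.push("X-Content-Type-Options is not nosniff");
  }
  if (!/^(DENY|SAMEORIGIN)$/i.test(headers["x-frame-options"] || "")) {
    problems.push("X-Frame-Options is not DENY or SAMEORIGIN");
  }
  if (!headers["referrer-policy"]) {
    problems.push("Referrer-Policy missing");
  }
  const permissions = headers["permissions-policy"] || "";
  for (const feature of ["camera", "microphone", "geolocation"]) {
    // Assumption: `feature=()` (empty allowlist) is what counts as denied.
    if (!new RegExp(`${feature}=\\(\\)`).test(permissions)) {
      problems.push(`Permissions-Policy does not deny ${feature}`);
    }
  }
  return problems;
}
```

Because headers are per-response, a check like this would be run once per router type rather than once per host.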
tests/
├── auth/
│ ├── sso-login.spec.ts # cookie shape, persistence
│ ├── session-sharing.spec.ts # cross-subdomain scope + expiry bounds
│ ├── session-lifecycle.spec.ts # logout, replay, cookie deletion
│ ├── email-domain-consistency.spec.ts # DEFAULT_EMAIL_DOMAIN parity check
│ ├── identity-consistency.spec.ts # every backend resolves the same email
│ └── workspace-auto-join-independence.spec.ts # app-local workspace state isolation
├── apps/
│ ├── outline.spec.ts # branding + link coverage
│ ├── penpot.spec.ts # branding + hash-route nav coverage
│ ├── outline-admin.spec.ts # /settings/* SSO-gating + non-admin role split
│ ├── pm.spec.ts # link coverage
│ ├── pm-godmode.spec.ts # admin escape-hatch invariants
│ ├── surfsense.spec.ts # link coverage
│ ├── twenty.spec.ts # link coverage (SPA, route-mutating nav)
│ └── twenty-admin.spec.ts # /settings/admin-panel URL gate
├── flows/
│ └── login-logout-flow.spec.ts # full e2e journey
├── security/
│ ├── headers.spec.ts # canonical headers on *-secure, *-bypass,
│ │ # oauth2-proxy-secure, oauth2-apps
│ ├── http-no-plaintext.spec.ts # no 2xx over plain HTTP on any host
│ ├── header-spoofing.spec.ts # X-Auth-Request-* spoof must be rejected
│ ├── strip-on-bypass.spec.ts # strip-auth-headers chained on bypass routers
│ ├── bypass-surface.spec.ts # static assets bypass, catch-all gated
│ └── sso-mode-no-local-login.spec.ts # no password input on local-auth routes
└── lib/
└── link-coverage.ts # registerLinkCoverage() factory
constants.ts — single source of truth, derives every host from
FOSS_BASE_URL.
skills.md — local invariant contract (what every app must satisfy).
vendored openspec — canonical edge-layer +
per-app rules organised by capability spec; the tests/security/ and
identity-consistency tests verify these on the live deployment.
Login is performed once per worker via the configured IDP (Cognito or mPass). The resulting cookies + storage are shared across tests in the same worker — no repeated logins.
Session lifecycle and god-mode tests are exempt: they spawn fresh contexts and manage their own login/logout so global state changes don't contaminate the shared session.
# .env
FOSS_BASE_URL=https://foss.example.com
FOSS_USER=prod-user@example.com
FOSS_PASS=prod-password
# only if prod uses a different mPass IDP host:
# FOSS_MPASS_DOMAIN=moneta-auth.pressingly.net

App URLs, ForwardAuth host, and cookie domain all auto-derive from
FOSS_BASE_URL.
Two workflows. Each runs the full chromium suite and uploads the HTML report as an artifact every run; failure traces and videos as a second artifact only on failure.
| Workflow | File | Triggers | Secrets prefix |
|---|---|---|---|
| E2E — Sandbox | `.github/workflows/e2e-sandbox.yml` | every 12h (00:00 + 12:00 UTC), push to main, PR, manual | `SANDBOX_*` |
| E2E — Production | `.github/workflows/e2e-prod.yml` | manual only, gated on production Environment | `PROD_*` |
Note: scheduled (cron) runs only fire from the default branch, so the 12-hour cadence starts after this branch lands on
main.
Repo → Settings → Secrets and variables → Actions → Secrets:
| Name | Required | Purpose |
|---|---|---|
| `SANDBOX_FOSS_USER` | ✅ | SSO username (User A, pre-promoted admin per sso-rules/admin.md) |
| `SANDBOX_FOSS_PASS` | ✅ | SSO password |
| `SANDBOX_NORMAL_USER` | optional | User B SSO username; enables non-admin role-split tests |
| `SANDBOX_NORMAL_PASS` | optional | same |
| `SANDBOX_PLANE_ADMIN_USER` | optional | enables god-mode admin tests (local Plane creds, not SSO) |
| `SANDBOX_PLANE_ADMIN_PASS` | optional | same |
| `SLACK_WEBHOOK_URL` | optional | enables Slack failure notifications (with the list of failed tests) |
Variables tab (optional):
| Name | Purpose |
|---|---|
| `SANDBOX_FOSS_BASE_URL` | override sandbox URL (default `https://foss.arbisoft.com`) |
Repo → Settings → Environments → New environment → production:
- Enable Required reviewers so prod runs pause for human approval.
- Add the secrets below as Environment secrets (not repo secrets — Environment secrets only release when an approver clicks):
| Name | Required | Purpose |
|---|---|---|
| `PROD_FOSS_USER` | ✅ | SSO username (User A, pre-promoted admin) |
| `PROD_FOSS_PASS` | ✅ | SSO password |
| `PROD_NORMAL_USER` | optional | User B SSO username; enables non-admin role-split tests |
| `PROD_NORMAL_PASS` | optional | same |
| `PROD_PLANE_ADMIN_USER` | optional | god-mode admin user (local Plane creds, not SSO) |
| `PROD_PLANE_ADMIN_PASS` | optional | god-mode admin pass |
Variables (repo or environment):
| Name | Required | Purpose |
|---|---|---|
| `PROD_FOSS_BASE_URL` | ✅ | prod main portal URL |
| `PROD_FOSS_MPASS_DOMAIN` | optional | prod mPass host if non-sandbox |
| `PROD_FOSS_COGNITO_DOMAIN` | optional | prod Cognito host if non-default |
- Sandbox: runs automatically (cron / push / PR), or Actions → E2E — Sandbox → Run workflow to trigger ad-hoc.
- Production: Actions → E2E — Production → Run workflow. Optional `base_url` input overrides `PROD_FOSS_BASE_URL` for that single run. Pauses for reviewer approval if the `production` environment requires it.
- Daily prod smoke: uncomment the `schedule:` block in `.github/workflows/e2e-prod.yml`.
The sandbox workflow posts a failure-only plain-text report to Slack — listing each failed test by file + title — when any run (12h cron, push, PR, manual) fails. Successful runs stay quiet.
Sample message:
E2E Sandbox failed — 2 test(s)
Branch: main @ a1b2c3d
https://github.com/your-org/your-repo/actions/runs/123456789
tests/auth/session-sharing.spec.ts: auth/session cookies on every app...
tests/apps/outline.spec.ts: clicking each visible link navigates within host
To enable:
- Slack workspace → Apps → Incoming Webhooks → New webhook for the target channel; copy the URL.
- Repo Settings → Secrets and variables → Actions → Secrets → add `SLACK_WEBHOOK_URL` with the webhook URL.
The notification step no-ops with a log message if the secret is unset, so existing setups don't break.
This repo now supports a Slack-driven run workflow:
- Workflow: `.github/workflows/e2e-slack-command.yml`
- Trigger: `workflow_dispatch` (manual or API)
- Input command format: `e2e <target>`

Supported targets:
- `all`
- `outline`, `penpot`, `surfsense`, `twenty`, `pm` (or `plane`)

Examples:
- `e2e all`
- `e2e outline`
- `e2e penpot`
- `e2e surfsense`
- `e2e twenty`
- `e2e pm`
- `e2e plane`
Your existing SLACK_WEBHOOK_URL can still post results. For triggering,
configure Slack to call GitHub's workflow-dispatch API.
Workflow-dispatch request shape:
{
"ref": "main",
"inputs": {
"command": "e2e penpot",
"slack_user_id": "U123456",
"slack_channel_id": "C123456"
}
}

API endpoint:
POST /repos/<owner>/<repo>/actions/workflows/e2e-slack-command.yml/dispatches
Required auth:
- GitHub token with permission to dispatch workflows for the repo.
Optional guardrails (repo variables):
- `SLACK_ALLOWED_USER_IDS` (comma-separated Slack user IDs)
- `SLACK_ALLOWED_CHANNEL_IDS` (comma-separated Slack channel IDs)
If allowlists are set, non-matching Slack users/channels are rejected.
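The command format and guardrails above could be validated along these lines. This is a sketch under stated assumptions, not the contents of `e2e-slack-command.yml`: the function names are hypothetical, `plane` is treated as an alias that normalises to `pm`, and an unset or empty allowlist is assumed to allow everyone.

```typescript
const E2E_TARGETS = ["all", "outline", "penpot", "surfsense", "twenty", "pm"];

interface DispatchInputs {
  command: string;
  slack_user_id: string;
  slack_channel_id: string;
}

/** Parses `e2e <target>`; returns the canonical target, or null if invalid. */
function parseE2eCommand(command: string): string | null {
  const match = /^e2e\s+(\S+)$/.exec(command.trim());
  if (!match) return null;
  const target = match[1] === "plane" ? "pm" : match[1]; // plane is an alias
  return E2E_TARGETS.indexOf(target) === -1 ? null : target;
}

/** Assumption: an unset or empty allowlist places no restriction. */
function isAllowed(id: string, allowlistCsv?: string): boolean {
  if (!allowlistCsv || allowlistCsv.trim() === "") return true;
  const ids = allowlistCsv.split(",").map((entry) => entry.trim());
  return ids.indexOf(id) !== -1;
}

/** Returns the target to run, or null when the request must be rejected. */
function resolveTarget(
  inputs: DispatchInputs,
  allowedUsersCsv?: string,
  allowedChannelsCsv?: string,
): string | null {
  if (!isAllowed(inputs.slack_user_id, allowedUsersCsv)) return null;
  if (!isAllowed(inputs.slack_channel_id, allowedChannelsCsv)) return null;
  return parseE2eCommand(inputs.command);
}
```

With the request body shown earlier and `SLACK_ALLOWED_USER_IDS` set to `U123456,U999`, `resolveTarget` would return `penpot`; the same request from a user outside the list would return `null`.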
CI=true npm test

CI mode: 2 workers, 1 retry, HTML report saved but not opened.
Screenshots, videos, and traces written to test-results/ on failure.
tests/auth/session-sharing.spec.ts uses a deployment-configurable upper bound via
FOSS_MAX_SESSION_TTL_SECONDS (default: 92 days) so environments with intentional
multi-month cookie TTLs can be validated without hardcoding a 30-day ceiling.
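The bound check can be sketched as a filter over cookie records. The code below is illustrative and does not reproduce `tests/auth/session-sharing.spec.ts`; `resolveMaxTtl` and `cookiesExceedingTtl` are hypothetical names, the `excludedNames` parameter stands in for the CSRF / locale exclusions, and cookies are assumed to carry `expires` as Unix seconds with `-1` marking a browser-session cookie.

```typescript
interface CookieLike {
  name: string;
  expires: number; // Unix seconds; -1 marks a browser-session cookie
}

const DEFAULT_MAX_TTL_SECONDS = 92 * 24 * 60 * 60; // 92 days

/** Reads the configured bound, falling back to the 92-day default. */
function resolveMaxTtl(envValue?: string): number {
  const parsed = envValue ? parseInt(envValue, 10) : NaN;
  return isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_TTL_SECONDS;
}

/** Names of cookies whose remaining lifetime exceeds the bound. */
function cookiesExceedingTtl(
  cookies: CookieLike[],
  nowSeconds: number,
  maxTtlSeconds: number,
  excludedNames: string[] = [],
): string[] {
  return cookies
    .filter((c) => c.expires !== -1) // session cookies are excluded
    .filter((c) => excludedNames.indexOf(c.name) === -1)
    .filter((c) => c.expires - nowSeconds > maxTtlSeconds)
    .map((c) => c.name);
}
```

A test built on this shape would assert that `cookiesExceedingTtl(...)` is empty, passing `resolveMaxTtl(process.env.FOSS_MAX_SESSION_TTL_SECONDS)` as the bound.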