
test(perf): add Lighthouse benchmark suite and CI job #6494

Merged
masenf merged 3 commits into reflex-dev:main from FarhanAliRaza:lighthouse-benchmark on May 12, 2026
Conversation

FarhanAliRaza (Contributor) commented May 12, 2026

Adds a Playwright-driven Lighthouse benchmark for integration tests, a runner script, fixtures and helpers, plus a CI job that uploads the generated report so frontend performance regressions are caught.

All Submissions:

  • Have you followed the guidelines stated in CONTRIBUTING.md file?
  • Have you checked to ensure there aren't any other open Pull Requests for the desired change?

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

New Feature Submission:

  • Does your submission pass the tests?
  • Have you linted your code locally prior to submission?

Changes To Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?

Adds a Playwright-driven Lighthouse benchmark for integration tests, a
runner script, fixtures and helpers, plus a CI job that uploads the
generated report so frontend performance regressions are caught.
@FarhanAliRaza FarhanAliRaza requested a review from a team as a code owner May 12, 2026 16:09
codspeed-hq Bot commented May 12, 2026

Merging this PR will not alter performance

✅ 24 untouched benchmarks
⏩ 2 skipped benchmarks [1]


Comparing FarhanAliRaza:lighthouse-benchmark (81b8627) with main (b7838e7)

Open in CodSpeed

Footnotes

  1. 2 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them in CodSpeed to remove them from the performance reports.

greptile-apps Bot commented May 12, 2026

Greptile Summary

This PR introduces a Playwright-driven Lighthouse benchmark suite for Reflex production apps, including shared utilities, a CI job, integration and unit tests, and a local runner script. The benchmark builds a real landing-page Reflex app in prod mode, warms up the server, and audits performance/accessibility/best-practices/SEO with a 90-point threshold for each category.

  • tests/integration/lighthouse_utils.py (1,088 lines) is the core of the change — it handles CLI resolution, subprocess management for reflex run --env prod, warmup polling, Lighthouse invocation, and structured result reporting.
  • .github/workflows/performance.yml adds a new lighthouse job that installs Chromium headless shell, runs the benchmark under REFLEX_RUN_LIGHTHOUSE=1, and uploads the JSON report as an artifact.
  • tests/units/test_lighthouse_utils.py provides solid unit-test coverage of command resolution, cache warm-up, timeout messages, and URL rewriting.

Confidence Score: 3/5

The benchmark infrastructure is well-structured overall, but the warmup failure path calls proc.wait(timeout=10) without catching subprocess.TimeoutExpired, which will discard the diagnostic message and leave the subprocess un-killed.

The warmup failure branch in _run_prod_lighthouse_benchmark silently swallows the pytest.fail() call whenever the reflex subprocess does not exit within 10 s — a realistic scenario under CI load. Every other proc.wait(timeout=10) call in the same function correctly wraps the call in try/except and follows it with proc.kill(), so the omission stands out and will produce misleading failures.

tests/integration/lighthouse_utils.py — specifically the else branch of the warmup loop around line 1128

Important Files Changed

Filename Overview
tests/integration/lighthouse_utils.py Core benchmark utility — 1,088 lines covering CLI resolution, prod-server startup, warmup, and Lighthouse invocation. Contains a missing try/except around proc.wait(timeout=10) in the warmup failure path that will swallow the diagnostic message and leave the subprocess un-killed.
.github/workflows/performance.yml Adds a new lighthouse CI job with checkout, env setup, Playwright install, pytest run, and artifact upload — structure is sound; uses python-version 3.14 and installs chromium --only-shell which is correctly handled by get_chrome_path().
scripts/run_lighthouse.py Local runner script — hardcodes .states directory instead of using a constant from the constants module, violating the team's directory-path convention.
tests/integration/test_lighthouse.py Thin integration test that delegates all work to lighthouse_utils; skip guard, module-scoped fixture, and single test function look correct.
tests/units/test_lighthouse_utils.py Unit tests cover command resolution priority, cache warm-up, timeout messages, and URL rewriting — good coverage of the utility functions.
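The URL-rewriting step mentioned above (rewriting 0.0.0.0 to 127.0.0.1 before handing the URL to Lighthouse) can be sketched roughly as follows. The helper name and structure here are illustrative only, not the PR's actual `_get_lighthouse_target_url` implementation:

```python
from urllib.parse import urlsplit, urlunsplit

def rewrite_lighthouse_target(url: str) -> str:
    """Replace a 0.0.0.0 wildcard bind address with a loopback address.

    `reflex run` reports the address it binds to (0.0.0.0), which is not
    directly connectable as a client target; 127.0.0.1 is.
    """
    parts = urlsplit(url)
    if parts.hostname == "0.0.0.0":
        netloc = "127.0.0.1"
        if parts.port is not None:
            netloc += f":{parts.port}"
        return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    return url
```

Any URL not bound to the wildcard address passes through unchanged, which keeps the helper safe to apply unconditionally.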

Sequence Diagram

```mermaid
sequenceDiagram
    participant CI as GitHub Actions
    participant pytest as pytest runner
    participant utils as lighthouse_utils
    participant reflex as reflex run --env prod
    participant lh as Lighthouse CLI

    CI->>pytest: uv run pytest test_lighthouse.py
    pytest->>utils: run_landing_prod_lighthouse_benchmark()
    utils->>utils: _ensure_lighthouse_app() scaffold app + copy fixtures
    utils->>reflex: subprocess.Popen(reflex run --env prod --frontend-only)
    loop read stdout until URL found or deadline
        reflex-->>utils: App running at: http://0.0.0.0:3001
    end
    utils->>utils: _get_lighthouse_target_url() rewrite 0.0.0.0 to 127.0.0.1
    loop warmup until server responds or 30 s
        utils->>reflex: urllib.request.urlopen(http://127.0.0.1:3001)
    end
    utils->>utils: get_lighthouse_command() + get_chrome_path()
    utils->>lh: "subprocess.run(lighthouse http://127.0.0.1:3001 --output=json ...)"
    lh-->>utils: landing-prod-lighthouse.json
    utils->>reflex: proc.terminate() + proc.wait()
    utils-->>pytest: LighthouseBenchmarkResult(report, summary, failures)
    pytest->>CI: upload artifact lighthouse-report/
```
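The warmup loop in the diagram amounts to polling the reported URL until it answers or a deadline passes. A minimal sketch of that pattern — function name, timings, and exception handling are assumptions, not the PR's code:

```python
import time
import urllib.error
import urllib.request

def wait_for_server(url: str, deadline_s: float = 30.0, interval_s: float = 0.5) -> bool:
    """Poll `url` until it returns a response or the deadline expires.

    Returns True as soon as any HTTP response arrives, False if the
    server never answered within `deadline_s` seconds.
    """
    deadline = time.monotonic() + deadline_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except (urllib.error.URLError, OSError):
            time.sleep(interval_s)
    return False
```

Using `time.monotonic()` rather than `time.time()` keeps the deadline immune to wall-clock adjustments, which matters on shared CI runners.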

Comments Outside Diff (1)

  1. tests/integration/lighthouse_utils.py, line 1128-1134 (link)

    P1 Unguarded proc.wait() swallows the diagnostic message

    In the else branch of the warmup loop, proc.wait(timeout=10) is called without a try/except subprocess.TimeoutExpired block. If the reflex subprocess is still alive after 10 s, subprocess.TimeoutExpired propagates up and the subsequent pytest.fail() call is never reached — so the actual diagnostic ("Warmup request … never succeeded") is silently discarded and the test fails with a confusing timeout traceback. Every other proc.wait(timeout=10) call in this function wraps the call in try/except followed by proc.kill(), so the omission here is inconsistent and leaves resources un-killed.


Comment threads: scripts/run_lighthouse.py (outdated), tests/integration/lighthouse_utils.py (×2)
Detect chrome-headless-shell and skip the --headless=new flag it
rejects, route the scratch dir through reflex_base constants, exclude
the *current symlinks from the uploaded artifact, and temporarily lower
the performance threshold to 0.56 while we tune the CI baseline.
Revert the headless-shell detection now that the runner ships full
Chrome, and remove the unused app_name parameter from
_run_prod_lighthouse_benchmark.
@masenf masenf merged commit d76ca30 into reflex-dev:main May 12, 2026
70 checks passed

2 participants