
test(perf): add Lighthouse benchmark suite and CI job #6494

Merged
masenf merged 3 commits into reflex-dev:main from FarhanAliRaza:lighthouse-benchmark on May 12, 2026
Conversation

FarhanAliRaza (Contributor) commented May 12, 2026

Adds a Playwright-driven Lighthouse benchmark for integration tests, a runner script, fixtures and helpers, plus a CI job that uploads the generated report so frontend performance regressions are caught.

All Submissions:

  • Have you followed the guidelines stated in CONTRIBUTING.md file?
  • Have you checked to ensure there aren't any other open Pull Requests for the desired change?

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

New Feature Submission:

  • Does your submission pass the tests?
  • Have you linted your code locally prior to submission?

Changes To Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?

Adds a Playwright-driven Lighthouse benchmark for integration tests, a
runner script, fixtures and helpers, plus a CI job that uploads the
generated report so frontend performance regressions are caught.
@FarhanAliRaza FarhanAliRaza requested a review from a team as a code owner May 12, 2026 16:09
codspeed-hq Bot commented May 12, 2026

Merging this PR will not alter performance

✅ 24 untouched benchmarks
⏩ 2 skipped benchmarks [1]


Comparing FarhanAliRaza:lighthouse-benchmark (81b8627) with main (b7838e7)

Open in CodSpeed

Footnotes

  1. 2 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them in CodSpeed to remove them from the performance reports.

greptile-apps Bot commented May 12, 2026

Greptile Summary

This PR introduces a Playwright-driven Lighthouse benchmark suite for Reflex production apps, including shared utilities, a CI job, integration and unit tests, and a local runner script. The benchmark builds a real landing-page Reflex app in prod mode, warms up the server, and audits performance/accessibility/best-practices/SEO with a 90-point threshold for each category.

  • tests/integration/lighthouse_utils.py (1,088 lines) is the core of the change — it handles CLI resolution, subprocess management for reflex run --env prod, warmup polling, Lighthouse invocation, and structured result reporting.
  • .github/workflows/performance.yml adds a new lighthouse job that installs Chromium headless shell, runs the benchmark under REFLEX_RUN_LIGHTHOUSE=1, and uploads the JSON report as an artifact.
  • tests/units/test_lighthouse_utils.py provides solid unit-test coverage of command resolution, cache warm-up, timeout messages, and URL rewriting.

Confidence Score: 3/5

The benchmark infrastructure is well-structured overall, but the warmup failure path calls proc.wait(timeout=10) without catching subprocess.TimeoutExpired, which will discard the diagnostic message and leave the subprocess un-killed.

The warmup failure branch in _run_prod_lighthouse_benchmark silently swallows the pytest.fail() call whenever the reflex subprocess does not exit within 10 s — a realistic scenario under CI load. Every other proc.wait(timeout=10) call in the same function correctly wraps the call in try/except and follows it with proc.kill(), so the omission stands out and will produce misleading failures.

tests/integration/lighthouse_utils.py — specifically the else branch of the warmup loop around line 1128

Important Files Changed

Filename Overview
tests/integration/lighthouse_utils.py Core benchmark utility — 1,088 lines covering CLI resolution, prod-server startup, warmup, and Lighthouse invocation. Contains a missing try/except around proc.wait(timeout=10) in the warmup failure path that will swallow the diagnostic message and leave the subprocess un-killed.
.github/workflows/performance.yml Adds a new lighthouse CI job with checkout, env setup, Playwright install, pytest run, and artifact upload — structure is sound; uses python-version 3.14 and installs chromium --only-shell which is correctly handled by get_chrome_path().
scripts/run_lighthouse.py Local runner script — hardcodes .states directory instead of using a constant from the constants module, violating the team's directory-path convention.
tests/integration/test_lighthouse.py Thin integration test that delegates all work to lighthouse_utils; skip guard, module-scoped fixture, and single test function look correct.
tests/units/test_lighthouse_utils.py Unit tests cover command resolution priority, cache warm-up, timeout messages, and URL rewriting — good coverage of the utility functions.
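The URL-rewriting step mentioned above (rewriting 0.0.0.0 to 127.0.0.1 before handing the URL to Lighthouse) can be sketched roughly as follows. The helper name and structure here are illustrative only, not the PR's actual `_get_lighthouse_target_url` implementation:

```python
from urllib.parse import urlsplit, urlunsplit

def rewrite_lighthouse_target(url: str) -> str:
    """Replace a 0.0.0.0 wildcard bind address with a loopback address.

    `reflex run` reports the address it binds to (0.0.0.0), which is not
    directly connectable as a client target; 127.0.0.1 is.
    """
    parts = urlsplit(url)
    if parts.hostname == "0.0.0.0":
        netloc = "127.0.0.1"
        if parts.port is not None:
            netloc += f":{parts.port}"
        return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    return url
```

Any URL not bound to the wildcard address passes through unchanged, which keeps the helper safe to apply unconditionally.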

Sequence Diagram

```mermaid
sequenceDiagram
    participant CI as GitHub Actions
    participant pytest as pytest runner
    participant utils as lighthouse_utils
    participant reflex as reflex run --env prod
    participant lh as Lighthouse CLI

    CI->>pytest: uv run pytest test_lighthouse.py
    pytest->>utils: run_landing_prod_lighthouse_benchmark()
    utils->>utils: _ensure_lighthouse_app() scaffold app + copy fixtures
    utils->>reflex: subprocess.Popen(reflex run --env prod --frontend-only)
    loop read stdout until URL found or deadline
        reflex-->>utils: App running at: http://0.0.0.0:3001
    end
    utils->>utils: _get_lighthouse_target_url() rewrite 0.0.0.0 to 127.0.0.1
    loop warmup until server responds or 30 s
        utils->>reflex: urllib.request.urlopen(http://127.0.0.1:3001)
    end
    utils->>utils: get_lighthouse_command() + get_chrome_path()
    utils->>lh: "subprocess.run(lighthouse http://127.0.0.1:3001 --output=json ...)"
    lh-->>utils: landing-prod-lighthouse.json
    utils->>reflex: proc.terminate() + proc.wait()
    utils-->>pytest: LighthouseBenchmarkResult(report, summary, failures)
    pytest->>CI: upload artifact lighthouse-report/
```
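The warmup loop in the diagram amounts to polling the reported URL until it answers or a deadline passes. A minimal sketch of that pattern — function name, timings, and exception handling are assumptions, not the PR's code:

```python
import time
import urllib.error
import urllib.request

def wait_for_server(url: str, deadline_s: float = 30.0, interval_s: float = 0.5) -> bool:
    """Poll `url` until it returns a response or the deadline expires.

    Returns True as soon as any HTTP response arrives, False if the
    server never answered within `deadline_s` seconds.
    """
    deadline = time.monotonic() + deadline_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except (urllib.error.URLError, OSError):
            time.sleep(interval_s)
    return False
```

Using `time.monotonic()` rather than `time.time()` keeps the deadline immune to wall-clock adjustments, which matters on shared CI runners.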

Comments Outside Diff (1)

  1. tests/integration/lighthouse_utils.py, line 1128-1134 (link)

    P1 Unguarded proc.wait() swallows the diagnostic message

    In the else branch of the warmup loop, proc.wait(timeout=10) is called without a try/except subprocess.TimeoutExpired block. If the reflex subprocess is still alive after 10 s, subprocess.TimeoutExpired propagates up and the subsequent pytest.fail() call is never reached — so the actual diagnostic ("Warmup request … never succeeded") is silently discarded and the test fails with a confusing timeout traceback. Every other proc.wait(timeout=10) call in this function wraps the call in try/except followed by proc.kill(), so the omission here is inconsistent and leaves resources un-killed.


Comment threads: scripts/run_lighthouse.py (outdated), tests/integration/lighthouse_utils.py (×2)
Detect chrome-headless-shell and skip the --headless=new flag it
rejects, route the scratch dir through reflex_base constants, exclude
the *current symlinks from the uploaded artifact, and temporarily lower
the performance threshold to 0.56 while we tune the CI baseline.
Revert the headless-shell detection now that the runner ships full
Chrome, and remove the unused app_name parameter from
_run_prod_lighthouse_benchmark.
@masenf masenf merged commit d76ca30 into reflex-dev:main May 12, 2026
70 checks passed

2 participants