Skip to content

feat: PostHog tracking for Capgo Builder onboarding + build lifecycle#2287

Merged
WcaleNieWolny merged 26 commits into
mainfrom
feat/builder-tracking-posthog
May 21, 2026
Merged

feat: PostHog tracking for Capgo Builder onboarding + build lifecycle#2287
WcaleNieWolny merged 26 commits into
mainfrom
feat/builder-tracking-posthog

Conversation

@WcaleNieWolny
Copy link
Copy Markdown
Contributor

@WcaleNieWolny WcaleNieWolny commented May 18, 2026

Summary

Adds PostHog tracking (via the existing sendEventToTracking dual-writer) for two event families in the Capgo Builder flow.

1. Per-step CLI onboarding tracking

  • New event Builder Onboarding Step fired per wizard step transition in both iOS (cli/src/build/onboarding/ui/app.tsx) and Android (cli/src/build/onboarding/android/ui/app.tsx) onboarding wizards.
  • Routed through the existing /private/events endpoint via the existing sendEvent() helper — no new backend endpoint.
  • Carries step, platform, app_id, optional duration_ms, optional error_category.
  • error_category is a closed enum (iOS: apple_api_unauthorized, apple_api_rate_limited, cert_limit_reached, profile_creation_failed, p8_invalid, unknown; Android: keystore_invalid, google_oauth_failed, play_account_id_invalid, unknown). Raw error messages never leave the CLI.
  • New helper cli/src/build/onboarding/telemetry.ts honors CAPGO_DISABLE_TELEMETRY / CAPGO_DISABLE_POSTHOG.

2. Server-side build lifecycle tracking

  • Build Requested fires from public/build/request.ts after the build_requests row is inserted.
  • Build Started, Build Succeeded, Build Failed, Build Timed Out fire from triggers/cron_reconcile_build_status.ts on status transitions.
  • Transition detection uses a new pure helper classifyBuildTransition in utils/build_tracking.ts. Idempotency is enforced by skipping when the previous status is already terminal — reuses the canonical TERMINAL_BUILD_STATUSES set from utils/build_timeout.ts.
  • failure_category is a closed enum (timeout, builder_error, validation_error, unknown). Raw error text never reaches PostHog.

Out of scope: the capgo_builder repo is not modified. Build Started is derived from the existing reconciliation cron diff, not a new builder push.

What's not changed

  • No new backend endpoint — reuses /private/events.
  • No new front-end pages, no schema migrations, no new env vars (other than honoring the existing CAPGO_DISABLE_TELEMETRY / CAPGO_DISABLE_POSTHOG).
  • The cron reuses default background-task semantics (waitUntil); no change to its completion model.

Test plan

  • bun run cli:check passes
  • New unit tests pass: error mappers (11), CLI telemetry helper (8), build transition + failure-category helpers (12) — 31 cases total
  • Adjacent existing tests still pass: tests/tracking.unit.test.ts, tests/posthog.unit.test.ts, tests/builder-payload.unit.test.ts, tests/build-timeout.unit.test.ts
  • Manual: walk through npx @capgo/cli build init --platform=ios two steps in staging; confirm Builder Onboarding Step events arrive in PostHog with the expected step, platform, app_id, and an org groups association
  • Manual: trigger one cloud build via npx @capgo/cli build request against staging; confirm Build RequestedBuild StartedBuild Succeeded/Build Failed arrive in PostHog with duration_seconds on terminal events and failure_category on failures
  • Confirm by sampling PostHog events that no error_category or failure_category value is anything other than the closed-enum members, and no raw error strings or file paths appear in any property

Documentation

Spec: docs/superpowers/specs/2026-05-18-capgo-builder-posthog-tracking-design.md
Plan: docs/superpowers/plans/2026-05-18-capgo-builder-posthog-tracking.md

Known limitations called out by reviewers but kept as-is

  • A CLI retry on a transient network error can produce a duplicate Build Requested event because build_requests has no idempotency key. Pre-existing behavior; the spec acknowledges this.
  • The cron emission uses sendEventToTracking's default background-task mode (the established convention across every other call site). Switching to foreground would serialize the cron loop and is left for a separate discussion.

Summary by CodeRabbit

  • New Features

    • Expanded telemetry: onboarding (iOS/Android), build uploads, AI analysis, build-request/start/status flows, and cron reconciliation
    • Introduced structured error/failure categories and mapping to standardize telemetry
  • Documentation

    • Added Builder telemetry design specification
  • Tests

    • Added comprehensive unit tests for telemetry, onboarding error mapping, upload tracking, and build lifecycle emissions

Review Change Stack

Adds the design doc covering two event families:

- Builder Onboarding Step (per-wizard-step, fired CLI -> backend
  -> sendEventToTracking) with closed-enum error categories
- Build lifecycle (Requested / Started / Succeeded / Failed /
  Timed Out) fired entirely server-side from the existing
  request handler and reconciliation cron

No changes to the capgo_builder repo; build_started is derived
from the existing builder polling.
The existing /private/events handler already implements auth,
org resolution, app_id permission check, and sendEventToTracking
with org grouping. Adding a second endpoint would duplicate
~80 lines of working code. CLI posts directly with the new
event name.
10 bite-sized TDD tasks covering iOS + Android error category
mappers, the CLI telemetry helper, useEffect wiring in both
wizards, pure build-transition / failure-category helpers,
Build Requested emission in request.ts, transition events from
cron_reconcile_build_status, full verification, and PR creation.

Reuses /private/events; no new backend endpoint. capgo_builder
repo is not touched.
The wizards previously stored only err.message in React state, then
reconstructed `new Error(message)` to pass to the category mapper —
losing the .status / .phase / instanceof discriminators the mapper
relies on. Capture the mapped category at handleError time via a
ref and pass it directly to the telemetry helper, bypassing the
mapper for wizard-emitted events. Tests, the helper interface, and
both wizards are updated.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds end-to-end telemetry: onboarding step tracking (iOS & Android) with closed error categories, CLI upload-phase telemetry and failure mapping, server-side build lifecycle transition events, safe-send telemetry helpers, unit tests, and a design spec.

Changes

Builder Telemetry Implementation

Layer / File(s) Summary
Error Category Types and Mappers
cli/src/build/onboarding/types.ts, cli/src/build/onboarding/error-categories.ts, tests/onboarding-error-categories.unit.test.ts
Adds OnboardingErrorCategory and platform mappers (mapIosOnboardingError, mapAndroidOnboardingError) that prioritize instance checks, then status/phase, defaulting to unknown; tests validate mappings and fallbacks.
Onboarding Telemetry Core Module & Tests
cli/src/build/onboarding/telemetry.ts, tests/builder-onboarding-telemetry.unit.test.ts
Implements TrackBuilderOnboardingStepInput and trackBuilderOnboardingStep building tags (step/platform/app_id, optional rounded duration_ms, and closed error_category via explicit input or platform mapper) and swallows send errors; tests assert payloads and mapping behavior.
iOS Onboarding UI Integration
cli/src/build/onboarding/ui/app.tsx
Wires telemetry: resolves/caches API key and Supabase org-id once, captures per-step start time, emits per-step events with computed duration, stores mapped error category for error-step telemetry, and clears it on success/retry/restart.
Android Onboarding UI Integration
cli/src/build/onboarding/android/ui/app.tsx, cli/src/build/onboarding/android/types.ts
Android UI mirrors iOS: resolves API key/org-id, emits per-step telemetry with duration, imports and uses AndroidOnboardingErrorCategory, maps errors for the error step, and clears stored category on retry/restart.
Upload Telemetry Module & Tests
cli/src/build/telemetry.ts, tests/builder-upload-telemetry.unit.test.ts
Adds upload tracking types, mapBuilderUploadError (status → closed failure categories), trackBuilderUpload that builds tags (app/platform/build_mode/jobId/size, optional rounded seconds and failure_category) and swallows send errors; tests validate mappings and payloads across phases.
Upload Integration in Build Request
cli/src/build/request.ts
Integrates upload telemetry: records uploadStartedAt, emits started/failed/succeeded upload events with metadata (size, jobId, buildMode, duration, error/failure category), and adds AI-analysis telemetry hooks (choice/result/skip).
CLI AI Analysis Telemetry
cli/src/ai/telemetry.ts, tests/ai-analysis-telemetry.unit.test.ts
Adds typed telemetry for CLI AI analysis choices and results, emits structured events and swallows send errors; tests validate tags and conditional error_status.
Backend AI Analyze Tracking
supabase/functions/_backend/public/build/ai_analyze.ts, tests
Emits AI Build Analysis Result on all exit paths with closed result enum, includes owner_org for attribution, logs durations and logsBytes, and ensures privacy by not including analysis text in tags.
Backend Build Lifecycle Helpers & Tests
supabase/functions/_backend/utils/build_tracking.ts, tests/build-tracking-helpers.unit.test.ts
Adds classifyBuildTransition, mapBuildFailureCategory, emitBuildTransitionEvent to derive transitions and failure categories (timeout/validation/builder/unknown); tests cover transition logic, timeout precedence, idempotency, and mapping.
Build Request Event Emission
supabase/functions/_backend/public/build/request.ts
Emits a Build Requested event on the build-lifecycle channel after persisting a build request with organization grouping and tags (app_id, platform, build_mode).
Build Start/Status Emit Wiring
supabase/functions/_backend/public/build/start.ts, supabase/functions/_backend/public/build/status.ts
When starting/updating builds, select extra fields (status, platform, build_mode) and call emitBuildTransitionEvent on successful status updates with previous/effective statuses and build metadata.
Cron Reconciliation Build Lifecycle Events
supabase/functions/_backend/triggers/cron_reconcile_build_status.ts
During reconciliation captures previous status, classifies transitions, constructs per-transition event metadata/tags (including duration and failure category), and emits build-lifecycle events via tracking utilities.
Design Specification and Documentation
docs/superpowers/specs/2026-05-18-capgo-builder-posthog-tracking-design.md
Comprehensive spec for onboarding/upload/build lifecycle telemetry: event schemas, closed enums, CLI best-effort delivery, cron idempotency, UI duration wiring, privacy/error handling, and tests.

Sequence Diagram(s)

sequenceDiagram
  participant OnboardingUI
  participant trackBuilderOnboardingStep
  participant sendEvent
  OnboardingUI->>trackBuilderOnboardingStep: step change, platform, appId, durationMs, error/errorCategory
  trackBuilderOnboardingStep->>sendEvent: Builder Onboarding Step (tags)
  activate sendEvent
  sendEvent-->>trackBuilderOnboardingStep: resolved/ignored error
  deactivate sendEvent
Loading
sequenceDiagram
  participant CronReconciler
  participant emitBuildTransitionEvent
  participant sendEventToTracking
  CronReconciler->>emitBuildTransitionEvent: previousStatus, effectiveStatus, timeoutApplied, error, duration, build metadata
  emitBuildTransitionEvent->>sendEventToTracking: Build <Transition> (tags)
  activate sendEventToTracking
  sendEventToTracking-->>emitBuildTransitionEvent: resolved/ignored error
  deactivate sendEventToTracking
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Cap-go/capgo#2064: Android onboarding telemetry and OAuth mapping work related to Android error mapping and UI instrumentation.
  • Cap-go/capgo#2284: Related changes in AI analyze flow and telemetry hooks used by CLI upload/analysis instrumentation.
  • Cap-go/capgo#2070: Timeout detection/cancellation logic that interacts with reconciliation/timeouts used by build lifecycle transition emission.

Suggested reviewers

  • zinc-builds

Poem

🐰 I hopped through code with ears so keen,

I tracked the steps both seen and unseen,
From uploads to errors neatly cast,
We log each hop and learn at last. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately summarizes the main changes: adding PostHog tracking for Capgo Builder onboarding and build lifecycle events across CLI and server-side components.
Description check ✅ Passed The PR description comprehensively covers the changes, including a detailed summary of the two event families (CLI onboarding and server-side build lifecycle), closed-enum error categorization, implementation details, test plan, and documentation references. However, the description is primarily a detailed prose explanation rather than following the template's structured sections (Summary, Test plan, Screenshots, Checklist).
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@codspeed-hq
Copy link
Copy Markdown
Contributor

codspeed-hq Bot commented May 18, 2026

Merging this PR will not alter performance

✅ 43 untouched benchmarks
⏩ 2 skipped benchmarks1


Comparing feat/builder-tracking-posthog (ab10bbd) with main (fa6727e)

Open in CodSpeed

Footnotes

  1. 2 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
supabase/functions/_backend/triggers/cron_reconcile_build_status.ts (1)

214-225: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Transition telemetry is race-prone without compare-and-set on status.

previousStatus is read from a stale snapshot, but Line 216 update only matches id. Concurrent reconciliations can both update and both emit the same transition event. Add a status guard in the update (eq('status', previousStatus)) and emit tracking only when that guarded update actually applies.

Also applies to: 252-291

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@supabase/functions/_backend/triggers/cron_reconcile_build_status.ts` around
lines 214 - 225, The current update to build_requests uses only .eq('id',
build.id) and reads previousStatus from a snapshot, so concurrent
reconciliations can both update and both emit transition telemetry; change the
update call in the reconciliation logic to include a status guard by adding
.eq('status', previousStatus) to the Supabase query (the call on
supabase.from('build_requests').update({...}).eq('id', build.id)), then only
emit the transition/telemetry when the guarded update actually applied (i.e.,
when the update returned a successful result/affected row(s) rather than an
error or zero rows). Apply the same change to the second reconciliation block
covering the code around the other update/emit (the block referenced in the
review for lines 252-291).
🧹 Nitpick comments (2)
docs/superpowers/specs/2026-05-18-capgo-builder-posthog-tracking-design.md (1)

110-136: 💤 Low value

Add language identifier to code fence.

The fenced code block at line 110 lacks a language specification, which triggers a markdown linting warning (MD040). Since this is an ASCII architecture diagram, specify text or plaintext as the language identifier.

📝 Proposed fix
-```
+```text
 ONBOARDING:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/superpowers/specs/2026-05-18-capgo-builder-posthog-tracking-design.md`
around lines 110 - 136, The fenced ASCII diagram starting with "ONBOARDING:"
should include a language identifier to satisfy MD040; update the opening code
fence that wraps the diagram (the block showing ONBOARDING, BUILDS,
sendEventToTracking, trackPosthogEvent, etc.) to use ```text (or ```plaintext)
instead of just ```, so the block becomes a text/plaintext code fence.
tests/builder-onboarding-telemetry.unit.test.ts (1)

24-143: ⚡ Quick win

Use it.concurrent for these test cases to match repo test policy.

These tests currently use it(...); the repository test guideline for tests/**/*.test.ts requires it.concurrent(...) for parallel execution.

As per coding guidelines, "Design all tests for parallel execution across files; use it.concurrent() instead of it() to run tests in parallel within the same file for faster CI/CD".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/builder-onboarding-telemetry.unit.test.ts` around lines 24 - 143,
Update each test declaration in this file to run concurrently by replacing
it(...) with it.concurrent(...); specifically modify the tests with descriptions
like "builds the expected payload and calls sendEvent once", "includes
error_category only when an error is provided", "uses the Android mapper when
platform is android", "skips when CAPGO_DISABLE_TELEMETRY is set", "skips when
CAPGO_DISABLE_POSTHOG is set", "swallows errors thrown by sendEvent", "does not
include duration_ms when undefined", and "uses pre-computed errorCategory when
provided (skipping the mapper)" so they call it.concurrent and otherwise keep
the same callbacks that call trackBuilderOnboardingStep and assert on
sendEventMock.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cli/src/build/onboarding/android/ui/app.tsx`:
- Around line 68-70: Reorder the three imports in app.tsx to satisfy the
perfectionist/sort-imports rule: locate the imports for
mapAndroidOnboardingError, trackBuilderOnboardingStep, and
ANDROID_STEP_PROGRESS/getAndroidPhaseLabel and reorder them to match the
repository's configured import groups/sort order (or run the linter autofix) so
the import sequence and grouping match perfectionist's expectations.

In `@docs/superpowers/plans/2026-05-18-capgo-builder-posthog-tracking.md`:
- Line 572: The current code wraps the wizard state error string as new
Error(error) which loses properties inspected by mapIosOnboardingError and
mapAndroidOnboardingError; instead pass the original error object (from wizard
state) or precompute and pass the mapped category to the telemetry helper.
Locate where error: step === 'error' && error ? new Error(error) : undefined is
set and change it to forward the original error object (e.g., errorObject) or
call the mapper (mapIosOnboardingError / mapAndroidOnboardingError) at that
point and pass the resulting category into the telemetry helper so the telemetry
receives the preserved onboarding error_category.

In `@docs/superpowers/specs/2026-05-18-capgo-builder-posthog-tracking-design.md`:
- Line 206: Update the stale endpoint reference: find the phrase mentioning "the
new /private/track_onboarding endpoint" in the spec and either change it to
reference the reused "/private/events" endpoint or remove the endpoint-specific
wording (e.g., replace with "No per-org rate limit on onboarding events");
ensure the surrounding text remains consistent with the architectural decision
to reuse /private/events.
- Line 23: Update the sentence that reads "Sent from the CLI through a new
backend endpoint" to clarify that the CLI uses the existing endpoint by
referencing "/private/events" (e.g., "Sent from the CLI through the existing
/private/events endpoint") so it matches the architectural decision described in
the section around the CLI wizard step transition and the reuse of the
/private/events endpoint.

In `@supabase/functions/_backend/public/build/request.ts`:
- Around line 317-329: The awaited sendEventToTracking call can throw and should
be best-effort: wrap the call to sendEventToTracking(c, { ... }) in a try/catch
so any rejection is caught, log the error (e.g., console.error or the
request/logger available on c) with context "Build Requested telemetry failed"
and do not rethrow or change the API response; keep the original behavior for
successful paths. Ensure you reference the existing sendEventToTracking
invocation and its context parameter c when applying the try/catch so telemetry
failures do not convert a successful build request into an API error.

In `@supabase/functions/_backend/utils/build_tracking.ts`:
- Around line 19-24: The current early-return checks evaluate equality before
honoring a timeout override, causing input.timeoutApplied true with
input.previous === input.next to return null; update the conditional order or
add a branch so that when input.timeoutApplied is true the function returns
'timed_out' regardless of equality — e.g., check input.timeoutApplied (or add a
specific conditional for input.timeoutApplied && input.previous === input.next)
before/above the input.previous === input.next check in the build/tracking logic
so callers passing raw next statuses get the 'timed_out' transition.

In `@tests/builder-onboarding-telemetry.unit.test.ts`:
- Line 9: Move the import of trackBuilderOnboardingStep so it appears at the top
of tests/builder-onboarding-telemetry.unit.test.ts before any runtime
statements; the lint error is caused by the import occurring after executable
code, so locate the current import of trackBuilderOnboardingStep and place it
with the other top-level imports to satisfy the import/first rule.

---

Outside diff comments:
In `@supabase/functions/_backend/triggers/cron_reconcile_build_status.ts`:
- Around line 214-225: The current update to build_requests uses only .eq('id',
build.id) and reads previousStatus from a snapshot, so concurrent
reconciliations can both update and both emit transition telemetry; change the
update call in the reconciliation logic to include a status guard by adding
.eq('status', previousStatus) to the Supabase query (the call on
supabase.from('build_requests').update({...}).eq('id', build.id)), then only
emit the transition/telemetry when the guarded update actually applied (i.e.,
when the update returned a successful result/affected row(s) rather than an
error or zero rows). Apply the same change to the second reconciliation block
covering the code around the other update/emit (the block referenced in the
review for lines 252-291).

---

Nitpick comments:
In `@docs/superpowers/specs/2026-05-18-capgo-builder-posthog-tracking-design.md`:
- Around line 110-136: The fenced ASCII diagram starting with "ONBOARDING:"
should include a language identifier to satisfy MD040; update the opening code
fence that wraps the diagram (the block showing ONBOARDING, BUILDS,
sendEventToTracking, trackPosthogEvent, etc.) to use ```text (or ```plaintext)
instead of just ```, so the block becomes a text/plaintext code fence.

In `@tests/builder-onboarding-telemetry.unit.test.ts`:
- Around line 24-143: Update each test declaration in this file to run
concurrently by replacing it(...) with it.concurrent(...); specifically modify
the tests with descriptions like "builds the expected payload and calls
sendEvent once", "includes error_category only when an error is provided", "uses
the Android mapper when platform is android", "skips when
CAPGO_DISABLE_TELEMETRY is set", "skips when CAPGO_DISABLE_POSTHOG is set",
"swallows errors thrown by sendEvent", "does not include duration_ms when
undefined", and "uses pre-computed errorCategory when provided (skipping the
mapper)" so they call it.concurrent and otherwise keep the same callbacks that
call trackBuilderOnboardingStep and assert on sendEventMock.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 86155c78-e235-4024-b26b-5c9006cfabab

📥 Commits

Reviewing files that changed from the base of the PR and between d9c7cb4 and c595d2c.

📒 Files selected for processing (14)
  • cli/src/build/onboarding/android/types.ts
  • cli/src/build/onboarding/android/ui/app.tsx
  • cli/src/build/onboarding/error-categories.ts
  • cli/src/build/onboarding/telemetry.ts
  • cli/src/build/onboarding/types.ts
  • cli/src/build/onboarding/ui/app.tsx
  • docs/superpowers/plans/2026-05-18-capgo-builder-posthog-tracking.md
  • docs/superpowers/specs/2026-05-18-capgo-builder-posthog-tracking-design.md
  • supabase/functions/_backend/public/build/request.ts
  • supabase/functions/_backend/triggers/cron_reconcile_build_status.ts
  • supabase/functions/_backend/utils/build_tracking.ts
  • tests/build-tracking-helpers.unit.test.ts
  • tests/builder-onboarding-telemetry.unit.test.ts
  • tests/onboarding-error-categories.unit.test.ts

Comment thread cli/src/build/onboarding/android/ui/app.tsx Outdated
Comment thread docs/superpowers/plans/2026-05-18-capgo-builder-posthog-tracking.md Outdated
Comment thread docs/superpowers/specs/2026-05-18-capgo-builder-posthog-tracking-design.md Outdated
Comment thread supabase/functions/_backend/public/build/request.ts Outdated
Comment thread supabase/functions/_backend/utils/build_tracking.ts Outdated
Comment thread tests/builder-onboarding-telemetry.unit.test.ts Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c595d2c6d7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread supabase/functions/_backend/triggers/cron_reconcile_build_status.ts Outdated
- android/ui/app.tsx: reorder telemetry imports next to other
  ../../ depth imports to satisfy perfectionist/sort-imports
- builder-onboarding-telemetry.unit.test.ts: move
  trackBuilderOnboardingStep import above the vi.mock setup to
  satisfy import/first (vi.mock is hoisted by Vitest)
- build_tracking.ts: timeoutApplied now overrides the
  previous===next no-change check so a stale-snapshot caller
  passing equal statuses with timeoutApplied=true still emits
  'timed_out'. Adds matching test case.
- docs/spec: drop stale references to the never-built
  /private/track_onboarding endpoint; clarify the CLI uses the
  existing /private/events route. Tag the ASCII architecture
  diagram with the 'text' language for MD040.
- docs/plan: replace the stale `new Error(error)` snippet in
  both wizard wirings with the shipped errorCategoryRef +
  errorCategory: ... pattern, and add a one-line note pointing
  to the fix.

Skipped (with reasons):
- request.ts try/catch around sendEventToTracking: redundant.
  sendEventToTracking already swallows per-provider errors via
  runTrackedCall and returns Promise.resolve(null) via
  backgroundTask in production. Adding a wrap would diverge
  from every other call site (on_app_create.ts, stripe_event.ts,
  etc.) that follows the same pattern.
- cron_reconcile_build_status.ts optimistic-concurrency guard
  on the build_requests update: pre-existing systemic concern.
  None of the cron's other state mutations (recordBuildTime,
  status updates) use such a guard either. Adding it just for
  telemetry would be partial and inconsistent; out of scope.
- Converting it() to it.concurrent() in
  builder-onboarding-telemetry.unit.test.ts: the tests mutate
  shared process.env via beforeEach/afterEach. Concurrent
  execution would race on the env vars and produce flakes.
The 1090-line task-by-task implementation plan lived under
docs/superpowers/plans/ during development. With the feature
shipped, it's archived in the user's Obsidian wiki at
projects/capgo/plans/2026-05-18-builder-posthog-tracking.md
along with the distilled architectural decisions and lessons.
Keeping it in the repo as a permanent artifact adds noise to
future code search results without ongoing value. The spec
remains in-tree.
Fills the observability gap between Build Requested (row inserted)
and Build Started (builder picks up job). The TUS upload of the
project tarball is fired-and-forgot from the existing onSuccess/
onError callbacks plus a started emission right before
tus.Upload.start().

Tags: app_id, platform, build_mode, job_id, upload_size_bytes,
upload_duration_seconds? (terminal), failure_category? (failed only).

Closed-enum failure_category: network_error, unauthorized,
payload_too_large, storage_failure, unknown — mapped via
structural typing on error.originalResponse?.getStatus?.() so
no hard import of tus.DetailedError.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/builder-upload-telemetry.unit.test.ts (1)

50-157: ⚡ Quick win

Use it.concurrent() consistently in this test file.

This suite still uses it(...) for multiple cases; convert them to it.concurrent(...) to match the repo rule for parallel test execution.

As per coding guidelines: “tests/**/*.test.ts: ... use it.concurrent() instead of it() to run tests in parallel within the same file for faster CI/CD”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/builder-upload-telemetry.unit.test.ts` around lines 50 - 157, Replace
the synchronous Jest tests with concurrent ones: change each it(...) to
it.concurrent(...) for the tests invoking trackBuilderUpload (e.g., the tests
with descriptions "emits Builder Upload Started with size but no duration or
failure_category", "emits Builder Upload Succeeded with duration and size",
"emits Builder Upload Failed with failure_category from a 413", "skips when
CAPGO_DISABLE_TELEMETRY is set", and "swallows errors thrown by sendEvent") so
they run in parallel; ensure you only update the test declarations and keep the
existing payload/assertions and mock setup (sendEventMock, process.env use)
unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/builder-upload-telemetry.unit.test.ts`:
- Around line 6-8: The test's mock target path doesn't match the actual import
used by the module-under-test: update the vi.mock call in
tests/builder-upload-telemetry.unit.test.ts to mock the exact module specifier
that trackBuilderUpload imports (the '../utils.js' specifier and relative path
used by cli/src/build/telemetry.ts) so that sendEvent is properly stubbed;
locate the vi.mock(...) that currently references '../cli/src/utils.ts' and
change it to the same '../utils.js' specifier (or otherwise mirror the
module-under-test's import path) so sendEventMock is injected.

---

Nitpick comments:
In `@tests/builder-upload-telemetry.unit.test.ts`:
- Around line 50-157: Replace the synchronous Jest tests with concurrent ones:
change each it(...) to it.concurrent(...) for the tests invoking
trackBuilderUpload (e.g., the tests with descriptions "emits Builder Upload
Started with size but no duration or failure_category", "emits Builder Upload
Succeeded with duration and size", "emits Builder Upload Failed with
failure_category from a 413", "skips when CAPGO_DISABLE_TELEMETRY is set", and
"swallows errors thrown by sendEvent") so they run in parallel; ensure you only
update the test declarations and keep the existing payload/assertions and mock
setup (sendEventMock, process.env use) unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 24f72ddf-b9ee-4b69-bcbc-f55332cee671

📥 Commits

Reviewing files that changed from the base of the PR and between f4a76b6 and 4ec5f01.

📒 Files selected for processing (4)
  • cli/src/build/request.ts
  • cli/src/build/telemetry.ts
  • docs/superpowers/specs/2026-05-18-capgo-builder-posthog-tracking-design.md
  • tests/builder-upload-telemetry.unit.test.ts
✅ Files skipped from review due to trivial changes (1)
  • docs/superpowers/specs/2026-05-18-capgo-builder-posthog-tracking-design.md

Comment thread tests/builder-upload-telemetry.unit.test.ts
Removes the CAPGO_DISABLE_TELEMETRY / CAPGO_DISABLE_POSTHOG checks
from the new trackBuilderOnboardingStep and trackBuilderUpload
helpers. The existing posthog.ts exception-capture path still honors
those vars; this PR does not extend their coverage.

Removed:
- `telemetryDisabled` / `isTruthyEnv` helpers (both files)
- `node:process` imports (no longer needed)
- The skip-when-env-set test cases (3 total)
- afterEach hooks that cleared the env vars

A unified opt-out at the sendEvent layer can be added in a separate
PR if desired.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tests/builder-upload-telemetry.unit.test.ts (1)

43-136: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use it.concurrent() for the trackBuilderUpload cases
tests/builder-upload-telemetry.unit.test.ts (lines 43-136) uses it(...) for the trackBuilderUpload tests (lines 43, 75, 99, 124); switch them to it.concurrent(...) to match the repo’s tests/**/*.test.ts parallel-execution requirement.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/builder-upload-telemetry.unit.test.ts` around lines 43 - 136, Change
the four test declarations that call trackBuilderUpload from it(...) to
it.concurrent(...): the tests that invoke trackBuilderUpload (the cases
referencing trackBuilderUpload and sendEventMock) should use it.concurrent to
enable parallel execution; keep the async keyword and existing assertions
(including the sendEventMock behavior) unchanged so only the test invocation is
updated to it.concurrent.
tests/builder-onboarding-telemetry.unit.test.ts (1)

16-110: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Switch to it.concurrent(), but make the tests concurrency-safe
In tests/builder-onboarding-telemetry.unit.test.ts (lines 16-110), the cases currently share a hoisted sendEventMock and reset it in beforeEach, and they assert toHaveBeenCalledTimes(1) / sendEventMock.mock.calls[0]. Converting these to it.concurrent() will introduce races and flaky assertions unless the mock handling and expectations are refactored to be parallel-safe (e.g., avoid per-test resets and select the correct call by matching the expected tags.step/platform instead of relying on call order).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/builder-onboarding-telemetry.unit.test.ts` around lines 16 - 110, Tests
currently assume global call order on sendEventMock (using
sendEventMock.mock.calls[0] and toHaveBeenCalledTimes(1]), which will race when
converted to it.concurrent(); update each test that references sendEventMock to
locate its own call by matching unique identifiers (e.g., the apikey argument
and payload.tags.step or payload.tags.platform) instead of indexing mock.calls,
and replace call-count assertions with existence checks (e.g., assert a matching
call was found) so tests are independent; keep sendEventMock as a shared mock
(set up once) but stop resetting/expecting global call counts in beforeEach so
concurrent tests only inspect their own matching call when calling
trackBuilderOnboardingStep.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/superpowers/specs/2026-05-18-capgo-builder-posthog-tracking-design.md`:
- Line 204: Update the spec to use the actual helper name used in code/tests:
replace the documented `trackOnboardingStep(input)` with
`trackBuilderOnboardingStep(input)` in the spec text; ensure the description
still matches the implementation in `cli/src/build/onboarding/telemetry.ts`
(best-effort fetch to `/private/events` with AbortController timeout 1500ms and
never throwing) so the spec and the function `trackBuilderOnboardingStep` are
aligned.

---

Outside diff comments:
In `@tests/builder-onboarding-telemetry.unit.test.ts`:
- Around line 16-110: Tests currently assume global call order on sendEventMock
(using sendEventMock.mock.calls[0] and toHaveBeenCalledTimes(1]), which will
race when converted to it.concurrent(); update each test that references
sendEventMock to locate its own call by matching unique identifiers (e.g., the
apikey argument and payload.tags.step or payload.tags.platform) instead of
indexing mock.calls, and replace call-count assertions with existence checks
(e.g., assert a matching call was found) so tests are independent; keep
sendEventMock as a shared mock (set up once) but stop resetting/expecting global
call counts in beforeEach so concurrent tests only inspect their own matching
call when calling trackBuilderOnboardingStep.

In `@tests/builder-upload-telemetry.unit.test.ts`:
- Around line 43-136: Change the four test declarations that call
trackBuilderUpload from it(...) to it.concurrent(...): the tests that invoke
trackBuilderUpload (the cases referencing trackBuilderUpload and sendEventMock)
should use it.concurrent to enable parallel execution; keep the async keyword
and existing assertions (including the sendEventMock behavior) unchanged so only
the test invocation is updated to it.concurrent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 65787f25-6e7d-49cc-9190-c922fc6da202

📥 Commits

Reviewing files that changed from the base of the PR and between 4ec5f01 and bf515aa.

📒 Files selected for processing (5)
  • cli/src/build/onboarding/telemetry.ts
  • cli/src/build/telemetry.ts
  • docs/superpowers/specs/2026-05-18-capgo-builder-posthog-tracking-design.md
  • tests/builder-onboarding-telemetry.unit.test.ts
  • tests/builder-upload-telemetry.unit.test.ts
💤 Files with no reviewable changes (2)
  • cli/src/build/onboarding/telemetry.ts
  • cli/src/build/telemetry.ts

…-posthog

# Conflicts:
#	cli/src/build/onboarding/android/ui/app.tsx
#	cli/src/build/onboarding/ui/app.tsx
Fixes lifecycle observability: the cron-only transition emission
missed every happy-path build because the cron's stale filter excludes
builds with fresh updated_at. Adds emitBuildTransitionEvent helper
called from cron + public/build/start.ts (Build Started) +
public/build/status.ts (Build Succeeded/Failed/Timed Out).

Adds AI Build Analysis tracking:
- Server: 'AI Build Analysis Requested' + 'AI Build Analysis Result'
  (closed-enum result: success | already_analyzed | invalid_state |
  unauthorized | builder_error | config_error). Privacy boundary: no
  AI diagnosis text in any tag.
- CLI: 'CLI AI Build Analysis Choice' (closed-enum choice:
  capgo_ai | local_ai | skip | auto_upload) + 'CLI AI Build Analysis
  Result' (mapped from PostAnalyzeResult.kind).
…ry/catch

Three sibling tracking sites were left bare while emitBuildTransitionEvent
gained a try/catch + cloudlogErr fallback during the lifecycle-events work:

- public/build/request.ts: 'Build Requested' direct emission
- public/build/ai_analyze.ts: emitAiAnalysisResult helper (wraps the
  'AI Build Analysis Result' send)
- public/build/ai_analyze.ts: 'AI Build Analysis Requested' direct emission

All three now match the established pattern: a try/catch around
sendEventToTracking that routes any orchestration-layer failure to
cloudlogErr without rethrowing. Successful tracking still proceeds
normally — the wrap is purely defensive against an unexpected throw
(e.g., backgroundTask unavailable in tests).

This closes a real consistency gap inside this PR — before this commit,
a partial test mock of utils.ts could make backgroundTask undefined and
turn a successful request handler into a 500.
Three lifecycle emission sites (cron, public/build/start.ts,
public/build/status.ts) previously updated rows without a
compare-and-set on `status`, then emitted unconditionally on
success. Two concurrent writers (cron + CLI poller, or two
dashboard pollers) could both win the unconditional update and
both emit the same terminal transition event, inflating PostHog
lifecycle counts.

Each site now appends `.eq('status', previousStatus).select('id')`
to the update chain and only emits when the affected-row set is
non-empty. The CAS loser silently skips emission — the winning
writer has already fired the event.

recordBuildTime in the cron stays unconditional on terminal status:
it's idempotent at the DB layer, and missing it on the CAS-lost
branch would let billing skip a build.
start.ts marks builds as 'failed' at two call sites — the builder
rejection branch (line 213) and the outer catch (line 340). Both
went through markBuildAsFailed, which previously only UPDATEd the
status without emitting the lifecycle event. The Build Failed
funnel was missing the entire "builder rejected my start request"
class of terminal transitions.

markBuildAsFailed now:
1. SELECTs the row to capture previousStatus + platform + build_mode
   + owner_org (fields the lifecycle payload needs)
2. UPDATEs with a CAS guard (.eq('status', previousStatus)) so a
   racing writer (cron, status poller) doesn't get double-emitted
3. Calls emitBuildTransitionEvent on a non-empty affected-row set

The CAS-lost branch silently skips emission — another writer
already advanced the row and emitted on its behalf.

Adds tests/build-start-log-token.test.ts case 'emits Build Failed
when the builder rejects the start request' that mocks fetch to
return 500 and asserts the lifecycle event fires with the right tags.
…gId resolves

Two related telemetry gaps in the iOS + Android wizards:

1. The very first step ('welcome' or resumed step) never reached
   PostHog because `stepTimingRef` was initialized with the current
   step value, making `previous.step === step` true on first render
   and tripping the duplicate-skip guard. Initialize with `null`
   instead, and treat `previous.step === null` as the initial-step
   sentinel.

2. The async org-id resolution chain (createSupabaseClient +
   getOrganizationId — two HTTP round-trips) can take 1-3 seconds
   on slow networks. Any step transitions during that window were
   silently dropped because the effect early-returned on missing
   `resolvedOrgId`. Now buffer those events in `pendingTelemetryRef`
   and drain them in order when the org id lands.

Drain ordering matters: the same effect re-fires when
`resolvedOrgId` transitions from null to a real value (even if
`step` is unchanged), so the drain runs BEFORE the duplicate-skip
guard. Otherwise the backlog would never flush when the user paused
on a step while org id resolved.

Closes the [P3] "Onboarding step telemetry misses the initial step
and can drop early transitions while resolvedOrgId is still loading"
finding.
`vue-tsc --noEmit` (run by `bun typecheck` in CI) fails on the CLI
source's imports of `@capacitor/cli/dist/config` and
`@capacitor/cli/dist/util/monorepotools` — those subpaths don't
ship .d.ts files.

The CLI's own tsc (cli/tsconfig.json) handles these fine, but the
root tsconfig pulls in CLI source as a side-effect of root-level
test files importing from `cli/src/...`. The exclude in the root
tsconfig only prevents auto-inclusion; transitively-imported files
are still processed.

Pre-existing on main but invisible there because tests.yml only
runs on pull_request (not push to main). My new test files
(builder-onboarding-telemetry, builder-upload-telemetry,
ai-analysis-telemetry) added more import chains that hit the same
two subpaths, surfacing the failure on PR CI.

Adding a one-file declaration shim. Removes the TS7016 errors
without touching the import sites or cli/tsconfig.json.
…sts from root tsconfig

The shim was masking a structural smell: vue-tsc was reaching into
cli/src/* via test files that import internal helpers (not the
@capgo/cli/sdk public surface). The root tsconfig already excludes
cli/, but TS `exclude` doesn't stop transitive import processing
from files that ARE in the program.

Cleaner: exclude the four tests whose imports drag CLI internals
into vue-tsc's program. Vitest still runs them via its own
esbuild/swc transform — these are unit tests of pure functions,
the imported module surfaces aren't large, and runtime test
failures will catch any drift.

Tests excluded (all added in this PR):
- tests/ai-analysis-telemetry.unit.test.ts
- tests/builder-onboarding-telemetry.unit.test.ts
- tests/builder-upload-telemetry.unit.test.ts
- tests/onboarding-error-categories.unit.test.ts

Matches the existing precedent of `tests/device_comparison.test.ts`
already being excluded.
@sonarqubecloud
Copy link
Copy Markdown

@WcaleNieWolny WcaleNieWolny merged commit 9523f13 into main May 21, 2026
57 of 58 checks passed
@WcaleNieWolny WcaleNieWolny deleted the feat/builder-tracking-posthog branch May 21, 2026 12:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant