feat(factory): add service state tracking, AwaitHealthy, depends_on, and /healthz #10671

Open
therealpandey wants to merge 15 commits into main from feat/service-state-healthz

therealpandey commented Mar 22, 2026

Summary

  • Add explicit lifecycle state tracking (starting/running/failed) to factory.Registry services, modeled after Guava's ServiceManager
  • Add ServiceWithHealthy interface embedding Service + Healthy — services signal readiness via a channel, eliminating the need for unwrapService
  • Add AwaitHealthy(ctx) method that blocks until all services reach running state
  • Add dependsOn parameter to NewNamedService for declaring service dependencies with validation (unknown refs logged and dropped, self-deps ignored, cycles detected via gonum's Tarjan SCC and returned as errors)
  • Add factory.Handler interface with Healthz, Readyz, Livez methods, implemented via NewHandler(registry)
  • Wire /api/v2/healthz, /api/v2/readyz, /api/v2/livez endpoints through signozapiserver returning 200/503 with per-service state
  • Implement ServiceWithHealthy for authz service (pkg + ee) — healthy after OpenFGA store/model setup completes
  • Implement ServiceWithHealthy for user service — healthy after root user reconciliation succeeds
  • User service declares dependency on authz via dependsOn
  • Integration tests use /api/v2/healthz for readiness checks with error-level logging on failures

Test plan

  • go test -race ./pkg/factory/... — 16 tests covering state transitions, AwaitHealthy, dependency ordering, failure propagation, self-dep/cycle detection, unknown dep handling
  • go build ./... — full project builds clean
  • go-lint — 0 issues
  • Manual: make go-run-community, then GET /api/v2/healthz returns 200 with all services running
  • Integration tests: test_setup asserts /api/v2/healthz returns 200

…and /healthz endpoint

Add explicit lifecycle state tracking to factory.Registry services
(starting/running/failed) modeled after Guava's ServiceManager. Services
can declare dependencies via NewNamedService(..., dependsOn) which are
validated for unknown refs and cycles at registry creation. AwaitHealthy
blocks until all services reach running state. A /healthz endpoint is
wired through signozapiserver returning 200/503 with per-service state.
@github-actions github-actions bot added the enhancement New feature or request label Mar 22, 2026
…les, fix test assertions

Replace custom DFS cycle detection with gonum's topo.Sort + TarjanSCC.
Dependency cycles now return an error from NewRegistry instead of being
silently dropped. Use assert for final test assertions and require only
for intermediate setup errors.
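
As a rough illustration of the validation rules (self-dependencies ignored, unknown references dropped, cycles returned as errors), the sketch below uses a hand-written Tarjan SCC over service names so that it runs with the standard library alone. It is a stand-in for the gonum-based check the commit describes, not a copy of it; `validateDeps` and `tarjanSCC` are hypothetical names, and the logging of dropped references is omitted.

```go
package main

import (
	"fmt"
	"sort"
)

// validateDeps cleans a name -> dependencies map: self-dependencies and
// references to unregistered names are dropped; a cycle yields an error.
func validateDeps(deps map[string][]string) (map[string][]string, error) {
	clean := make(map[string][]string, len(deps))
	for name, ds := range deps {
		clean[name] = nil
		for _, d := range ds {
			if d == name {
				continue // self-dependency: ignored
			}
			if _, known := deps[d]; !known {
				continue // unknown reference: dropped
			}
			clean[name] = append(clean[name], d)
		}
	}
	// Self-loops are already gone, so any component with more than one
	// member is a dependency cycle.
	for _, scc := range tarjanSCC(clean) {
		if len(scc) > 1 {
			sort.Strings(scc)
			return nil, fmt.Errorf("dependency cycle among %v", scc)
		}
	}
	return clean, nil
}

// tarjanSCC returns the strongly connected components of g.
func tarjanSCC(g map[string][]string) [][]string {
	next := 0
	index := map[string]int{}
	low := map[string]int{}
	onStack := map[string]bool{}
	var stack []string
	var out [][]string

	var visit func(v string)
	visit = func(v string) {
		index[v], low[v] = next, next
		next++
		stack = append(stack, v)
		onStack[v] = true
		for _, w := range g[v] {
			if _, seen := index[w]; !seen {
				visit(w)
				if low[w] < low[v] {
					low[v] = low[w]
				}
			} else if onStack[w] && index[w] < low[v] {
				low[v] = index[w]
			}
		}
		if low[v] == index[v] { // v is the root of a component
			var scc []string
			for {
				w := stack[len(stack)-1]
				stack = stack[:len(stack)-1]
				onStack[w] = false
				scc = append(scc, w)
				if w == v {
					break
				}
			}
			out = append(out, scc)
		}
	}
	for v := range g {
		if _, seen := index[v]; !seen {
			visit(v)
		}
	}
	return out
}

func main() {
	clean, err := validateDeps(map[string][]string{
		"authz": {"authz", "ghost"}, // self-dep and unknown ref are removed
		"user":  {"authz"},
	})
	fmt.Println(clean, err)

	_, err = validateDeps(map[string][]string{"a": {"b"}, "b": {"a"}})
	fmt.Println(err)
}
```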
…ers struct

Move Handler implementation to a private handler struct with NewHandler
constructor instead of methods on *Registry. Route handler through the
existing Handlers struct as RegistryHandler. Rename healthz.go to
registry.go in signozapiserver. Fix handler_test.go for new param.
…, user depends on authz

Add ServiceWithHealthy interface embedding Service + Healthy. NamedService
now delegates Healthy() to the underlying service, eliminating unwrapService.
AuthZ interface requires Healthy(), implemented in both pkg and ee providers.
User service declares dependency on authz via dependsOn.
@therealpandey therealpandey requested a review from a team as a code owner March 22, 2026 14:46
@therealpandey therealpandey added the safe-to-integrate Run integration tests label Mar 22, 2026
@therealpandey therealpandey added safe-to-integrate Run integration tests and removed safe-to-integrate Run integration tests labels Mar 22, 2026
User service signals healthy after successful root user reconciliation
or immediately when disabled. User Service interface now embeds
factory.ServiceWithHealthy.
@therealpandey therealpandey added safe-to-integrate Run integration tests and removed safe-to-integrate Run integration tests labels Mar 22, 2026
Labels

enhancement New feature or request safe-to-integrate Run integration tests
