feat: forward network_events to Daydream /v1/metrics via reliable client reporter#1041
feat: forward network_events to Daydream /v1/metrics via reliable client reporter#1041seanhanca wants to merge 1 commit into
Conversation
…ent reporter Add a MetricsReporter that buffers stream_trace lifecycle events and forwards them in batches to the Daydream POST /v1/metrics endpoint. Implements the fail-loud / no-loss client contract: exponential backoff on 5xx/network errors, pause on 401, rate-limit awareness on 429, and optional disk-backed resume buffer for crash-safe delivery. In local mode the reporter is fed in-process via an event-sink hook in kafka_publisher.py. In cloud/livepeer mode the runner writes network_events onto the trickle events channel and LivepeerClient routes them into the reporter on the client side. Existing direct-to-Kafka publishing is kept intact (augment, not replace). Signed-off-by: qianghan <qiang@livepeer.org> Co-authored-by: Cursor <cursoragent@cursor.com>
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🚀 fal.ai Preview Deployment
Testing on Cloud |
|
Superseded by adding MetricsReporter directly to PR #1040 (feat/telemetry-over-trickle) |
Summary
MetricsReporterclient that reliably forwardsstream_tracelifecycle events (the samenetwork_eventsenvelope) to Daydream'sPOST /v1/metricsendpoint, implementing the fail-loud / no-loss client contract (batched HTTP POST, exponential backoff on 5xx, pause on 401, rate-limit awareness on 429, optional disk-backed crash-safe buffer).kafka_publisher.py. In cloud/livepeer mode, the runner writesnetwork_eventmessages onto the trickle events channel andLivepeerClientroutes them into the reporter on the Scope client side.httpx(already a dep).Architecture
Files changed
src/scope/server/metrics_reporter.pyMetricsReporterwith thread-safe enqueue, async flush loop, fail-loud response handling, disk buffer, global accessorssrc/scope/server/kafka_publisher.pyset_event_sink()+_dispatch_to_sink()called after envelope buildsrc/scope/cloud/livepeer_app.py_subscribe_controlto forward envelopes over events channelsrc/scope/server/livepeer_client.pynetwork_eventmessages from events loop intoreport_event()src/scope/server/app.pyMetricsReporterstart/stop into lifespan; register in-process event sinktests/test_metrics_reporter.pyConfig (env vars)
DAYDREAM_METRICS_URL— endpoint (default:https://api.daydream.monster/v1/metrics)SCOPE_CLOUD_API_KEY— Bearer token (required to enable; already used for cloud connect)SCOPE_METRICS_ENABLED— explicit enable/disable (default: on when key present)SCOPE_METRICS_FLUSH_INTERVAL_MS,SCOPE_METRICS_MAX_BATCH,SCOPE_METRICS_MAX_BUFFER,SCOPE_METRICS_BUFFER_PATHTest plan
uv run ruff check src/— all checks passeduv run ruff format --check src/— all formatteduv run pytest tests/test_metrics_reporter.py -v— 33/33 passedfrom scope.server.app import app)SCOPE_CLOUD_API_KEY, start Scope, verify events appear in Kafkanetwork_eventstopicMade with Cursor