feat: outbound webhooks on session state transitions #36

Open
mirchaemanuel wants to merge 24 commits into illegalstudio:main from mirchaemanuel:feature/outbound-webhooks

mirchaemanuel (Contributor) commented May 19, 2026

Summary

Adds outbound HTTP webhooks that fire on session activity state transitions. Users configure one or more endpoints in ~/.config/lazyagent/config.json, each with optional filters on event type and agent. Delivery is asynchronous and best-effort, with optional HMAC-SHA256 signing (GitHub-style signature header).

  • New internal/core.EventBus — first typed pub-sub in the project — published from ActivityTracker.Update whenever a session's activity changes
  • New internal/webhook/ package: a Dispatcher with fan-out and a worker pool (4 workers); retries on 5xx responses and network errors with backoff [1s, 5s, 30s] (no retry on 4xx); a 2 s dedup window for duplicate transitions emitted by multiple in-process managers (TUI + API + GUI); and an optional X-Lazyagent-Signature: sha256=<hex> header
  • Bus wired into TUI, API, and GUI tray. Tray process is forked (cross-process) so it spawns its own bus + dispatcher
  • Zero behavior change when webhooks is empty / absent
  • User-facing docs in docs/reference/webhooks.md with payload schema, headers, and Python HMAC verification snippet

Spec: docs/superpowers/specs/2026-05-19-outbound-webhooks-design.md
Plan: docs/superpowers/plans/2026-05-19-outbound-webhooks.md
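For receivers written in Go, the X-Lazyagent-Signature check described in the summary might look roughly like the sketch below. It assumes the MAC is computed over the raw request body (as GitHub webhooks do), which this PR description does not state explicitly; the function names `sign` and `verify` are hypothetical and not part of lazyagent.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// signaturePrefix matches the "sha256=<hex>" format quoted in the summary.
const signaturePrefix = "sha256="

// sign returns the header value for body under secret.
// Assumption: the MAC covers the raw request body bytes.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return signaturePrefix + hex.EncodeToString(mac.Sum(nil))
}

// verify reports whether header is a valid signature for body.
// hmac.Equal compares in constant time, so a mismatch does not leak
// how many leading bytes were correct.
func verify(secret, body []byte, header string) bool {
	return hmac.Equal([]byte(sign(secret, body)), []byte(header))
}

func main() {
	secret := []byte("shared-with-receiver")
	body := []byte(`{"event":"state_transition","to":"waiting"}`)
	header := sign(secret, body)

	fmt.Println(verify(secret, body, header))               // true
	fmt.Println(verify(secret, []byte("tampered"), header)) // false
}
```

A receiver would read the request body once, call `verify` with the header value, and reject the request on `false` before parsing the JSON.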

Payload

{
  "id": "f47ac10b-...",
  "event": "state_transition",
  "session_id": "abc",
  "agent": "claude",
  "from": "idle",
  "to": "waiting",
  "project_path": "/Users/foo/code/bar",
  "timestamp": "2026-05-19T14:30:00Z",
  "api": {
    "session_url": "http://127.0.0.1:7421/api/sessions/abc",
    "detail_url": "http://127.0.0.1:7421/api/sessions/abc/full"
  }
}

api.* is included only when the API server is running (uses the resolved srv.Addr()).
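A consumer could decode this payload with a struct along the following lines. This is a sketch based only on the sample above: the type and function names are hypothetical, `api` is modelled as an optional pointer because it is not always present, and `detail_url` is left out because a later commit in this PR removes it (unknown JSON fields are ignored by `encoding/json`).

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// APILinks mirrors the optional "api" object in the sample payload.
type APILinks struct {
	SessionURL string `json:"session_url"`
}

// Payload mirrors the sample webhook body shown in this PR.
type Payload struct {
	ID          string    `json:"id"`
	Event       string    `json:"event"`
	SessionID   string    `json:"session_id"`
	Agent       string    `json:"agent"`
	From        string    `json:"from"`
	To          string    `json:"to"`
	ProjectPath string    `json:"project_path"`
	Timestamp   time.Time `json:"timestamp"` // RFC 3339, e.g. 2026-05-19T14:30:00Z
	API         *APILinks `json:"api,omitempty"` // nil when "api" is absent
}

// decodePayload parses a webhook request body.
func decodePayload(data []byte) (Payload, error) {
	var p Payload
	err := json.Unmarshal(data, &p)
	return p, err
}

func main() {
	body := []byte(`{"event":"state_transition","session_id":"abc","agent":"claude",
		"from":"idle","to":"waiting","timestamp":"2026-05-19T14:30:00Z"}`)
	p, err := decodePayload(body)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(p.Agent, p.From, "->", p.To, "api present:", p.API != nil)
}
```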

Config

{
  "webhooks": [
    {
      "name": "slack-needs-input",
      "url": "https://hooks.slack.com/services/T00/B00/XXX",
      "secret": "shared-with-receiver",
      "events": ["waiting"],
      "agents": ["claude", "codex"]
    }
  ]
}

events and agents empty (or absent) mean "match everything". secret is optional — when set, requests carry an HMAC-SHA256 signature.
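The "empty means match everything" rule can be sketched as below. The `Filter` type and `Matches` method are hypothetical names, and the sketch assumes that `events` is compared against the transition's target state (the payload's `to` field), which the sample config suggests but the PR description does not spell out.

```go
package main

import "fmt"

// Filter holds the optional "events" and "agents" lists of one webhook entry.
type Filter struct {
	Events []string
	Agents []string
}

// matchAny treats an empty or nil list as a wildcard.
func matchAny(allowed []string, value string) bool {
	if len(allowed) == 0 {
		return true
	}
	for _, a := range allowed {
		if a == value {
			return true
		}
	}
	return false
}

// Matches reports whether a transition into state `to` for `agent`
// should be delivered to this webhook. Both filters must pass.
func (f Filter) Matches(to, agent string) bool {
	return matchAny(f.Events, to) && matchAny(f.Agents, agent)
}

func main() {
	slack := Filter{Events: []string{"waiting"}, Agents: []string{"claude", "codex"}}
	fmt.Println(slack.Matches("waiting", "claude")) // true
	fmt.Println(slack.Matches("idle", "claude"))    // false
	fmt.Println(Filter{}.Matches("idle", "gemini")) // true
}
```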

Design choices

  • Best-effort, in-memory only — no persistence, no DLQ. Lazyagent stays read-only on disk
  • Dedup at the dispatcher instead of the bus — when TUI + API + GUI run together, each manager publishes the same transition; a 2 s (session_id, from, to) window coalesces them. Stale entries are evicted opportunistically (TTL 5 min, guarded by a 64-entry threshold) so the dedup map can't grow unbounded
  • One dispatcher per process — the tray forks, so its dispatcher is independent of the main process's. Documented in webhooks.md
  • No SSE refactor in this PR — the existing /api/events SSE pulse is untouched. A future PR could migrate it onto the new typed bus

Test plan

  • go test ./... -race passes (16 new tests across internal/core and internal/webhook)
  • go build ./... and go build -tags notray ./... succeed
  • Unit tests cover: bus publish/subscribe/drop/unsubscribe/concurrent, tracker transition emission, webhook config validation, payload marshal, filter matching (8 cases), HMAC test vector, dispatcher happy path / retry / no-retry-on-4xx / all-fail / HMAC header / dedup / graceful shutdown / lastSeen eviction
  • Manual end-to-end smoke against a real webhook receiver (not yet performed by the author — would appreciate maintainer-side verification)

Non-goals (call-outs for reviewer)

  • No CLI subcommand for managing webhooks — users edit config.json directly, consistent with the rest of the project
  • No content-based or CWD-based filters in MVP. Event + agent filters cover the 90% use case
  • Roadmap entry uses a placeholder version (v0.10); feel free to renumber at merge

Specs the first lazyagent feature with internal pub-sub: a typed EventBus
in internal/core/ that emits session activity state transitions, plus a
new internal/webhook/ dispatcher that delivers them as HTTP POSTs with
optional HMAC-SHA256 signing, event/agent filters, and async best-effort
delivery.

Seventeen-task TDD plan covering the new internal/core EventBus, the
ActivityTracker transition emission change, the WebhookConfig type
and validation, the internal/webhook package (payload, filter, HMAC
signing, dispatcher with retry/dedup/shutdown), wiring across TUI,
API, GUI and main, plus user-facing docs.

Introduces SessionEvent struct and EventBus with Subscribe/Unsubscribe/Publish.
Publish is non-blocking: full subscriber channels silently drop events.
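The non-blocking Publish described in this commit is commonly written with a `select` that has a `default` branch. The sketch below is an illustration under assumptions, not the PR's code: the field set of `SessionEvent`, the `Subscribe(buffer int)` signature, and the dropped-count return value (added here only so the drop is observable) are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// SessionEvent is a reduced, hypothetical version of the event type.
type SessionEvent struct {
	SessionID, Agent, From, To string
}

// EventBus fans events out to subscriber channels.
type EventBus struct {
	mu   sync.RWMutex
	subs map[chan SessionEvent]struct{}
}

func NewEventBus() *EventBus {
	return &EventBus{subs: make(map[chan SessionEvent]struct{})}
}

// Subscribe registers a buffered channel and returns it.
func (b *EventBus) Subscribe(buffer int) chan SessionEvent {
	ch := make(chan SessionEvent, buffer)
	b.mu.Lock()
	b.subs[ch] = struct{}{}
	b.mu.Unlock()
	return ch
}

// Unsubscribe removes and closes the channel. Taking the write lock
// guarantees no Publish is sending on it while it is closed.
func (b *EventBus) Unsubscribe(ch chan SessionEvent) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if _, ok := b.subs[ch]; ok {
		delete(b.subs, ch)
		close(ch)
	}
}

// Publish never blocks: if a subscriber's buffer is full, the event is
// dropped for that subscriber. It returns how many drops occurred.
func (b *EventBus) Publish(ev SessionEvent) (dropped int) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	for ch := range b.subs {
		select {
		case ch <- ev:
		default:
			dropped++
		}
	}
	return dropped
}

func main() {
	bus := NewEventBus()
	ch := bus.Subscribe(1)
	fmt.Println(bus.Publish(SessionEvent{SessionID: "abc", From: "idle", To: "waiting"})) // 0
	fmt.Println(bus.Publish(SessionEvent{SessionID: "abc", From: "waiting", To: "idle"})) // 1 (buffer full)
	fmt.Println((<-ch).To)                                                                 // waiting
	bus.Unsubscribe(ch)
}
```

The trade-off is that a slow subscriber loses events instead of stalling the publisher, which fits the best-effort delivery model stated in the design choices.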

ActivityTracker now accepts an optional EventBus via SetEventBus; when
attached, Update publishes a SessionEvent whenever a session's resolved
activity changes (including the initial Unknown→X transition on first
observation). Nil bus is safe; existing callers are unaffected.

Also anchors the WaitingGrace timer to s.LastActivity instead of the
current poll time, so sessions already past the grace window on first
observation are promoted to ActivityWaiting immediately.

The waitingSince anchor was changed to s.LastActivity in commit 4204186
as an unintended side-effect of the webhook feature branch. This reverts
it to always anchor on `now` (the original behavior), keeping the grace
period logic correct for TUI/GUI debouncing.

The transition test is updated to avoid the waiting grace period entirely:
it now exercises the Thinking→Running path via StatusExecutingTool+Bash,
which is deterministic and does not depend on clock offsets.

Implements Dispatcher that subscribes to core.EventBus, fans out
SessionEvents to matching WebhookConfigs, and delivers them via a
worker pool with proper headers (Content-Type, User-Agent,
X-Lazyagent-Event, X-Lazyagent-Delivery, X-Lazyagent-Signature).

Implement retry loop in Dispatcher.deliver using the existing backoffs
slice (default 1s/5s/30s). 5xx responses and network errors trigger
retries up to len(backoffs) times; 4xx responses are treated as
permanent and abort immediately. Extracted doOnce helper for testability.
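The retry policy in this commit message can be sketched without any HTTP plumbing by abstracting a single delivery attempt behind a function, loosely analogous to the doOnce helper mentioned above. Everything here is hypothetical naming (`deliverWithRetry`, `attemptFunc`, `noSleep`); the handling of unexpected statuses such as 3xx (retried like 5xx) is an assumption of this sketch, since the PR only specifies 5xx, network errors, and 4xx.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// defaultBackoffs mirrors the schedule quoted in the PR: 1s, 5s, 30s.
var defaultBackoffs = []time.Duration{time.Second, 5 * time.Second, 30 * time.Second}

var (
	errPermanent = errors.New("permanent failure (4xx)")
	errExhausted = errors.New("retries exhausted")
)

// attemptFunc performs one delivery. It returns the HTTP status code,
// or a non-nil error for network-level failures.
type attemptFunc func() (status int, err error)

// noSleep skips waiting; useful in tests and in the demo below.
func noSleep(time.Duration) {}

// deliverWithRetry makes one initial attempt plus up to len(backoffs)
// retries. 2xx succeeds, 4xx aborts immediately, anything else is retried.
func deliverWithRetry(backoffs []time.Duration, sleep func(time.Duration), attempt attemptFunc) (attempts int, err error) {
	for i := 0; ; i++ {
		status, aerr := attempt()
		attempts++
		switch {
		case aerr == nil && status >= 200 && status < 300:
			return attempts, nil
		case aerr == nil && status >= 400 && status < 500:
			return attempts, fmt.Errorf("%w: status %d", errPermanent, status)
		}
		if i >= len(backoffs) {
			return attempts, errExhausted
		}
		sleep(backoffs[i]) // wait before the next retry
	}
}

func main() {
	statuses := []int{500, 503, 200} // simulated receiver: two failures, then success
	i := 0
	attempts, err := deliverWithRetry(defaultBackoffs, noSleep, func() (int, error) {
		s := statuses[i]
		i++
		return s, nil
	})
	fmt.Println(attempts, err) // 3 <nil>
}
```

In a real dispatcher the `sleep` argument would be `time.Sleep` or a context-aware wait, so that shutdown can interrupt a 30 s backoff.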

Multiple in-process SessionManagers (TUI + API + GUI tray) can publish
the same transition. Dispatcher now coalesces duplicates within a 2s
window using a per-session lastSeen map guarded by a mutex.

Change NewModel signature to accept a *core.EventBus (nil-safe) and wire
it to the SessionManager via SetEventBus when non-nil. Update main.go
call site to pass nil; real bus will be wired in T16.

Add webhooks.md with full field reference, payload schema, request
headers, HMAC verification example, delivery semantics, and
troubleshooting. Update configuration.md with a webhooks field section,
roadmap.md with the v0.10 entry (removing the ⬜ placeholder), and
README.md with a feature bullet in the News section.

Replace version.String() (which includes the product name) with version.Version
in the User-Agent header to avoid the malformed "lazyagent/lazyagent v..." value.
Add dedupTTL (5m) to Dispatcher and opportunistic eviction in shouldDedup so the
lastSeen map doesn't grow unbounded; add TestDispatcher_LastSeenEvictsOldEntries
to verify eviction fires when the map exceeds 64 entries.
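The dedup window and the opportunistic eviction could fit together as in the sketch below. It uses the constants quoted in this PR (2 s window, 5 min TTL, 64-entry threshold) and the (session_id, from, to) key from the design notes; the type layout, the explicit `now` parameter, and the choice not to refresh the timestamp on a duplicate are assumptions of this sketch, not confirmed details of the PR's shouldDedup.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// dedupKey identifies one transition of one session.
type dedupKey struct{ SessionID, From, To string }

type deduper struct {
	mu         sync.Mutex
	lastSeen   map[dedupKey]time.Time
	window     time.Duration // duplicates inside this window are suppressed
	ttl        time.Duration // entries older than this may be evicted
	evictAbove int           // eviction only runs above this map size
}

func newDeduper() *deduper {
	return &deduper{
		lastSeen:   make(map[dedupKey]time.Time),
		window:     2 * time.Second,
		ttl:        5 * time.Minute,
		evictAbove: 64,
	}
}

// shouldDedup reports whether k was already seen within the window.
// `now` is passed in so the logic is testable without a real clock.
func (d *deduper) shouldDedup(k dedupKey, now time.Time) bool {
	d.mu.Lock()
	defer d.mu.Unlock()

	// Opportunistic eviction: only scan when the map has grown large.
	if len(d.lastSeen) > d.evictAbove {
		for key, seen := range d.lastSeen {
			if now.Sub(seen) > d.ttl {
				delete(d.lastSeen, key)
			}
		}
	}

	if seen, ok := d.lastSeen[k]; ok && now.Sub(seen) < d.window {
		return true // duplicate; keep the original timestamp
	}
	d.lastSeen[k] = now
	return false
}

// t0 is an arbitrary fixed clock reading for the demo.
var t0 = time.Date(2026, 5, 19, 14, 30, 0, 0, time.UTC)

func main() {
	d := newDeduper()
	k := dedupKey{SessionID: "abc", From: "idle", To: "waiting"}
	fmt.Println(d.shouldDedup(k, t0))                     // false (first sighting)
	fmt.Println(d.shouldDedup(k, t0.Add(time.Second)))    // true  (inside 2 s window)
	fmt.Println(d.shouldDedup(k, t0.Add(3*time.Second))) // false (window elapsed)
}
```

Guarding the scan with a size threshold keeps the common path O(1) while still bounding memory for long-running processes.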
- Skip transition emission on first observation (was flooding consumers
  with synthetic events for every session present at startup)
- Disable parent webhook dispatcher when --gui is set (the tray child
  already runs one; both running causes cross-process duplicates)
- Remove non-existent /full detail URL from payload — only
  /api/sessions/{id} exists in the API
- Reject webhook URLs without a host (was accepted, then failed at
  delivery with noisy retry logs)
- Tray dispatcher now respects ServiceShutdown ctx instead of running
  detached
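The "reject URLs without a host" fix in the list above might be implemented with a check like the following. This is a sketch: `validateWebhookURL` is a hypothetical name, and restricting the scheme to http/https is an assumption added here, not something the commit message states.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// validateWebhookURL rejects values that cannot be used as a delivery
// target, so misconfiguration surfaces at load time instead of as
// repeated delivery failures.
func validateWebhookURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("invalid webhook URL: %w", err)
	}
	// Assumption: only http and https endpoints are meaningful here.
	if u.Scheme != "http" && u.Scheme != "https" {
		return errors.New("webhook URL must use http or https")
	}
	// Hostname() strips the port, so "http://:8080/x" is also rejected.
	if u.Hostname() == "" {
		return errors.New("webhook URL must include a host")
	}
	return nil
}

func main() {
	for _, raw := range []string{
		"https://hooks.slack.com/services/T00/B00/XXX",
		"https:///services/T00",
		"hooks.slack.com/services/T00",
	} {
		fmt.Printf("%-48q ok=%v\n", raw, validateWebhookURL(raw) == nil)
	}
}
```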
…ation

- Normalize wildcard bind addresses (0.0.0.0, ::, empty host) to
  127.0.0.1 when populating api.session_url, so the URL is actually
  followable by the consumer instead of being a raw bind address.
- Document that the api.* payload field is only populated when the
  dispatcher and API server run in the same process (i.e. not in --gui
  modes where the tray owns webhooks and the parent owns the API).
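The wildcard normalization described in the first bullet could be sketched as follows. The function name `sessionURL` and its signature are hypothetical; the sketch assumes the bind address is in host:port form and uses the standard library's notion of an unspecified address to cover 0.0.0.0 and ::.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// sessionURL builds a followable URL for a session from the API server's
// bind address. Wildcard hosts (empty, 0.0.0.0, ::) are replaced with
// 127.0.0.1, since a raw bind address is not a usable destination.
func sessionURL(bindAddr, sessionID string) (string, error) {
	host, port, err := net.SplitHostPort(bindAddr)
	if err != nil {
		return "", fmt.Errorf("parse bind address %q: %w", bindAddr, err)
	}
	if ip := net.ParseIP(host); host == "" || (ip != nil && ip.IsUnspecified()) {
		host = "127.0.0.1"
	}
	// JoinHostPort re-adds brackets for IPv6 literals such as ::1.
	return "http://" + net.JoinHostPort(host, port) + "/api/sessions/" + url.PathEscape(sessionID), nil
}

func main() {
	for _, addr := range []string{":7421", "0.0.0.0:7421", "[::]:7421", "[::1]:7421"} {
		u, err := sessionURL(addr, "abc")
		fmt.Println(u, err)
	}
	// http://127.0.0.1:7421/api/sessions/abc <nil>
	// http://127.0.0.1:7421/api/sessions/abc <nil>
	// http://127.0.0.1:7421/api/sessions/abc <nil>
	// http://[::1]:7421/api/sessions/abc <nil>
}
```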