
fix(streaming): stop raw-source bleed — playhead staleness, landing-lead feedback, slice flow control #256

Draft
leszko wants to merge 1 commit into main from rafal/fix/raw-audio-bleed



leszko (Collaborator) commented Jun 12, 2026

Problem

"Sometimes I hear the input audio instead of the processed audio — like the playhead is ahead of the processed stream." Intermittent, random places, much worse on remote pods.

The client's loop buffer always holds the original source track; denoised slices patch over it just ahead of the playhead. So any slice landing behind the playhead means the listener hears raw input. Three independent mechanisms caused exactly that:

  1. Stale playhead anchors — the runner's playhead clock trusted every arriving playback_pos as fresh. Under congestion, reports queue; each re-anchor lands further in the past and the estimate runs away (traced: lead -5 s -> -10 s, growing ~1 s/s).
  2. Client slice-apply backlog — ~20 ms main-thread cost per slice (per-patch mirror listener fan-out + per-slice LUFS DSP). A busy or background-throttled tab applies slower than the server produces (~38/s) -> unbounded receive backlog (traced: -17 s, ~660 slices queued).
  3. Bandwidth deficit — the slice stream is ~1-2.5 MB/s of overlapping windows; a slower link (SSH/IDE tunnel) buffers tens of seconds where the server can't see it (traced: 50 s-stale reports while server-side queues read empty).
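As a rough consistency check on the second trace, the queue depth and production rate quoted above can be related directly. The sketch below uses only figures from the list (~38 slices/s, ~20 ms per apply, ~660 queued). The helper names are hypothetical, and the 40 ms throttled cost is an illustrative assumption, not a measurement.

```python
PRODUCE_RATE_HZ = 38.0  # server slice production rate quoted above (~38/s)


def backlog_age_s(queued_slices: int, produce_rate_hz: float = PRODUCE_RATE_HZ) -> float:
    """Approximate age of the oldest queued slice, assuming steady production."""
    return queued_slices / produce_rate_hz


def backlog_growth_hz(apply_cost_s: float, produce_rate_hz: float = PRODUCE_RATE_HZ) -> float:
    """Slices added to the receive queue per second when apply is the bottleneck."""
    apply_rate_hz = 1.0 / apply_cost_s
    return max(0.0, produce_rate_hz - apply_rate_hz)


print(round(backlog_age_s(660), 1))        # 17.4, in line with the traced -17 s
print(round(backlog_growth_hz(0.020), 1))  # 0.0: an unthrottled tab keeps up
print(round(backlog_growth_hz(0.040), 1))  # 13.0: assumed throttled cost, queue grows
```

At the nominal 20 ms cost the tab can apply about 50 slices/s against 38 produced, so fairly modest throttling is enough to tip it into unbounded growth.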

Fix

  • Staleness compensation: params carry a client_time send stamp; the server estimates per-report queueing delay (ReportStalenessEstimator, windowed-min offset) and projects the playhead anchor forward. The server recv loop coalesces backlogged params (newest snapshot wins); the client skips param ticks when ws.bufferedAmount backs up.
  • Landing-lead feedback: the client reports the worst slice landing lead at apply time (params.slice_lead_s); the runner's transport lead rises additively to cover deficits and decays only while reports show headroom (hysteresis hold, 3 s cap). Reports are ignored while a loop band is armed.
  • End-to-end flow control: the client acks cumulative received slice bytes (params.slice_bytes_rx); the server skips slice emission before delta-encoding (so the delta mirror stays consistent) while sent-minus-acked bytes exceed DEMON_SLICE_WINDOW_BYTES (default 256 KiB). A bus-queue age cap (2 s) acts as a backstop. Old clients send no ack and keep legacy behavior.
  • Cheap slice apply: AudioPlayer mirror-change notifications trailing-throttled to 10 Hz; LUFS chunk map maintained only while the matcher is enabled.
  • Trace fixes: client [lat] and server lat_decode leads now fold modulo track duration (loop wrap no longer prints -57 s); lat_decode gains transport_s / staleness_s.
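The windowed-min offset idea behind the staleness compensation can be sketched as follows. This is not the PR's ReportStalenessEstimator; the class name, the 30 s window, timestamps in seconds, and 1x playback speed are all assumptions made for illustration. The minimum of (receive time minus client send stamp) over a recent window approximates clock skew plus best-case transit, so any excess over that minimum is treated as queueing delay.

```python
from collections import deque


class WindowedMinStaleness:
    """Hypothetical stand-in for a windowed-min offset staleness estimator."""

    def __init__(self, window_s: float = 30.0):  # window length is an assumption
        self.window_s = window_s
        self._samples = deque()  # (recv_time, offset) pairs, oldest first

    def update(self, client_time: float, recv_time: float) -> float:
        """Return the estimated queueing delay (seconds) of this report."""
        offset = recv_time - client_time  # clock skew + transit + queueing
        self._samples.append((recv_time, offset))
        # Drop samples that fell out of the window.
        while self._samples[0][0] < recv_time - self.window_s:
            self._samples.popleft()
        baseline = min(o for _, o in self._samples)  # skew + best-case transit
        return offset - baseline


def project_anchor(playback_pos: float, staleness_s: float, duration_s: float) -> float:
    """Advance a stale playhead report by its age, wrapping at the loop end.

    Assumes playback runs at 1x speed over a looping track.
    """
    return (playback_pos + staleness_s) % duration_s
```

A report that sat in a queue for 2 s longer than the best recent report is then anchored 2 s ahead of its stated playback_pos instead of at it.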

Wire contract: three new optional params fields (registry + regenerated TS types; drift guards green).

Validation

scripts/flow_control_harness.py (new): throttling proxy emulating a bufferbloated tunnel + headless client measuring landing leads end-to-end.

| Link | Result |
| --- | --- |
| direct | leads steady +0.14...+0.23 s, zero drops (no regression) |
| 500 KB/s | +0.13...+0.45 s every 5 s bucket, incl. loop wrap |
| 200 KB/s (near coverage floor) | one ~0.5 s dip while the controller learns the link, then stable positive |

pytest tests/unit: 272 passed (9 new in tests/unit/test_playhead_staleness.py); npm run typecheck clean.

Known residual

A brief (~1 s) bleed remains possible at the onset of sudden congestion while the controller adapts; eliminating it would require speculative far-ahead writes (permanent latency everywhere). Follow-ups worth tickets: ~35 s per-session model reload makes the single-session preemption footgun painful; no UI progress during session create.

🤖 Generated with Claude Code

…ead feedback, slice flow control

The client's loop buffer always holds the original source track;
denoised slices patch over it just ahead of the playhead, so any slice
landing behind the playhead plays the RAW INPUT to the listener. Three
independent mechanisms produced exactly that, intermittently ("input
audio at random places", worse on remote):

1. Stale playhead anchors. The runner's playhead clock re-anchored on
   every arriving playback_pos, treating arrival time as report time.
   Under congestion (or server recv backlog) reports queue, each anchor
   lands further in the past, and the estimate runs away (observed
   -10s and growing). Now: params carry a client_time send stamp; the
   server estimates per-report queueing delay (windowed-min offset,
   ReportStalenessEstimator) and projects the anchor forward. The recv
   loop also coalesces backlogged params (newest snapshot wins), and
   the client skips param ticks when ws.bufferedAmount backs up.

2. Client slice-apply backlog. Applying one slice cost ~20ms of main
   thread (per-patch mirror listener fan-out + per-slice LUFS chunk
   DSP); a busy or background-throttled tab applied slower than the
   server produced (~38/s), the receive queue grew without bound, and
   every patch landed behind the playhead (observed -17s). Now: mirror
   change notifications are trailing-throttled to 10 Hz and the LUFS
   chunk map is only maintained while the matcher is enabled (full
   recompute on enable).

3. Bandwidth deficit. The slice stream is ~1-2.5 MB/s of heavily
   overlapping windows; a slower link (SSH/IDE tunnel) buffers tens of
   seconds in socket/tunnel queues the server cannot see (observed
   50s-stale reports with zero server-side backlog). No lead fixes a
   link slower than the stream — the server must send less. Now:
   end-to-end flow control — the client acks cumulative received slice
   bytes (params.slice_bytes_rx); the server skips slice emission,
   BEFORE delta-encoding so the mirror chain stays consistent, while
   sent-minus-acked exceeds DEMON_SLICE_WINDOW_BYTES (default 256 KiB).
   A bus-queue age cap (2s) backstops the case where TCP itself pushes
   back. Old clients send no ack and keep legacy behavior.
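The ack-window gate from item 3 can be sketched as below. This is a minimal illustration, not the server's code: the class and method names are hypothetical, and how DEMON_SLICE_WINDOW_BYTES is actually configured is not stated here, so the default is hard-coded as a plain constant.

```python
DEFAULT_WINDOW_BYTES = 256 * 1024  # the 256 KiB default quoted above


class SliceFlowGate:
    """Hypothetical sender-side gate: skip slices while too many bytes are unacked."""

    def __init__(self, window_bytes: int = DEFAULT_WINDOW_BYTES):
        self.window_bytes = window_bytes
        self.sent_bytes = 0
        self.acked_bytes = None  # stays None until the client sends an ack

    def on_ack(self, slice_bytes_rx: int) -> None:
        # Acks are cumulative, so a late or reordered value never moves us back.
        self.acked_bytes = max(self.acked_bytes or 0, slice_bytes_rx)

    def should_emit(self) -> bool:
        if self.acked_bytes is None:
            return True  # old client: no acks, legacy ungated behavior
        return self.sent_bytes - self.acked_bytes <= self.window_bytes

    def on_sent(self, n_bytes: int) -> None:
        self.sent_bytes += n_bytes
```

In this sketch the caller would check should_emit() before delta-encoding a slice and call on_sent() only for slices that actually go out, which matches the ordering described above.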

Closing the loop on all three: the client reports the worst observed
slice landing lead at APPLY time (params.slice_lead_s, folded modulo
duration); the runner's transport lead rises additively to cover any
deficit and decays only while reports show headroom (hysteresis hold,
cap 3s) — covering network transit and client scheduling the server
cannot measure locally. Reports are ignored while a loop band is armed
(band wraps read as spurious negative linear leads).
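One way to picture the lead controller described above is the sketch below. Only the additive raise, the decay-while-headroom rule, the hold, the 3 s cap and the loop-band exclusion come from the text; the target lead, hold duration, decay step and all names are assumptions chosen for illustration.

```python
class LeadController:
    """Hypothetical additive-raise / held-decay controller for extra transport lead."""

    def __init__(self, cap_s=3.0, target_s=0.1, hold_s=2.0, decay_step_s=0.05):
        self.cap_s = cap_s                # 3 s cap from the text
        self.target_s = target_s          # assumed minimum acceptable landing lead
        self.hold_s = hold_s              # assumed hysteresis hold before decaying
        self.decay_step_s = decay_step_s  # assumed decay per report
        self.extra_lead_s = 0.0
        self._headroom_since = None

    def on_report(self, slice_lead_s, now, loop_band_armed=False):
        if loop_band_armed:
            return self.extra_lead_s  # band wraps read as spurious negative leads
        deficit = self.target_s - slice_lead_s
        if deficit > 0:
            # Raise additively by the observed deficit, up to the cap.
            self.extra_lead_s = min(self.cap_s, self.extra_lead_s + deficit)
            self._headroom_since = None
        elif self._headroom_since is None:
            self._headroom_since = now  # headroom just started; begin the hold
        elif now - self._headroom_since >= self.hold_s:
            # Headroom has persisted for the hold period; decay slowly.
            self.extra_lead_s = max(0.0, self.extra_lead_s - self.decay_step_s)
        return self.extra_lead_s
```

Raising immediately but decaying only after sustained headroom keeps a single good report from pulling the lead back down in the middle of a congestion episode.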

Also: lat traces (client [lat] and server lat_decode) now fold lead
modulo track duration, so the loop-wrap pre-write no longer prints as
lead=-57s and real underruns stand out; lat_decode gained transport_s
and staleness_s fields.
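The fold can be illustrated with a short sketch. How the traces actually fold is not shown here; this version maps a signed lead into the half-open range [-duration/2, +duration/2), and the 60 s track duration in the example is hypothetical.

```python
def fold_lead(lead_s: float, duration_s: float) -> float:
    """Fold a signed lead into [-duration/2, +duration/2) on a looping track."""
    half = duration_s / 2.0
    return ((lead_s + half) % duration_s) - half


# A write at 1.5 s just after the wrap, with the playhead still at 58.5 s:
# the linear difference is -57 s, but on the loop the write is 3 s ahead.
print(fold_lead(1.5 - 58.5, 60.0))  # 3.0
# A genuine small underrun is left untouched.
print(fold_lead(-0.5, 60.0))        # -0.5
```

With the fold applied, only leads that are actually negative on the loop show up as negative in the trace.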

Wire contract: params gains optional client_time / slice_lead_s /
slice_bytes_rx (registry + regenerated TS types).
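Because the three fields are optional, a receiver has to tolerate their absence. The sketch below shows one way to read them; the field types, the TypedDict and the helper name are assumptions, and the real registry schema (including the unit of client_time) may differ.

```python
from typing import Optional, Tuple, TypedDict


class ParamsExtras(TypedDict, total=False):
    """Assumed shape of the new optional fields; types are guesses."""
    client_time: float    # client send stamp (unit not stated in the source)
    slice_lead_s: float   # worst landing lead since the last report, in seconds
    slice_bytes_rx: int   # cumulative slice bytes received by the client


def read_extras(params: dict) -> Tuple[Optional[float], Optional[float], Optional[int]]:
    """Return (client_time, slice_lead_s, slice_bytes_rx), None where absent."""
    return (
        params.get("client_time"),
        params.get("slice_lead_s"),
        params.get("slice_bytes_rx"),
    )
```

An old client's params then read as (None, None, None), which leaves each of the three mechanisms free to fall back to its legacy path.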

Validation: scripts/flow_control_harness.py (new) — a throttling proxy
emulating a bufferbloated tunnel plus a headless client measuring
landing leads. Direct link: leads steady +0.14..0.23s (no regression).
500 KB/s: +0.13..0.45s every bucket through the loop wrap. 200 KB/s
(near the stream's coverage floor): one ~0.5s dip while the controller
learns the link, then stable positive. Unit coverage in
tests/unit/test_playhead_staleness.py (estimator, clock compensation,
transport controller).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
