fix(streaming): stop raw-source bleed — playhead staleness, landing-lead feedback, slice flow control#256
Draft
leszko wants to merge 1 commit into
…ead feedback, slice flow control
The client's loop buffer always holds the original source track;
denoised slices patch over it just ahead of the playhead, so any slice
landing behind the playhead plays the RAW INPUT to the listener. Three
independent mechanisms produced exactly that, intermittently ("input
audio at random places", worse on remote):
1. Stale playhead anchors. The runner's playhead clock re-anchored on
every arriving playback_pos, treating arrival time as report time.
Under congestion (or server recv backlog) reports queue, each anchor
lands further in the past, and the estimate runs away (observed
-10s and growing). Now: params carry a client_time send stamp; the
server estimates per-report queueing delay (windowed-min offset,
ReportStalenessEstimator) and projects the anchor forward. The recv
loop also coalesces backlogged params (newest snapshot wins), and
the client skips param ticks when ws.bufferedAmount backs up.
2. Client slice-apply backlog. Applying one slice cost ~20ms of
main-thread time (per-patch mirror listener fan-out + per-slice LUFS
chunk DSP); a busy or background-throttled tab applied slower than the
server produced (~38/s), the receive queue grew without bound, and
every patch landed behind the playhead (observed -17s). Now: mirror
change notifications are trailing-throttled to 10 Hz and the LUFS
chunk map is only maintained while the matcher is enabled (full
recompute on enable).
3. Bandwidth deficit. The slice stream is ~1-2.5 MB/s of heavily
overlapping windows; a slower link (SSH/IDE tunnel) buffers tens of
seconds in socket/tunnel queues the server cannot see (observed
50s-stale reports with zero server-side backlog). No lead fixes a
link slower than the stream — the server must send less. Now:
end-to-end flow control — the client acks cumulative received slice
bytes (params.slice_bytes_rx); the server skips slice emission,
BEFORE delta-encoding so the mirror chain stays consistent, while
sent-minus-acked exceeds DEMON_SLICE_WINDOW_BYTES (default 256 KiB).
A bus-queue age cap (2s) backstops the case where TCP itself pushes
back. Old clients send no ack and keep legacy behavior.
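
Items 1 and 3 above each reduce to a small piece of server-side state. The sketch below is illustrative only and is not the PR's code: `StalenessEstimator` stands in for ReportStalenessEstimator, and `project_anchor`, `SliceWindow`, the method names, and the 64-report window are all assumptions. Only the windowed-min idea, the cumulative ack, and the 256 KiB default come from the description above.

```python
from collections import deque


class StalenessEstimator:
    """Windowed-min offset estimator (hypothetical stand-in, not the PR's class)."""

    def __init__(self, window: int = 64) -> None:
        # Keep only the most recent offsets; 64 is an assumed window size.
        self._offsets = deque(maxlen=window)

    def staleness(self, client_time: float, recv_time: float) -> float:
        """Return the estimated seconds a report spent queued."""
        # offset = clock skew + transit + queueing delay.
        offset = recv_time - client_time
        self._offsets.append(offset)
        # The smallest recent offset approximates skew + best-case transit,
        # so the excess over it is attributed to queueing.
        return offset - min(self._offsets)


def project_anchor(playback_pos: float, staleness_s: float) -> float:
    """Advance a reported position by its age (assumes 1x playback, no loop wrap)."""
    return playback_pos + staleness_s


class SliceWindow:
    """Byte-window gate for slice emission (names are assumptions)."""

    def __init__(self, window_bytes: int = 256 * 1024) -> None:
        self.window_bytes = window_bytes
        self.sent = 0   # cumulative slice bytes emitted
        self.acked = 0  # cumulative bytes the client reported receiving

    def on_ack(self, slice_bytes_rx: int) -> None:
        # Acks are cumulative, so a stale value never lowers the counter.
        self.acked = max(self.acked, slice_bytes_rx)

    def may_send(self) -> bool:
        # Skip emission while bytes in flight exceed the window.
        return self.sent - self.acked <= self.window_bytes

    def on_sent(self, nbytes: int) -> None:
        self.sent += nbytes
```

Because the estimator subtracts the minimum offset, a constant clock skew between client and server cancels out: only the growth of the offset relative to the best recent report counts as staleness. In this sketch the gate would be checked before delta-encoding a slice, matching the ordering the message describes.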
Closing the loop on all three: the client reports the worst observed
slice landing lead at APPLY time (params.slice_lead_s, folded modulo
duration); the runner's transport lead rises additively to cover any
deficit and decays only while reports show headroom (hysteresis hold,
cap 3s) — covering network transit and client scheduling the server
cannot measure locally. Reports are ignored while a loop band is armed
(band wraps read as spurious negative linear leads).
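
The controller described above could look roughly like the following. This is a sketch under stated assumptions, not the runner's implementation: the class name, the target, headroom, hold, and decay values are invented for illustration; only the additive rise, the decay-on-headroom rule, the 3 s cap, and the armed-band exclusion come from the text.

```python
class LeadController:
    """Additive-rise, held-decay transport lead (hypothetical sketch)."""

    def __init__(self, target_s: float = 0.1, headroom_s: float = 0.3,
                 hold_reports: int = 5, decay_s: float = 0.01,
                 cap_s: float = 3.0) -> None:
        self.target_s = target_s          # assumed minimum acceptable lead
        self.headroom_s = headroom_s      # assumed lead that counts as headroom
        self.hold_reports = hold_reports  # assumed hysteresis hold length
        self.decay_s = decay_s            # assumed decay step per report
        self.cap_s = cap_s                # 3 s cap from the commit message
        self.extra_lead_s = 0.0
        self._calm = 0                    # consecutive reports with headroom

    def on_report(self, slice_lead_s: float, band_armed: bool = False) -> float:
        if band_armed:
            # Band wraps read as spurious negative leads, so ignore them.
            return self.extra_lead_s
        deficit = self.target_s - slice_lead_s
        if deficit > 0:
            # Rise additively by the observed deficit, up to the cap.
            self.extra_lead_s = min(self.cap_s, self.extra_lead_s + deficit)
            self._calm = 0
        elif slice_lead_s >= self.headroom_s:
            # Decay only after enough consecutive reports show headroom.
            self._calm += 1
            if self._calm >= self.hold_reports:
                self.extra_lead_s = max(0.0, self.extra_lead_s - self.decay_s)
        else:
            # Between target and headroom: hold the current lead.
            self._calm = 0
        return self.extra_lead_s
```

The asymmetry is deliberate in this sketch: a single bad report raises the lead immediately, while lowering it takes a sustained run of good reports, so the lead does not oscillate around the threshold.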
Also: lat traces (client [lat] and server lat_decode) now fold lead
modulo track duration, so the loop-wrap pre-write no longer prints as
lead=-57s and real underruns stand out; lat_decode gained transport_s
and staleness_s fields.
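
A minimal sketch of the fold, assuming leads are mapped into the half-open range [-duration/2, duration/2). The function name and that exact range are assumptions; the message only states that leads are folded modulo track duration.

```python
def fold_lead(lead_s: float, duration_s: float) -> float:
    """Fold a linear lead into [-duration/2, duration/2) for a looping track."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    half = duration_s / 2.0
    # Python's % returns a non-negative result for a positive divisor,
    # so negative leads wrap forward instead of staying negative.
    return (lead_s + half) % duration_s - half
```

For an assumed 60 s track, a raw lead of -57 s (a write just past the loop wrap) folds to +3 s, while a genuine -0.5 s underrun stays -0.5 s.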
Wire contract: params gains optional client_time / slice_lead_s /
slice_bytes_rx (registry + regenerated TS types).
Validation: scripts/flow_control_harness.py (new) — a throttling proxy
emulating a bufferbloated tunnel plus a headless client measuring
landing leads. Direct link: leads steady +0.14..0.23s (no regression).
500 KB/s: +0.13..0.45s every bucket through the loop wrap. 200 KB/s
(near the stream's coverage floor): one ~0.5s dip while the controller
learns the link, then stable positive. Unit coverage in
tests/unit/test_playhead_staleness.py (estimator, clock compensation,
transport controller).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Problem
"Sometimes I hear the input audio instead of the processed audio — like the playhead is ahead of the processed stream." Intermittent, random places, much worse on remote pods.
The client's loop buffer always holds the original source track; denoised slices patch over it just ahead of the playhead. So any slice landing behind the playhead means the listener hears raw input. Three independent mechanisms caused exactly that:
1. Stale playhead anchors: the runner treated every arriving playback_pos as fresh. Under congestion, reports queue; each re-anchor lands further in the past and the estimate runs away (traced: lead -5 s -> -10 s, growing ~1 s/s).
2. Client slice-apply backlog: a busy or background-throttled tab applied slices slower than the server produced them, so every patch landed behind the playhead.
3. Bandwidth deficit: a link slower than the slice stream buffers tens of seconds in socket/tunnel queues the server cannot see.
Fix
1. params.client_time send stamps; server estimates per-report queueing delay (ReportStalenessEstimator, windowed-min offset) and projects the playhead anchor forward. Server recv loop coalesces backlogged params (newest snapshot wins); client skips param ticks when ws.bufferedAmount backs up.
2. The client reports the worst observed slice landing lead at apply time (params.slice_lead_s); the runner's transport lead rises additively to cover deficits and decays only while reports show headroom (hysteresis hold, 3 s cap). Reports are ignored while a loop band is armed.
3. The client acks cumulative received slice bytes (params.slice_bytes_rx); server skips slice emission pre-encode (delta mirror stays consistent) while sent minus acked exceeds DEMON_SLICE_WINDOW_BYTES (default 256 KiB). Bus-queue age cap (2 s) as backstop. Old clients keep legacy behavior.
4. Client [lat] and server lat_decode leads now fold modulo track duration (loop wrap no longer prints -57 s); lat_decode gains transport_s / staleness_s.
Wire contract: three new optional params fields (registry + regenerated TS types; drift guards green).
Validation
scripts/flow_control_harness.py (new): throttling proxy emulating a bufferbloated tunnel + headless client measuring landing leads end-to-end.
pytest tests/unit: 272 passed (9 new in tests/unit/test_playhead_staleness.py); npm run typecheck clean.
Known residual
A brief (~1 s) bleed remains possible at the onset of sudden congestion while the controller adapts; eliminating it would require speculative far-ahead writes (permanent latency everywhere). Follow-ups worth tickets: ~35 s per-session model reload makes the single-session preemption footgun painful; no UI progress during session create.
🤖 Generated with Claude Code