Description
Starting with SDK version 2.3.0, the on_playback_audio_frame_before_mixing callback never fires in a Linux Server SDK environment (no physical audio device). This is a regression: versions <= 2.2.4 work correctly.
Root Cause (from our investigation)
In SDK >= 2.3.0, RtcConnectionImpl::initializeEx forcibly overrides enableAudioRecordingOrPlayout to 0 whenever enableAudioDevice=0:
// What we observe in agoraapi.log:
AgoraService::createRtcConnection(cfg:(enableAudioRecordingOrPlayout:1, ...)) // Python passes 1 ✓
RtcConnectionImpl::initializeEx(cfg:(enableAudioRecordingOrPlayout:0, ...)) // C++ overrides to 0 ✗
This disables the audio playback pipeline entirely, so on_playback_audio_frame_before_mixing is never called — even though onUserAudioTrackSubscribed fires successfully (audio data arrives but is not delivered to the callback).
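To confirm the same override on another machine, the log can be scanned automatically. The following is a minimal sketch that assumes agoraapi.log lines have the shape of the two lines quoted above; the helper names (`find_playout_flags`, `flag_was_overridden`) are hypothetical and not part of the SDK.

```python
import re

# Assumed line shape, based only on the two log lines quoted in this issue:
#   <Function>(cfg:(enableAudioRecordingOrPlayout:<0|1>, ...))
_FLAG = re.compile(
    r"(?P<func>[\w:]+)\(cfg:\(.*?enableAudioRecordingOrPlayout:(?P<value>\d)"
)

def find_playout_flags(log_text):
    """Return (function_name, flag_value) pairs in the order they appear."""
    return [(m.group("func"), int(m.group("value")))
            for m in _FLAG.finditer(log_text)]

def flag_was_overridden(log_text):
    """True if the flag was requested as 1 but reached initializeEx as 0."""
    flags = find_playout_flags(log_text)
    requested = [v for f, v in flags if f.endswith("createRtcConnection")]
    applied = [v for f, v in flags if f.endswith("initializeEx")]
    return bool(requested and applied) and requested[-1] == 1 and applied[-1] == 0

sample = """\
AgoraService::createRtcConnection(cfg:(enableAudioRecordingOrPlayout:1, ...))
RtcConnectionImpl::initializeEx(cfg:(enableAudioRecordingOrPlayout:0, ...))
"""
print(flag_was_overridden(sample))
```

On a real log, pass the file contents instead of `sample`. If the actual line format differs from the excerpt, the regular expression will need adjusting.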
Environment
- OS: Ubuntu 22.04 (Docker, x86_64) / also tested on macOS ARM64
- Python: 3.10 / 3.12
- SDK versions tested:
  - 2.1.0 ~ 2.2.4: Working (receives ~770 audio frames in 8 seconds)
  - 2.3.0 ~ 2.4.6: Broken (0 audio frames received)
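For anyone triaging their own deployment, the tested ranges can be expressed as a small check. This is only a sketch: the function names are hypothetical, and it encodes nothing beyond the ranges listed above, so any version outside them is reported as untested rather than guessed.

```python
def parse_version(text):
    """Turn '2.3.0' into (2, 3, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def classify_sdk_version(version):
    """Map a version string to the result reported in this issue."""
    v = parse_version(version)
    if (2, 1, 0) <= v <= (2, 2, 4):
        return "working"
    if (2, 3, 0) <= v <= (2, 4, 6):
        return "broken"
    return "untested"  # outside the ranges tested in this issue

for candidate in ("2.2.4", "2.3.0", "2.4.6", "2.5.0"):
    print(candidate, classify_sdk_version(candidate))
```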
Reproduction Steps
import time

from agora.rtc.agora_service import AgoraServiceConfig, AgoraService
from agora.rtc.agora_base import *
from agora.rtc.audio_frame_observer import IAudioFrameObserver

class AudioObs(IAudioFrameObserver):
    def __init__(self):
        self.count = 0  # number of playback frames delivered to the callback

    def on_playback_audio_frame_before_mixing(self, lu, ch, uid, frame, vs, vb):
        self.count += 1
        return 1
# Initialize (standard Server SDK setup)
svc = AgoraService()
cfg = AgoraServiceConfig()
cfg.appid = "YOUR_APP_ID"
cfg.log_path = "/tmp/agora"
# enable_audio_device defaults to 0 (no audio hardware on server)
svc.initialize(cfg)
# Create connection
con_config = RTCConnConfig()
con_config.auto_subscribe_audio = 1
con_config.enable_audio_recording_or_playout = 1 # explicitly set to 1
pub_config = RtcConnectionPublishConfig()
pub_config.is_publish_audio = False
conn = svc.create_rtc_connection(con_config, pub_config)
# Register observer
lu = conn.get_local_user()
lu.set_playback_audio_frame_before_mixing_parameters(1, 16000)
ao = AudioObs()
lu._register_audio_frame_observer(ao, 0, None)
lu.subscribe_all_audio()
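# Placeholders: replace with a valid token, channel name, and user ID
TOKEN, CHANNEL, UID = "YOUR_TOKEN", "YOUR_CHANNEL", "YOUR_UID"
conn.connect(TOKEN, CHANNEL, UID)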
time.sleep(10) # wait while another user sends audio in the channel
print(f"Audio frames received: {ao.count}") # Always 0 on SDK >= 2.3.0
Expected Behavior
Server SDK should deliver audio frames via on_playback_audio_frame_before_mixing even without a physical audio device (enable_audio_device=0). This worked correctly in SDK <= 2.2.4.
Workarounds
| Workaround | Limitation |
|---|---|
| Set enable_audio_device=1 | Works on macOS (has CoreAudio); segfaults on Linux containers (no ALSA/hardware) |
| Use modprobe snd-dummy on Linux | Only works on VMs/physical machines with kernel module access, not standard Docker |
| Downgrade to SDK <= 2.2.4 | Loses new features (bytes data stream, AI QoS, etc.) |
Suspected Commit
The regression was introduced in the 2.3.0 release cycle, likely related to:
- 838f49b: "merge from dev to main for aiqos, version 2.3.0"
- 6c65a9e: "Add: add publish and connection configure in init(), modify connection"
The native C++ SDK upgrade for AI QoS appears to have added the logic that ties enableAudioRecordingOrPlayout to enableAudioDevice.
Suggested Fix
When enableAudioDevice=0, the SDK should still allow enableAudioRecordingOrPlayout=1 to enable the audio callback pipeline (without initializing hardware drivers). The audio frame observer should receive decoded PCM frames regardless of whether a physical playback device exists — this is the core Server SDK use case (AI speech processing, recording, etc.).
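The difference between the observed and the suggested behavior can be modeled in a few lines. This is an illustrative sketch only: `AudioPipelinePlan`, `plan_observed`, and `plan_suggested` are hypothetical names, and the logic is inferred from the log excerpt in this issue, not taken from the native SDK source.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AudioPipelinePlan:
    init_hardware: bool      # open a physical capture/playout device
    deliver_callbacks: bool  # hand decoded PCM frames to frame observers

def plan_observed(enable_audio_device, enable_audio_recording_or_playout):
    """Behavior reported for SDK >= 2.3.0: the device flag gates everything."""
    effective = enable_audio_recording_or_playout if enable_audio_device else 0
    return AudioPipelinePlan(bool(enable_audio_device), bool(effective))

def plan_suggested(enable_audio_device, enable_audio_recording_or_playout):
    """Suggested behavior: the two flags are treated independently."""
    return AudioPipelinePlan(bool(enable_audio_device),
                             bool(enable_audio_recording_or_playout))

# Server case from this issue: no audio device, callbacks requested.
print(plan_observed(0, 1))
print(plan_suggested(0, 1))
```

In the server case (`enable_audio_device=0`, `enable_audio_recording_or_playout=1`), the observed model disables callback delivery, while the suggested model keeps it enabled without touching hardware.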