Delete TURN allocation on socket close #101
Before closing a socket, walk every relay candidate and gathering
transaction bound to it, invoke ExTURN.Client.close/1, and ship the
returned Refresh(lifetime=0) datagram on the original socket. Without
this, a TURN server (notably Cloudflare Realtime TURN) keeps the
5-tuple allocated until TTL expires; a future Allocate from the same
source port is then rejected with 437 Allocation Mismatch (RFC 5766
§6.2, RFC 8656 §7.2), gathering completes with no typ relay candidate,
and ICE fails.
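For readers unfamiliar with the wire format, here is a rough sketch of what a Refresh(lifetime=0) request looks like at the STUN level. It is not how ExTURN builds the message: the module and function names are hypothetical, and the authentication attributes a real TURN request normally carries (USERNAME, REALM, NONCE, MESSAGE-INTEGRITY) are left out to keep the layout visible.

```elixir
defmodule RefreshSketch do
  # Hypothetical helper, for illustration only; ExTURN.Client.close/1 does the real work.
  @magic_cookie 0x2112A442
  # STUN message type for a Refresh request (method 0x004, request class).
  @refresh_request 0x0004
  # LIFETIME attribute type.
  @attr_lifetime 0x000D

  # Builds an unauthenticated Refresh request asking for a lifetime of 0 seconds,
  # which tells the server to delete the allocation.
  def deallocate_request(transaction_id) when byte_size(transaction_id) == 12 do
    # Attribute layout: 16-bit type, 16-bit value length, 32-bit lifetime in seconds.
    lifetime = <<@attr_lifetime::16, 4::16, 0::32>>

    # Header layout: type, attributes length, magic cookie, 96-bit transaction id.
    <<@refresh_request::16, byte_size(lifetime)::16, @magic_cookie::32,
      transaction_id::binary, lifetime::binary>>
  end
end
```

Calling `RefreshSketch.deallocate_request(:crypto.strong_rand_bytes(12))` yields a 28-byte datagram: a 20-byte STUN header followed by the 8-byte LIFETIME attribute.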
Make the teardown run on abrupt parent death too. PeerConnection in
ex_webrtc 0.16 does not trap exits; if its DTLSTransport child crashes
(e.g. unifex_parse_arg when DTLS never negotiated), the linked cascade
kills ICE before PeerConnection can call ice_transport.close. Trap
exits in ICEAgent's init and propagate non-:normal EXITs as {:stop,
reason, state} so terminate/2 always runs the close path. :normal
EXITs (from short-lived children like gatherer worker processes) stay
noreply.
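The exit handling described above can be sketched with a plain GenServer. This is a minimal illustration, not the ICEAgent code: TrapExitSketch and the report_to field are hypothetical, and terminate/2 only reports that it ran instead of releasing allocations.

```elixir
defmodule TrapExitSketch do
  # Hypothetical stand-in for the ICE agent process.
  use GenServer

  @impl true
  def init(report_to) do
    # Turn exit signals from linked processes into {:EXIT, pid, reason} messages.
    Process.flag(:trap_exit, true)
    {:ok, %{report_to: report_to}}
  end

  @impl true
  # Normal exits of short-lived linked children are ignored.
  def handle_info({:EXIT, _pid, :normal}, state), do: {:noreply, state}

  # Any other exit reason stops the server, so terminate/2 runs.
  def handle_info({:EXIT, _pid, reason}, state), do: {:stop, reason, state}

  @impl true
  def terminate(reason, state) do
    # The real close path would release TURN allocations here.
    send(state.report_to, {:closed, reason})
    :ok
  end
end
```

Started with `GenServer.start(TrapExitSketch, self())`, the server survives a linked process that exits normally and sends `{:closed, reason}` to the caller once a linked process exits with any other reason.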
Transport.Mock in test support keeps closed sockets in the ETS table
with state: :closed so tests can assert what the agent sent on the
close path; setup_socket / open_ephemeral transparently reuse the slot
on re-open.
Depends on the matching ExTURN.Client.close/1 addition; pinned to that
commit via git dep until an ex_turn release ships.
Verified end-to-end against Cloudflare Realtime TURN via
ex_turn_cloudflare_repro: 20/20 iterations emit typ relay with zero
437s on narrow port-range cycling (was 0/20 without the fix, 437 on
iteration 1).
The handle_info({:EXIT, _, reason}, state) clause was uncovered: gen_server
intercepts EXITs from the parent and runs terminate/2 directly, so the
existing parent-death test never reached the clause. Add a test that links
a non-parent process to the agent and lets it exit abnormally, which is
the only path that actually drives the {:stop, reason, state} return.
Drop the case fallback in terminate/2; init/1 always returns a state map
with :ice_agent, so the _ -> :ok branch was unreachable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
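The parent-exit behaviour mentioned in this commit can be reproduced in isolation. The sketch below is illustrative only: ExitPathSketch and crash_parent/1 are hypothetical names, and the callbacks do nothing except report to an observer which of them ran.

```elixir
defmodule ExitPathSketch do
  # Hypothetical module; it only reports which callbacks run.
  use GenServer

  # Spawns a short-lived parent that starts the server linked and then crashes.
  def crash_parent(observer) do
    spawn(fn ->
      {:ok, _server} = GenServer.start_link(__MODULE__, observer)
      exit(:parent_crash)
    end)
  end

  @impl true
  def init(observer) do
    Process.flag(:trap_exit, true)
    {:ok, observer}
  end

  @impl true
  def handle_info({:EXIT, _pid, reason}, observer) do
    # Reached only for exits of linked processes other than the parent.
    send(observer, {:handle_info, reason})
    {:stop, reason, observer}
  end

  @impl true
  def terminate(reason, observer) do
    send(observer, {:terminate, reason})
    :ok
  end
end
```

After `ExitPathSketch.crash_parent(self())`, the caller should receive only `{:terminate, :parent_crash}`. No `{:handle_info, _}` message arrives, because gen_server handles the parent's EXIT itself, which is why a separate test with a non-parent link is needed to cover the clause.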
ExTURN.Client.close/1 emits no logs and the transport's send/4 returns
{:error, _} silently, so a failed Refresh leaves no breadcrumb — exactly
the failure mode that triggers 437 Allocation Mismatch on the next port
reuse. Surface the error at warning level instead of swallowing it.
Add a Transport.Mock.fail_send/2 hook so the test can drive a real
allocation to :allocated, force the close-path send to return :enotconn,
and assert on the captured log.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
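A minimal sketch of the warning-on-failure behaviour, under stated assumptions: ReleaseSketch is a hypothetical module, close_result stands in for whatever ExTURN.Client.close/1 returned (the {:send, addr, data, client} and {:ok, client} shapes are taken from the diff and review comments in this PR), and send_fun stands in for the transport module's send function.

```elixir
defmodule ReleaseSketch do
  # Hypothetical module for illustration; not the code merged in this PR.
  require Logger

  # Always returns :ok so callers see one return shape on every path.
  def release(close_result, socket, send_fun) do
    with {:send, turn_addr, data, _client} <- close_result,
         :ok <- send_fun.(socket, turn_addr, data) do
      :ok
    else
      # Nothing to send, for example when no allocation was active.
      {:ok, _client} ->
        :ok

      # Surface the failed Refresh instead of swallowing it.
      {:error, reason} ->
        # inspect/1 keeps the log call safe for reasons that are not atoms or strings.
        Logger.warning("Couldn't send TURN deallocate request, reason: #{inspect(reason)}")
        :ok
    end
  end
end
```

With a send function that returns `{:error, :enotconn}`, `release/3` logs one warning and still returns `:ok`, so the rest of the close path continues.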
Rationale behind my changes to your branch:
Comments are welcome :)
defp release_turn_allocation(ice_agent, socket, client) do
  with {:send, turn_addr, data, _client} <- ExTURN.Client.close(client) do
    case ice_agent.transport_module.send(socket, turn_addr, data) do
      :ok -> :ok
      {:error, reason} -> Logger.debug("Couldn't send deallocate request, reason: #{reason}")
    end
  end
end
NITPICK
It looks nice (if ExTURN.Client.close/1 returns {:ok, state}, the with falls through and returns that value), but it would be better to always return the same value.
defp release_turn_allocation(ice_agent, socket, client) do
  with {:send, turn_addr, data, _client} <- ExTURN.Client.close(client),
       :ok <- ice_agent.transport_module.send(socket, turn_addr, data) do
    :ok
  else
    {:ok, _state} -> :ok
    {:error, reason} -> Logger.debug("Couldn't send deallocate request, reason: #{reason}")
  end
end
  with {:send, turn_addr, data, _client} <- ExTURN.Client.close(client) do
    case ice_agent.transport_module.send(socket, turn_addr, data) do
      :ok -> :ok
      {:error, reason} -> Logger.debug("Couldn't send deallocate request, reason: #{reason}")
If that can cause problems for the user, maybe info would be better.
I don't think there's anything the user can do in this case, as it most likely indicates a network issue. I'd leave it at debug.
Karolk99 left a comment:
We should look more closely at whether we need to close TURN allocations in cases where handle_terminate won't be called (when PeerConnection is closed with a reason other than :normal).
This reverts commit be2d6a5.
Reverted to trapping exits; allocations are now released during shutdowns and parent crashes, too.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #101      +/-   ##
==========================================
+ Coverage   86.86%   87.08%   +0.22%
==========================================
  Files          27       27
  Lines        2078     2091      +13
==========================================
+ Hits         1805     1821      +16
+ Misses        273      270       -3
... and 1 file with indirect coverage changes
Use ExTURN.Client.close/1 to send Refresh(lifetime=0) and delete the present allocation.
Resolves #100
ref: elixir-webrtc/ex_turn#10