Fix sockets not getting an ephemeral port between bind() and listen()#6478
Fix sockets not getting an ephemeral port between bind() and listen()#6478
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds infrastructure and regression tests to ensure TCP bind(port=0) assigns an ephemeral port immediately (and preserves it across listen() / connect()), fixing the previously observed “port stays 0 until later” behavior.
Changes:
- Introduces a “bound TCP” state (
VirtualTcpBoundSocket) and a newbind_tcp()API invirtual-net, plus remote protocol support. - Updates WASIX socket state machine to materialize TCP binding earlier (so
getsockname()afterbind()reports the real ephemeral port). - Adds WASM/C and Rust tests covering port assignment stability and TTL passthrough for bound sockets.
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| lib/wasix/tests/wasm_tests/socket_tests/bind-port-zero/main.c | New WASM test asserting ephemeral port is allocated after bind() and stable after listen(). |
| lib/wasix/tests/wasm_tests/socket_tests/bind-port-zero/build.sh | Build script for the new WASM test. |
| lib/wasix/tests/wasm_tests/socket_tests/bind-port-zero-connect/main.c | New WASM test asserting client port allocated after bind() is stable after connect(). |
| lib/wasix/tests/wasm_tests/socket_tests/bind-port-zero-connect/build.sh | Build script for the new WASM test. |
| lib/wasix/tests/wasm_tests/socket_tests.rs | Hooks new WASM socket tests into the Rust test suite. |
| lib/wasix/src/syscalls/wasix/sock_bind.rs | Journals the effective bound address instead of the requested address. |
| lib/wasix/src/net/socket.rs | Adds BoundTcp inode state and updates bind/listen/connect/status/addr logic accordingly; adds unit tests. |
| lib/virtual-net/src/tests.rs | Adds bind_tcp() / bound-socket regression tests for local, loopback, and remote backends. |
| lib/virtual-net/src/server.rs | Adds remote protocol handling for BindTcp, ListenBound, ConnectBound, and improves Close. |
| lib/virtual-net/src/meta.rs | Extends remote request protocol with BindTcp, ListenBound, and ConnectBound. |
| lib/virtual-net/src/loopback.rs | Implements loopback bind_tcp() with ephemeral port reservation + TTL preservation across listen/connect. |
| lib/virtual-net/src/lib.rs | Adds the VirtualTcpBoundSocket trait and VirtualNetworking::bind_tcp() API. |
| lib/virtual-net/src/host.rs | Implements bind_tcp() on LocalNetworking using socket2 and introduces LocalTcpBoundSocket. |
| lib/virtual-net/src/client.rs | Implements remote client bind_tcp() and adds bound-socket transitions/ownership handling. |
Comments suppressed due to low confidence (1)
lib/virtual-net/src/loopback.rs:6
LoopbackTcpBoundSocket.local_addrcan be returned as0.0.0.0:<port>/::<port>even though the code normalizes unspecified IPs for reservation keys. This is a behavior regression from the priorlisten_tcpnormalization (where the listener address was rewritten to localhost) and can break callers that expect a concrete loopback address fromaddr_local(). Consider normalizing the returned/stored bound address as well (e.g., makeallocate_tcp_bind_addr()return a normalized address, or setlocal_addr: Self::normalize_listener_addr(addr)), soaddr_local()is consistent with the reservation key and prior behavior.
|
/patchsmith test |
3 similar comments
|
/patchsmith test |
|
/patchsmith test |
|
/patchsmith test |
|
Python upstream result: IMPROVEMENT Baseline: main @ 9a850b5 Summary:
See full details at commit. Top improvements:
More info:
|
theduke
left a comment
There was a problem hiding this comment.
AI Review findings.
Both are valid.
Should add C tests for all that behaviour where possible!
- High: Failed TCP bind() leaves the socket logically bound inside Wasix. In lib/wasix/src/net/socket.rs:341 the code writes addr.replace(set_addr) before net.bind_tcp(...) runs, and on any error
path it returns without rolling that back. After that, lib/wasix/src/net/socket.rs:817 will report the requested address even though the host bind failed, and later listen()/connect() calls will
reuse that stale local address. Linux keeps the socket unbound after a failed bind(2).- Medium: Bound TCP sockets can never report write readiness through poll/epoll. lib/wasix/src/net/socket.rs:1706 hardcodes InodeSocketKind::BoundTcp to Poll::Pending, and the fd polling path
consumes that in lib/wasix/src/fs/inode_guard.rs:288. That is not Linux-aligned: a TCP socket is writable immediately after a successful bind(). I verified that on the host with select([], [sock],
[], 0), which returned the bound socket as writable. The code is also internally inconsistent because lib/wasix/src/net/socket.rs:1652 already treats BoundTcp as writable. - Medium: The loopback backend releases the bound local port too early after connect(), and the new test suite codifies that behavior. lib/virtual-net/src/loopback.rs:386 drops the reservation
immediately after creating the connection, and lib/virtual-net/src/tests.rs:914 asserts that rebinding the same local port should succeed. That is not POSIX/Linux behavior: I verified locally that
rebinding the connected client’s port returns EADDRINUSE. As written, loopback will hide real port-collision bugs.
- Medium: Bound TCP sockets can never report write readiness through poll/epoll. lib/wasix/src/net/socket.rs:1706 hardcodes InodeSocketKind::BoundTcp to Poll::Pending, and the fd polling path
|
@Arshia001 is it ready for another review round? |
Co-authored-by: Copilot <copilot@github.com>
|
@Arshia001 needs conflict resolution. |
| }, | ||
| 0, | ||
| ), | ||
| InodeSocketKind::BoundTcp { props, .. } => SocketAddr::new( |
There was a problem hiding this comment.
A TCP socket that has been bound but not connected should not report a synthetic peer address. POSIX getpeername on an unconnected TCP socket fails with ENOTCONN; returning 0.0.0.0:0/[::]:0 here makes a successfully bound-but-unconnected socket look like it has a peer and can hide application bugs.
| Err(NetworkError::Unsupported) => { | ||
| // Fallback for backends that still only materialize TCP state at | ||
| // listen/connect time. | ||
| Ok(None) |
There was a problem hiding this comment.
Open question: should sock_bind(..., port 0) silently keep the old delayed-bind behavior when a backend returns Unsupported here? The core fix depends on bind being able to report the effective ephemeral port immediately, so falling back to Ok(None) means those backends can still return port 0 from getsockname after bind.
| // matching Linux select()/poll() semantics. | ||
| InodeSocketKind::BoundTcp { .. } => Poll::Ready(Ok(0)), |
| }, | ||
| _ => Err(NetworkError::Unsupported), | ||
| }, | ||
| _ => Err(NetworkError::Unsupported), |
| }, | ||
| _ => Err(NetworkError::Unsupported), | ||
| }, | ||
| _ => Err(NetworkError::Unsupported), |
Duplicate of #6473, from a wasmerio branch as requested by @artemyarulin.