Describe the Bug
We’re seeing a runtime panic coming from the WebSocket client when the connection is being reused / reconnected:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0xc0 pc=0xc1e197]

goroutine 1289578 [running]:
github.com/gorilla/websocket.(*Conn).NextReader(0x0)
	/go/pkg/mod/github.com/gorilla/websocket@v1.5.3/conn.go:1000 +0x17
github.com/gorilla/websocket.(*Conn).ReadJSON(0xf77860?, {0xe9d720, 0xc002bb5d40})
	/go/pkg/mod/github.com/gorilla/websocket@v1.5.3/json.go:50 +0x25
github.com/fosrl/newt/websocket.(*Client).readPumpWithDisconnectDetection(...)
	/app/websocket/client.go:681 +0x15c

Note the receiver for (*websocket.Conn).NextReader is 0x0, so the panic occurs because we’re calling ReadJSON on a nil *websocket.Conn.
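Relevant code paths
- readPumpWithDisconnectDetection calls ReadJSON on c.conn in a loop (see the stack trace above).
- pingMonitor also calls methods on c.conn in a loop.
- reconnect (and related shutdown logic) closes the current connection and sets c.conn = nil before/while these goroutines are still running.
This means there’s a race: reconnect can set c.conn to nil between loop iterations in readPumpWithDisconnectDetection or pingMonitor. On the next iteration those goroutines call methods on c.conn and we end up in (*Conn).NextReader(0x0) → nil pointer dereference.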
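To make the failure mode concrete, here is a minimal sketch that replays the interleaving deterministically. It is not the newt code: fakeConn and raceClient are hypothetical stand-ins for *websocket.Conn and the client, and the recover exists only so the sketch can report the panic instead of crashing.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeConn is a hypothetical stand-in for *websocket.Conn.
type fakeConn struct {
	closed bool
}

// ReadJSON reads a field through its receiver, so a nil receiver panics.
func (c *fakeConn) ReadJSON() error {
	if c.closed {
		return errors.New("use of closed connection")
	}
	return nil
}

// raceClient is a hypothetical stand-in for the client that owns the connection.
type raceClient struct {
	conn *fakeConn
}

// readOnce models one iteration of the read loop. It recovers only so the
// sketch can report the panic instead of crashing the process.
func (c *raceClient) readOnce() (panicked bool, err error) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return false, c.conn.ReadJSON()
}
```

Calling readOnce on an open connection succeeds. After setting closed = true it returns an ordinary error. After setting conn = nil the same call panics with a nil pointer dereference, which is the difference between the two shutdown styles.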
Proposed direction
For now, the minimal fix is simply not to assign c.conn = nil while reader/writer goroutines are still active. Closing the underlying *websocket.Conn is enough to cause ReadJSON / WriteControl to return an error, which we already handle with reconnect logic. The extra nil assignment is what turns a clean error into a panic.
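The "close but do not nil" idea can be sketched with the standard library alone. This is an illustration under assumptions, not the actual patch: net.Pipe stands in for the WebSocket connection, and pipeClient, readPump and closeWithoutNil are hypothetical names.

```go
package main

import "net"

// pipeClient is a hypothetical stand-in for the client; conn plays the role
// of *websocket.Conn.
type pipeClient struct {
	conn net.Conn
}

// readPump blocks in Read until the connection fails, then reports the
// error it exits with. It never observes a nil conn.
func (c *pipeClient) readPump(done chan<- error) {
	buf := make([]byte, 64)
	for {
		if _, err := c.conn.Read(buf); err != nil {
			done <- err
			return
		}
	}
}

// closeWithoutNil closes the connection while the read pump is running and
// leaves c.conn pointing at the closed connection. It returns the error the
// pump exited with.
func closeWithoutNil() error {
	local, remote := net.Pipe()
	defer remote.Close()

	c := &pipeClient{conn: local}
	done := make(chan error, 1)
	go c.readPump(done)

	_ = c.conn.Close() // unblocks the pending Read with an error
	return <-done
}
```

In this sketch the reader goroutine always exits through its normal error path, whether Close runs before or after the Read call starts, because the field it reads is never reassigned while it is running.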
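Alternatively, if you would rather keep setting c.conn to nil, then readPumpWithDisconnectDetection and pingMonitor must check whether the connection is nil before calling ReadJSON (or any other method on it) and return if it is. Note that this may cause duplicate reconnection attempts.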
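A nil check is only safe if the read of c.conn is itself synchronized with the write in reconnect. The following is a sketch of one way to do that, assuming a mutex is acceptable here; guardedClient, stubConn, currentConn, dropConn and errNoConn are all hypothetical names, not existing newt or gorilla/websocket identifiers.

```go
package main

import (
	"errors"
	"sync"
)

// errNoConn is a hypothetical sentinel meaning "the connection was dropped".
var errNoConn = errors.New("no active connection")

// stubConn is a hypothetical stand-in for *websocket.Conn.
type stubConn struct{}

// ReadJSON pretends to decode one message into v.
func (c *stubConn) ReadJSON(v *string) error {
	*v = "ok"
	return nil
}

// guardedClient protects conn with a mutex so that readers and the
// reconnect path never touch the field concurrently.
type guardedClient struct {
	mu   sync.RWMutex
	conn *stubConn
}

// currentConn returns a snapshot of the connection pointer.
func (c *guardedClient) currentConn() *stubConn {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.conn
}

// dropConn models the part of reconnect that sets the field to nil.
func (c *guardedClient) dropConn() {
	c.mu.Lock()
	c.conn = nil
	c.mu.Unlock()
}

// readOnce models one loop iteration: snapshot, nil check, then read
// through the local copy rather than through c.conn.
func (c *guardedClient) readOnce() (string, error) {
	conn := c.currentConn()
	if conn == nil {
		return "", errNoConn
	}
	var msg string
	if err := conn.ReadJSON(&msg); err != nil {
		return "", err
	}
	return msg, nil
}
```

A caller could treat errNoConn as "someone else is already reconnecting" and exit the loop without triggering another reconnect, which is one possible way to avoid the duplicate attempts mentioned above.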
Environment
- OS Type & Version: (e.g., Ubuntu 22.04)
- Pangolin Version:
- Gerbil Version:
- Traefik Version:
- Newt Version:
- Olm Version: (if applicable)
To Reproduce
This is a race condition, so it is hard to reproduce reliably.
Expected Behavior
The client should never panic at runtime.