gthread worker deadlocks during startup in 25.1.0 (regression from 25.0.3) #3509
Replies: 5 comments 1 reply
-
Root Cause Analysis (from Claude Code)

After diffing the package contents of 25.0.3 and 25.1.0, the issue appears to be a fork-after-thread deadlock introduced by the new Control Socket feature.

What changed in 25.1.0

```python
# arbiter.setup() — runs BEFORE workers are spawned:
self._start_control_server()  # NEW in 25.1.0
self.cfg.when_ready(self)
```
_start_control_server() spins up a background thread running an asyncio event loop in the master process:
```python
# ctl/server.py
def start(self):
    self._running = True
    self._thread = threading.Thread(target=self._run_loop, daemon=True)
    self._thread.start()

def _run_loop(self):
    asyncio.run(self._serve())  # asyncio event loop in a thread
```
Why this causes the deadlock
This is the classic problem described in Python's os.fork() docs: "safely forking a multithreaded process is problematic."
The sequence:
1. Master calls _start_control_server() → starts a background thread with an asyncio event loop (which acquires internal mutexes/locks)
2. Master forks the worker process
3. The child inherits all mutex/lock state, but not the threads that hold them
4. When the gthread worker tries to initialize its thread pool, it attempts to acquire a lock that was held by the now-nonexistent control socket thread → futex deadlock
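The mechanism in step 3–4 can be demonstrated in isolation. The following sketch (not gunicorn code; POSIX only, since it uses `os.fork()`) shows that a lock held by a thread at fork time is copied into the child in the locked state, with no thread left alive to release it:

```python
import os
import threading
import time

lock = threading.Lock()

def hold_lock():
    with lock:
        time.sleep(5)   # keep the lock held across the fork below

t = threading.Thread(target=hold_lock, daemon=True)
t.start()
while not lock.locked():
    time.sleep(0.01)    # wait until the thread really owns the lock

r, w = os.pipe()
pid = os.fork()
if pid == 0:
    # Child: the lock's state was copied by fork, but the thread holding it
    # was not, so a blocking acquire() here would hang forever — exactly the
    # futex wait the report describes. A non-blocking attempt shows this.
    os.write(w, b"1" if lock.acquire(blocking=False) else b"0")
    os._exit(0)

os.waitpid(pid, 0)
child_got_lock = os.read(r, 1) == b"1"
print("child could acquire inherited lock:", child_got_lock)
```

The non-blocking `acquire()` fails in the child even though no thread in the child holds the lock; a blocking acquire at the same point is the reported deadlock.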
This explains all observed behavior
| Observation | Explanation |
| --- | --- |
| Worker stuck on futex_do_wait with 1 thread | Deadlocked before creating the gthread thread pool |
| No "Booting worker with pid" log | Worker never finished initialization |
| 25.0.3 worked fine | No pre-fork threads existed |
| Fresh Gunicorn inside stuck container works | New process, no inherited lock state |
| Restart sometimes recovers | Timing-dependent: if the fork happens before asyncio acquires certain locks, the child survives |
Suggested fix
The control socket server should be started after workers are forked, not during setup().
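A minimal sketch of that ordering (assumed structure, not gunicorn's actual arbiter code): fork all workers first, and only then start the master-only background thread, so no child ever inherits its lock state.

```python
import os
import threading
import time

def control_loop():
    # Stand-in for the control-socket event loop (hypothetical).
    time.sleep(0.1)

# Fork every worker BEFORE any master-only thread exists; the children
# therefore inherit no thread-held locks.
pids = []
for _ in range(2):
    pid = os.fork()
    if pid == 0:
        os._exit(0)          # real worker initialization would run here
    pids.append(pid)

# Only now start the master-only background thread.
control_thread = threading.Thread(target=control_loop, daemon=True)
control_thread.start()

for pid in pids:
    _, status = os.waitpid(pid, 0)
    assert os.waitstatus_to_exitcode(status) == 0

print("forked", len(pids), "workers before starting the control thread")
```

Workers spawned later (e.g. after a worker dies) would still need care, since by then the control thread exists; that is presumably why pull request #3520 is the place this gets resolved properly.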
Workaround
Either pin to gunicorn==25.0.3 or use --no-control-socket to disable the feature.
-
I think we started experiencing something like this using Gevent workers and a single process. Sometimes Gunicorn fails to start when booting a cloud-init instance/server.
-
Hi, we're having a similar issue using Gevent workers and a single process/worker also. Using
-
Did you test #3520?
-
The good news is I have an explanation of why I've aged 2 years in the last two weeks, staring at logs and reconfiguring everything!
-
Type
Bug Report
Description
The gthread worker deadlocks during initialization in Gunicorn 25.1.0. The master process starts normally, but the worker process never finishes booting: the expected "Booting worker with pid" message never appears. The worker has only 1 thread (instead of the expected 5 for gthread with --threads 4), is blocked on futex_do_wait, and never calls accept(). Because the master has bound the socket, incoming TCP connections time out rather than being refused, making the server appear running but completely unresponsive.

This is a regression. The same application on the same Python version runs without issues on 25.0.3.
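The timeout-rather-than-refused symptom can be reproduced against any socket that is bound and listening but never accepted from, which is how the stuck master behaves. A small probe helper (hypothetical, not part of the report) distinguishes the three cases:

```python
import socket

def probe(port, timeout=1.0):
    """Classify how a local TCP port responds: 'response', 'timeout', or 'refused'."""
    s = socket.socket()
    s.settimeout(timeout)
    try:
        s.connect(("127.0.0.1", port))
        s.sendall(b"GET / HTTP/1.0\r\n\r\n")
        s.recv(1)                      # blocks if no worker ever accepts/replies
        return "response"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    finally:
        s.close()

# A listener that is bound but never accept()ed from mimics the stuck master:
# the kernel queues the connection in the backlog, so connect() succeeds,
# but no response ever comes.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
result = probe(srv.getsockname()[1])
print(result)
srv.close()
```

A dead (unbound) port would return "refused" instantly instead, which is why this failure mode is easy to mistake for a network problem.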
Steps to Reproduce (for bugs)
1. The app creates two background threads during module import:
   - `threading.Thread` (worker loop)
   - `threading.Timer` (scheduled cleanup task)
2. Start Gunicorn with:
   `gunicorn --workers 1 --threads 4 --bind 0.0.0.0:8080 --log-level info --timeout 0 server:app`

Reproduced at a 100% rate across 2 consecutive container starts (automated via Watchtower). A manual `docker compose restart` resolves the deadlock each time; the new worker boots successfully.

Notably, within the same stuck container, `python3.14 -c "import server"` completes fine (module loads, threads start normally).

Configuration
Logs / Error Output
Gunicorn Version
25.1.0
Python Version
3.14.3
Worker Class
gthread
Operating System
Ubuntu Server 24.04 in Docker
Additional Context
The WSGI app (Flask) creates two background threads during module import: a threading.Thread (worker loop) and a threading.Timer (scheduled cleanup task).
This pattern has worked on all prior Gunicorn versions. The thread creation during module import may interact poorly with changes in 25.1.0's gthread worker initialization sequence.
Workaround: pin to gunicorn==25.0.3.
Checklist