proxy-using role only has 10 seconds to shut down? #1768
Description
Hi, we're deploying a Rails app via kamal that is running solid_queue in the web app container via the puma plugin, as per rails default.
When we increase solid_queue.shutdown_timeout to 10 seconds or more, we get errors from solid queue because it loses communication with its workers before they are confirmed shut down.
[SolidQueue::Processes::ProcessPrunedError: Process was found dead and pruned (last heartbeat at: 2026-01-30 19:18:29 UTC)](https://***/jobs/applications/clockwork/jobs/161be8a8-5563-4aae-9a7a-948ed39f2766?server_id=solid_queue#error)
So it seems that solid_queue only has 10 seconds to shut down, which is not surprising since the drain timeout of 30 seconds (by default) is not applied to proxied roles.
@djmb seems to argue in #1372 (comment) that we don't need to apply the drain timeout to proxied roles since we're already waiting for up to that amount of time until the web requests have drained. However, I believe that no signal is sent to the application until the web requests are fully drained, at which point the application only has docker's default of 10 seconds to shut down.
So it seems that, in the Rails default setup, we can't raise solid_queue's shutdown timeout to docker's default stop timeout of 10 seconds or beyond.
Solutions I can think of:
- Change Rails' defaults to run jobs in a separate role (out of scope for kamal, but I don't think this will fly because of increased memory usage)
- Change Rails default kamal config to pass an explicit `stop-timeout` to docker (I'm not even sure this is possible)
- Change kamal to send a SIGTERM to the application when starting to drain requests (I think this will abort request processing in flight, and is thus also not an option)
- Change kamal to apply `drain_timeout` to all roles
- Add a config option to explicitly apply the drain timeout to a role even if it is proxied
Please correct me where I'm wrong.
For now, I will experiment with passing an explicit `stop-timeout` to docker through kamal.
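A minimal sketch of what that experiment might look like, assuming kamal forwards a role's `options` map to `docker run` as flags (so `stop-timeout: 60` would become `--stop-timeout 60`). The host and the 60-second value are placeholders, and I have not verified that kamal's own stop command leaves the container's stop timeout untouched for proxied roles:

```yaml
# config/deploy.yml (fragment, untested sketch)
servers:
  web:
    hosts:
      - 192.0.2.10        # placeholder host
    options:
      # Assumption: passed through to `docker run` as --stop-timeout 60,
      # giving the container 60 seconds between SIGTERM and SIGKILL.
      stop-timeout: 60
```

If this works, solid_queue's `shutdown_timeout` would still need to stay below whatever stop timeout is set here, so that its workers can be confirmed shut down before docker kills the container.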