Update Broker to RabbitMQ version 4.1.8 #4068
RabbitMQ Broker Upgrade Impact Report: 4.1.0 → 4.1.8
Branch:
Summary
This is a patch-series upgrade from 4.1.0 to 4.1.8.
Breaking Changes
None. The breaking changes introduced in 4.1.0 are covered below.
4.1.0 Breaking Changes (already resolved)
| Change | Impact on Lagoon | Status |
|---|---|---|
| `amqplib` must be >= 0.10.7 due to increased initial AMQP 0-9-1 frame size (4096 → 8192 bytes) | All Node.js packages (commons, webhook-handler, webhooks2tasks, api) already pin `amqplib: "^0.10.7"` (see the sketch after this table) | ✅ Resolved |
| Management API "one true health check" is now a no-op | Lagoon uses TCP wait-for checks, not the management health endpoint | ✅ No impact |
| `rabbitmqctl force_reset` deprecated (Khepri-incompatible) | Single-node local dev only; not used in production workflows | ✅ No impact |
| Default MQTT Maximum Packet Size reduced from 256 MiB to 16 MiB | Lagoon does not use the MQTT plugin | ✅ No impact |
| OAuth 2 plugin requires explicit provider configuration (no defaults for Azure Entra, auth0) | Lagoon does not use the RabbitMQ OAuth 2 plugin | ✅ No impact |
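The frame-size change is negotiated during the AMQP handshake by the client library itself, so once the dependency pin is in place no application code changes are needed. A minimal sketch of the unchanged publish path, assuming amqplib ^0.10.7 (broker URL and queue name are illustrative):

```typescript
import amqp from "amqplib";

// Connects over AMQP 0-9-1; amqplib >= 0.10.7 handles the larger
// 8192-byte initial frame size transparently during the handshake.
const conn = await amqp.connect("amqp://guest:guest@broker:5672");
const ch = await conn.createChannel();

// The same classic durable queue declaration the Lagoon services use.
await ch.assertQueue("example-queue", { durable: true });
ch.sendToQueue("example-queue", Buffer.from("hello"));

await ch.close();
await conn.close();
```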
Relevant Bug Fixes (4.1.1–4.1.8)
High Relevance
| Release | Fix | Why it matters to Lagoon |
|---|---|---|
| 4.1.5 | Classic queues could run into a rare message store exception resulting in loss of a few messages when a message was routed to multiple queues | Lagoon routes messages across multiple consumers via lagoon-actions, lagoon-webhooks, and lagoon-logs direct exchanges — all backed by classic durable queues (pattern sketched after this table) |
| 4.1.5 | Messages routed to quorum queues during or immediately before a network partition were not re-published internally in some cases | Relevant for production cluster deployments |
| 4.1.1 | Classic queue message store compaction could fall behind under very busy publishers | Relevant under high build/deploy load |
| 4.1.1 | Classic queue message store could run into a rare exception when a message was routed to multiple queues | Direct relevance to Lagoon's fanout publish patterns |
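The multi-queue condition behind the two classic queue fixes arises whenever several queues are bound to the same exchange with the same routing key, so a single publish fans out into multiple message store writes. A hedged sketch of that pattern, reusing the lagoon-logs direct exchange named above (queue names and routing key are illustrative):

```typescript
import amqp from "amqplib";

const conn = await amqp.connect("amqp://guest:guest@broker:5672");
const ch = await conn.createChannel();

// Direct durable exchange, as the service inventory below describes.
await ch.assertExchange("lagoon-logs", "direct", { durable: true });

// Two classic durable queues bound with the same routing key.
for (const queue of ["consumer-a", "consumer-b"]) {
  await ch.assertQueue(queue, { durable: true });
  await ch.bindQueue(queue, "lagoon-logs", "task.logs");
}

// One publish is now routed to both queues -- the exact scenario the
// 4.1.1 and 4.1.5 message store fixes harden.
ch.publish("lagoon-logs", "task.logs", Buffer.from(JSON.stringify({ event: "deploy" })));
```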
Medium Relevance
| Release | Fix | Why it matters to Lagoon |
|---|---|---|
| 4.1.6 | Feature flag state in the registry and on disk was not consistent during node boot | `broker-job.sh` enables feature flags at startup — inconsistency here could cause flaky startup behaviour |
| 4.1.6 | Enabling the `khepri_db` feature flag while the Log Exchange was enabled could cause a node to run out of memory and crash | Relevant if Khepri migration is ever attempted |
| 4.1.8 | Default queue type handling is now more defensive — avoids `PRECONDITION_FAILED` when no default queue type (DQT) is set on a vhost | Prevents unexpected errors on first-run or fresh database scenarios |
| 4.1.8 | When a client that owns an exclusive queue disconnects, immediately reconnects, and redeclares the same queue, the node could delete the new queue | Relevant to services that use auto-reconnect logic (all of Lagoon's Go and Node.js services do; see the reconnect sketch after this table) |
| 4.1.3 | Quorum queue file descriptor leak fix | Relevant if Lagoon ever migrates to quorum queues |
| 4.1.2 | Channels consuming from quorum queues could leak file handles when those queues were deleted | As above |
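The auto-reconnect behaviour this fix touches is what amqp-connection-manager provides for the Node.js services: every setup function is re-run after a reconnect, so queues are redeclared in exactly the disconnect-then-redeclare window the 4.1.8 fix hardens. A minimal sketch (URL and queue name are illustrative):

```typescript
import amqp from "amqp-connection-manager";
import type { ConfirmChannel } from "amqplib";

// The manager reconnects automatically and re-runs setup() on the new
// connection, redeclaring the queue right after the old one dropped.
const connection = amqp.connect(["amqp://guest:guest@broker:5672"]);

const channelWrapper = connection.createChannel({
  setup: (channel: ConfirmChannel) =>
    channel.assertQueue("example-tasks", { durable: true }),
});

// Publishes made while disconnected are buffered and flushed once
// setup() has re-established the topology.
await channelWrapper.sendToQueue("example-tasks", Buffer.from("payload"));
```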
Low Relevance
| Release | Fix |
|---|---|
| 4.1.1 | Quorum queue failed to recover from rare timeout during cluster formation |
| 4.1.1 | Private key password could appear in certain exceptions at failed boot |
| 4.1.2 | Higher-priority SAC consumer was never activated in certain requeue scenarios |
| 4.1.4 | Import of definition files containing topic exchange permissions failed |
| 4.1.5 | Classic queue message loss during message store compaction (a separate issue from the 4.1.1 fix) |
| 4.1.8 | Topic exchange binding deletions could leave orphaned trie edges in Khepri projection (memory leak) |
Enhancements of Note
| Release | Enhancement |
|---|---|
| 4.1.4 | `RABBITMQ_MAX_OPEN_FILES` environment variable supported by the startup script — useful in Kubernetes environments where soft limits are lower than hard limits |
| 4.1.1 | New health check endpoints: `GET /api/health/checks/ready-to-serve-clients` and `GET /api/health/checks/below-node-connection-limit` (probe sketch after this table) |
| 4.1.0 | Larger JWT tokens (up to 8192 bytes) supported before authentication — relevant to Lagoon's Keycloak-issued tokens |
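If Lagoon ever moves beyond TCP wait-for checks, the 4.1.1 readiness endpoint is straightforward to probe. A hedged sketch, assuming the management listener on its default port, default credentials, and Node 18+ for the global fetch (all illustrative):

```typescript
// HTTP basic auth against the management API; credentials are illustrative.
const res = await fetch(
  "http://broker:15672/api/health/checks/ready-to-serve-clients",
  {
    headers: {
      Authorization: "Basic " + Buffer.from("guest:guest").toString("base64"),
    },
  },
);

console.log(res.ok ? "broker ready to serve clients" : `not ready (HTTP ${res.status})`);
```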
Lagoon Service Inventory
Services that connect to RabbitMQ, with their client libraries, protocols, and queue types:
| Service | Language | Library | Queue Type |
|---|---|---|---|
| webhook-handler | Node.js | amqp-connection-manager / amqplib ^0.10.7 | Classic durable |
| webhooks2tasks | Node.js | amqp-connection-manager / amqplib ^0.10.7 | Classic durable |
| commons (shared) | Node.js | amqp-connection-manager / amqplib ^0.10.7 | Classic durable |
| api | Node.js | amqp-connection-manager / amqplib ^0.10.7 | Classic durable |
| actions-handler | Go | cheshir/go-mq (AMQP 0-9-1) | Classic durable |
| logs2notifications | Go | cheshir/go-mq (AMQP 0-9-1) | Classic durable |
| backup-handler | Go | isayme/go-amqp-reconnect / streadway/amqp (AMQP 0-9-1) | Classic durable |
All services use AMQP 0-9-1 with classic durable queues and direct exchanges. None use MQTT, streams, quorum queues, or AMQP 1.0.
Plugins
| Plugin | Version | Notes |
|---|---|---|
| rabbitmq_delayed_message_exchange | 4.1.0 | Used for the lagoon-actions-delay and lagoon-webhooks-delay exchanges. No changes required — plugin version is compatible with RabbitMQ 4.1.x. |
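For context, rabbitmq_delayed_message_exchange works by registering a custom exchange type and reading a per-message delay header. A hedged sketch of declaring and publishing to such an exchange from amqplib (routing key and delay value are illustrative):

```typescript
import amqp from "amqplib";

const conn = await amqp.connect("amqp://guest:guest@broker:5672");
const ch = await conn.createChannel();

// The plugin registers the "x-delayed-message" exchange type; the
// underlying routing behaviour is set via the "x-delayed-type" argument.
await ch.assertExchange("lagoon-actions-delay", "x-delayed-message", {
  durable: true,
  arguments: { "x-delayed-type": "direct" },
});

// The "x-delay" header (milliseconds) defers delivery of this message.
ch.publish("lagoon-actions-delay", "actions", Buffer.from("{}"), {
  headers: { "x-delay": 30_000 },
});
```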
Upgrade Path
This is a single-node local development broker (not a cluster). The upgrade path from 4.1.0 → 4.1.8 requires:
1. `make build/broker` to rebuild the image
2. `make down && make up` (or equivalent) to replace the running container
No data migration, feature flag enabling, or post-upgrade procedures are required for the 4.1.0 → 4.1.8 patch upgrade.
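One quick post-upgrade verification is to read the broker version from the management API. A hedged sketch, assuming default management port and credentials (both illustrative):

```typescript
// /api/overview reports the running broker version among other stats.
const res = await fetch("http://broker:15672/api/overview", {
  headers: {
    Authorization: "Basic " + Buffer.from("guest:guest").toString("base64"),
  },
});

const overview = await res.json();
console.log("running RabbitMQ", overview.rabbitmq_version); // expect "4.1.8"
```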
Verdict
Safe to upgrade. No action required beyond rebuilding the image. The 4.1.5 classic queue message loss bug fix is the most significant improvement and is a meaningful reliability gain for Lagoon's messaging pipeline.
This pull request updates the base image version for RabbitMQ in `services/broker/Dockerfile` to improve stability and security.
Dependency update: `rabbitmq:4.1.0-management-alpine` to `rabbitmq:4.1.8-management-alpine` in `services/broker/Dockerfile`.
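Assuming the Dockerfile uses the usual single FROM line, the change amounts to:

```diff
-FROM rabbitmq:4.1.0-management-alpine
+FROM rabbitmq:4.1.8-management-alpine
```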