
K8SPS-497: handle SIGTERM in heartbeat entrypoint #1236

Draft
pooknull wants to merge 3 commits into main from K8SPS-497


pooknull commented Mar 16, 2026

https://perconadev.atlassian.net/browse/K8SPS-497

DESCRIPTION

This PR handles SIGTERM in heartbeat-entrypoint.sh so that the script exits instead of continuing to wait for MySQL initialization or clone completion. It also runs pt-heartbeat with exec so that pt-heartbeat receives shutdown signals directly.
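
As a rough illustration of the approach described above (not the actual contents of heartbeat-entrypoint.sh), a trap can set a flag that the wait loop checks on each iteration. The names `on_term` and `wait_for_file` are hypothetical; only `shutdown_requested` appears in the review excerpt on this page.

```shell
#!/bin/sh
# Hypothetical sketch: helper names are illustrative, not taken from the PR.
shutdown_requested=0

# Record the signal instead of exiting inside the handler, so the wait
# loop decides when to stop.
on_term() {
	shutdown_requested=1
}
trap on_term TERM INT

# wait_for_file PATH: poll until PATH exists or a shutdown was requested.
# Returns 0 when the file appeared, 1 when interrupted by a signal.
wait_for_file() {
	target=$1
	until [ -e "$target" ]; do
		if [ "$shutdown_requested" -eq 1 ]; then
			return 1
		fi
		sleep 1
	done
	return 0
}
```

A caller could then write something like `wait_for_file "$DATA_DIR/mysql.sock" || exit 0` before starting the long-running process, assuming a clean exit is the desired reaction to SIGTERM.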

CHECKLIST

Jira

  • Is the Jira ticket created and referenced properly?
  • Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?
  • Does the Jira ticket link to the proper milestone (Fix Version field)?

Tests

  • Is an E2E test/test case added for the new feature/change?
  • Are unit tests added where appropriate?

Config/Logging/Testability

  • Are all needed new/changed options added to default YAML files?
  • Are all needed new/changed options added to the Helm Chart?
  • Did we add proper logging messages for operator actions?
  • Did we ensure compatibility with the previous version or cluster upgrade process?
  • Does the change support oldest and newest supported PS version?
  • Does the change support oldest and newest supported Kubernetes version?

Copilot AI review requested due to automatic review settings March 16, 2026 11:18
@pull-request-size pull-request-size bot added the size/S 10-29 lines label Mar 16, 2026

Copilot AI left a comment


Pull request overview

Adds graceful shutdown handling to the heartbeat container entrypoint so it can terminate cleanly (e.g., during Pod termination) instead of continuing to wait/start work after receiving SIGTERM.

Changes:

  • Added SIGTERM/SIGINT trap and a shutdown flag checked during the initialization/clone wait loops.
  • Switched to exec pt-heartbeat so pt-heartbeat becomes PID 1 and receives termination signals directly.
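
The effect of `exec` mentioned in the second bullet can be observed outside a container. The sketch below is only a demonstration: plain `sh` stands in for both the entrypoint and pt-heartbeat, and `first_line` and `second_line` are hypothetical helpers. With `exec`, the inner command keeps the PID of the shell that launched it, which is why an entrypoint running as PID 1 hands PID 1 to the program it execs.

```shell
#!/bin/sh
# Illustrative only: `sh` stands in for the entrypoint and for pt-heartbeat.

# With exec, the inner shell replaces the outer one and keeps its PID.
with_exec=$(sh -c 'echo $$; exec sh -c "echo \$\$"')

# Without exec, the inner shell runs as a child with a different PID.
# The trailing `true` keeps shells from implicitly exec-ing the last command.
without_exec=$(sh -c 'echo $$; sh -c "echo \$\$"; true')

# Hypothetical helpers that pick the outer and inner PID from the output.
first_line() { printf '%s\n' "$1" | sed -n 1p; }
second_line() { printf '%s\n' "$1" | sed -n 2p; }

if [ "$(first_line "$with_exec")" = "$(second_line "$with_exec")" ]; then
	echo "exec: inner command kept the outer PID"
fi
if [ "$(first_line "$without_exec")" != "$(second_line "$without_exec")" ]; then
	echo "no exec: inner command got a new PID"
fi
```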


@pooknull pooknull marked this pull request as ready for review March 17, 2026 10:08
Copilot AI review requested due to automatic review settings March 17, 2026 10:08

Copilot AI left a comment


Pull request overview

This PR improves shutdown behavior for the heartbeat sidecar/utility by handling SIGTERM in build/heartbeat-entrypoint.sh so the script can exit cleanly while waiting on MySQL init/clone, and ensures pt-heartbeat receives termination signals directly.

Changes:

  • Add SIGTERM/SIGINT trap and early-exit checks during MySQL init/clone wait loops.
  • Improve clone-wait loop robustness (quoting in numeric comparisons; 1-second sleep intervals for responsiveness).
  • Run pt-heartbeat via exec so it becomes PID 1 and receives shutdown signals directly.
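
To illustrate the second bullet, here is a minimal sketch of sleeping in 1-second slices with quoted numeric comparisons. It is not the code from the PR: `sleep_or_shutdown` is a hypothetical helper, and `shutdown_requested` is assumed to be set to 1 by a signal trap elsewhere in the script.

```shell
#!/bin/sh
# Hypothetical sketch; in a real script a trap would set this flag to 1.
shutdown_requested=0

# sleep_or_shutdown SECONDS: sleep in 1-second slices and check the flag
# between slices. Returns 1 if shutdown was requested, 0 otherwise.
sleep_or_shutdown() {
	remaining=$1
	# Quoting keeps the test a three-argument expression even if the
	# variable is empty; unquoted, it would collapse to `[ -gt 0 ]`.
	while [ "$remaining" -gt 0 ]; do
		if [ "$shutdown_requested" -eq 1 ]; then
			return 1
		fi
		sleep 1
		remaining=$((remaining - 1))
	done
	[ "$shutdown_requested" -ne 1 ]
}
```

Short slices matter because a shell runs its trap handler only after the foreground `sleep` returns, so one long sleep delays the reaction to SIGTERM by up to its full duration. An alternative not shown in this PR is to run `sleep` in the background and block on `wait`, which returns as soon as a trapped signal arrives.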


Comment on lines 13 to 18
until [ ! -f "$DATA_DIR/bootstrap.lock" ] && [ ! -f "$DATA_DIR/clone.lock" ] && [ -S "$DATA_DIR/mysql.sock" ]; do
    if [ "$shutdown_requested" -eq 1 ]; then
        exit 0
    fi
    echo '[INFO] Waiting for MySQL initialization ...'
    sleep 10
    if [ "$shutdown_requested" -eq 1 ]; then
        exit 0
    fi
    sleep 10

computer has a point, please consider implementing this

@JNKPercona

Test Name                          Result   Time
async-ignore-annotations-8-4       passed   00:06:13
async-global-metadata-8-4          passed   00:15:34
async-upgrade-8-0                  passed   00:13:26
async-upgrade-8-4                  passed   00:12:34
auto-config-8-4                    passed   00:24:58
config-8-4                         passed   00:16:47
config-router-8-0                  passed   00:07:29
config-router-8-4                  passed   00:07:25
demand-backup-minio-8-0            passed   00:20:10
demand-backup-minio-8-4            passed   00:18:45
demand-backup-cloud-8-4            passed   00:22:38
demand-backup-retry-8-4            passed   00:16:22
async-data-at-rest-encryption-8-0  passed   00:13:08
async-data-at-rest-encryption-8-4  passed   00:13:05
gr-global-metadata-8-4             passed   00:13:32
gr-data-at-rest-encryption-8-0     passed   00:14:06
gr-data-at-rest-encryption-8-4     passed   00:14:38
gr-demand-backup-minio-8-4         passed   00:12:59
gr-demand-backup-cloud-8-4         passed   00:21:51
gr-demand-backup-haproxy-8-4       passed   00:10:32
gr-finalizer-8-4                   passed   00:06:14
gr-haproxy-8-0                     passed   00:04:42
gr-haproxy-8-4                     passed   00:04:03
gr-ignore-annotations-8-4          passed   00:04:51
gr-init-deploy-8-0                 passed   00:10:20
gr-init-deploy-8-4                 passed   00:09:28
gr-one-pod-8-4                     passed   00:05:34
gr-recreate-8-4                    passed   00:17:45
gr-scaling-8-4                     passed   00:07:54
gr-scheduled-backup-8-4            passed   00:16:04
gr-security-context-8-4            passed   00:09:43
gr-self-healing-8-4                passed   00:22:36
gr-tls-cert-manager-8-4            passed   00:08:39
gr-users-8-4                       passed   00:05:28
gr-upgrade-8-0                     passed   00:09:47
gr-upgrade-8-4                     passed   00:10:25
haproxy-8-0                        passed   00:08:18
haproxy-8-4                        passed   00:08:30
init-deploy-8-0                    passed   00:05:37
init-deploy-8-4                    passed   00:05:38
limits-8-4                         passed   00:06:14
monitoring-8-4                     passed   00:13:49
one-pod-8-0                        passed   00:05:42
one-pod-8-4                        passed   00:05:25
operator-self-healing-8-4          passed   00:11:34
pvc-resize-8-4                     passed   00:05:56
recreate-8-4                       passed   00:12:45
scaling-8-4                        passed   00:11:41
scheduled-backup-8-0               passed   00:17:14
scheduled-backup-8-4               failure  00:21:06
service-per-pod-8-4                passed   00:06:38
sidecars-8-4                       passed   00:04:27
smart-update-8-4                   passed   00:09:44
storage-8-4                        passed   00:04:08
telemetry-8-4                      passed   00:06:11
tls-cert-manager-8-4               passed   00:09:43
users-8-0                          passed   00:08:34
users-8-4                          passed   00:07:53
version-service-8-4                passed   00:20:49

Summary          Value
Tests Run        59/59
Job Duration     01:40:46
Total Test Time  11:07:53

commit: 862a05f
image: perconalab/percona-server-mysql-operator:PR-1236-862a05f9

@hors hors added this to the v1.2.0 milestone Mar 17, 2026
@hors hors marked this pull request as draft March 17, 2026 16:11

Labels

size/S 10-29 lines


5 participants