Backup fails: peer-list cannot access POD_NAMESPACE in sub-shell #2345
Description
Report
Backup fails with empty CLUSTER_SIZE - peer-list unable to access POD_NAMESPACE in sub-shell
More about the problem
Scheduled backups consistently fail because the peer-list command cannot access the POD_NAMESPACE environment variable when called from within the backup script's sub-shell.
The backup pod logs show:
++ get_backup_source
+++ /opt/percona/peer-list -on-start=/opt/percona/backup/lib/pxc/get-pxc-state.sh -service=pxc-db-pxc
+++ grep wsrep_cluster_size
+++ sort
+++ tail -1
+++ cut -d : -f 12
++ CLUSTER_SIZE=
++ '[' -z '' ']'
++ exit 1
The CLUSTER_SIZE variable remains empty, causing the backup to fail immediately.
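To make the failure mode concrete, here is a small sketch of the extraction pipeline from the trace above, run against a made-up input line. The `sample` line and the `extract_cluster_size` helper are hypothetical: I am assuming the state script prints colon-separated key/value pairs with the cluster size in field 12, which is what the `cut -d : -f 12` suggests.

```shell
#!/bin/sh
# Hypothetical sample line: the real get-pxc-state.sh output format is assumed, not confirmed.
sample='node:10.0.0.1:wsrep_ready:ON:wsrep_connected:ON:wsrep_local_state_comment:Synced:wsrep_cluster_status:Primary:wsrep_cluster_size:3'

# Same filter chain as in the trace: keep matching lines, take the last one, cut field 12.
extract_cluster_size() {
  grep wsrep_cluster_size | sort | tail -1 | cut -d : -f 12
}

# With a state line present, field 12 holds the cluster size.
with_state=$(printf '%s\n' "$sample" | extract_cluster_size)

# With only peer-finder log lines (no wsrep_cluster_size), grep matches nothing
# and the whole pipeline yields an empty string.
without_state=$(printf '%s\n' '2026/01/17 15:20:32 Peer finder enter' | extract_cluster_size)

echo "with state line:    [$with_state]"
echo "without state line: [$without_state]"
```

If peer-list never gets as far as running the on-start script, nothing on its output contains wsrep_cluster_size, and the result is the empty CLUSTER_SIZE seen in the trace.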
Steps to reproduce
- Deploy Percona XtraDB Cluster with the operator
- Configure backup storage with containerOptions.env including POD_NAMESPACE:
backup:
  storages:
    s3-ovh:
      containerOptions:
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      s3:
        bucket: my-bucket
        credentialsSecret: my-secret
        region: eu-west-par
- Create a backup (scheduled or manual)
- Backup fails with empty CLUSTER_SIZE
However, manual testing works:
# Exec into backup pod
kubectl exec -it <backup-pod> -n mysql -c xtrabackup -- sh
# Variable is set ✅
$ echo $POD_NAMESPACE
mysql
# Manual call works ✅
$ /opt/percona/peer-list -on-start=/opt/percona/backup/lib/pxc/get-pxc-state.sh -service=pxc-db-pxc
2026/01/17 15:20:32 Peer finder enter
2026/01/17 15:20:32 Peer list updated
now [pxc-db-pxc-0... pxc-db-pxc-1... pxc-db-pxc-2...]
Root cause:
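The /opt/percona/backup/backup.sh script calls peer-list inside a command substitution $(...) without exporting POD_NAMESPACE first. The substitution's sub-shell still sees the shell's own variables, but an external process started from it receives only exported ones, so the peer-list Go binary cannot see POD_NAMESPACE.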
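The distinction can be checked in isolation with a minimal sketch. `DEMO_NAMESPACE` and `child_view` are hypothetical stand-ins for POD_NAMESPACE and the peer-list binary; nothing here calls the real tools.

```shell
#!/bin/sh
# Start from a clean slate so the demo does not depend on the caller's environment.
unset DEMO_NAMESPACE

# Stand-in for an external binary such as peer-list: a separate process
# that prints what it finds in its own environment.
child_view() {
  sh -c 'echo "child sees: [${DEMO_NAMESPACE:-}]"'
}

DEMO_NAMESPACE=mysql                        # plain shell variable, not exported
inside_subshell=$(echo "$DEMO_NAMESPACE")   # the command substitution itself still sees it
not_exported=$(child_view)                  # the external process does not

export DEMO_NAMESPACE
exported=$(child_view)                      # after export, the external process sees it

echo "subshell sees: [$inside_subshell]"
echo "before export, $not_exported"
echo "after export,  $exported"

# Lists the variable only if it is marked for export in this shell.
export -p | grep DEMO_NAMESPACE
```

Running `export -p | grep POD_NAMESPACE` from the same context as backup.sh (rather than from an interactive `kubectl exec` shell) would show whether the variable is actually exported at the point where peer-list is called.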
function get_backup_source() {
    CLUSTER_SIZE=$(/opt/percona/peer-list ...) # POD_NAMESPACE not visible here
    ...
}
Versions
- Operator version: 1.18.0
- PXC image: percona/percona-xtradb-cluster:8.4
- Backup image: percona/percona-xtrabackup:8.4.0-3.1
- Deployment method: Helm chart (pxc-db-1.18.0)
Anything else?
Proposed fix:
Add export POD_NAMESPACE at the beginning of the get_backup_source() function in /opt/percona/backup/backup.sh:
function get_backup_source() {
    export POD_NAMESPACE # <-- Add this line
    CLUSTER_SIZE=$(/opt/percona/peer-list -on-start=/opt/percona/backup/lib/pxc/get-pxc-state.sh -service="$PXC_SERVICE" 2>&1 \
        | grep wsrep_cluster_size \
        | sort \
        | tail -1 \
        | cut -d : -f 12)
    ...
}
Alternative fix:
Pass -ns explicitly to all peer-list calls:
CLUSTER_SIZE=$(/opt/percona/peer-list -ns="${POD_NAMESPACE}" -on-start=... -service="$PXC_SERVICE" ...)
I'm not sure if this is a bug or if I'm missing something in my configuration. Any guidance would be appreciated!