Description
When a PerconaPGCluster is deleted with the percona.com/delete-pvc, percona.com/delete-ssl, and percona.com/delete-backups finalizers set, the secrets that those finalizers delete can be recreated by the crunchy reconciler and left behind after the cluster is fully gone.
Root Cause
The deletion flow in (*PGClusterReconciler) Reconcile is:
1. runFinalizers() ← deletes secrets HERE
2. Delete(postgresCluster) ← crunchy DeletionTimestamp set HERE
The delete-pvc finalizer deletes user secrets (labeled role=pguser) and the delete-ssl finalizer deletes all TLS secrets. These deletions happen before Delete(postgresCluster) is called, meaning the crunchy PostgresCluster is still fully alive and its reconciler is operational.
The crunchy reconciler registers Owns(&corev1.Secret{}) in its watch setup. When the secrets are deleted, the watch on owned Secrets immediately enqueues a reconcile request for the owning PostgresCluster. If that reconcile runs before Delete(postgresCluster) sets a DeletionTimestamp, the crunchy reconciler sees no deletion in progress and recreates all the missing secrets via its normal reconciliation path.
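The ordering problem above can be replayed with a small in-memory model. This is only an illustrative sketch, not the operator's code: the cluster type, the reconcile and racyDelete functions, and the secret names are all hypothetical, and it does not use controller-runtime. It assumes the owned-object reconcile lands between step 1 and step 2.

```go
package main

import "fmt"

// cluster is a hypothetical stand-in for the crunchy PostgresCluster
// together with the secrets it owns.
type cluster struct {
	deleting bool            // models a non-nil DeletionTimestamp
	secrets  map[string]bool // names of secrets that currently exist
}

// reconcile models the crunchy reconciler: while no deletion is in
// progress, it recreates every secret that is missing.
func (c *cluster) reconcile(want []string) {
	if c.deleting {
		return
	}
	for _, name := range want {
		c.secrets[name] = true
	}
}

// racyDelete replays the current ordering and returns the secrets that
// still exist once deletion has been marked.
func racyDelete(want []string) []string {
	c := &cluster{secrets: map[string]bool{}}
	c.reconcile(want) // steady state: all secrets exist

	for _, name := range want { // 1. runFinalizers() deletes the secrets
		delete(c.secrets, name)
	}
	c.reconcile(want) // owned-object event is handled before step 2
	c.deleting = true // 2. Delete(postgresCluster) sets DeletionTimestamp

	var leftover []string
	for _, name := range want {
		if c.secrets[name] {
			leftover = append(leftover, name)
		}
	}
	return leftover
}

func main() {
	// Secret names are made up for illustration.
	fmt.Println(racyDelete([]string{"example-pguser", "example-tls"}))
}
```

In this model every deleted secret is reported as leftover, because the reconcile that runs between the two steps sees deleting == false and restores the full set.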
Why delete-backups makes it consistently reproducible
The delete-backups finalizer triggers deleteBackups, which deletes PerconaPGBackup objects. Each deleted backup object has an internal.percona.com/delete-backup finalizer, so its backup controller reconciler runs finishBackup. That function repeatedly:
- calls c.Status().Update(crunchyCluster) (clearing ManualBackup status), which directly enqueues the crunchy reconciler
- updates the backup Job object (removing FinalizerKeepJob), another owned-object event that re-enqueues the crunchy reconciler
- retries every 5 seconds while waiting for AnnotationBackupInProgress to clear
Each of these writes repeatedly wakes the crunchy reconciler over several seconds, making the race window large enough to hit reliably.
Expected Behavior
Secrets deleted by delete-pvc / delete-ssl finalizers should not be recreated. After the cluster is fully gone, no secrets belonging to it should remain.
Actual Behavior
Secrets are deleted by the finalizers, then immediately recreated by the crunchy reconciler (triggered by owned-object deletion events and backup controller writes to the crunchy cluster), and are left behind permanently after the PerconaPGCluster is gone.
Affected Components
- finalizer.go: deletePVCAndSecrets, deleteTLSSecrets, runFinalizers
- controller.go: deletion flow ordering
- controller.go: finishBackup concurrent writes to crunchy cluster
Fix Direction
The delete-pvc and delete-ssl finalizers must only run after the crunchy PostgresCluster is fully gone (i.e., after the wait for PostgresCluster deletion in the reconcile loop), not before. Deleting secrets while the crunchy reconciler is still operational will always be racy.
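A minimal sketch of the proposed ordering, again using a hypothetical in-memory model rather than the operator's actual types: the cluster struct, the gone flag, and orderedDelete are assumptions made for illustration. The secret-deleting finalizers run only after the modeled PostgresCluster has disappeared, so a late reconcile has nothing left to act on.

```go
package main

import "fmt"

// cluster is a hypothetical stand-in for the crunchy PostgresCluster
// together with the secrets it owns.
type cluster struct {
	deleting bool            // models a non-nil DeletionTimestamp
	gone     bool            // models the object being fully removed
	secrets  map[string]bool // names of secrets that currently exist
}

// reconcile models the crunchy reconciler: it only recreates missing
// secrets while the cluster exists and is not being deleted.
func (c *cluster) reconcile(want []string) {
	if c.deleting || c.gone {
		return
	}
	for _, name := range want {
		c.secrets[name] = true
	}
}

// orderedDelete replays the proposed ordering and returns the secrets
// that still exist at the end.
func orderedDelete(want []string) []string {
	c := &cluster{secrets: map[string]bool{}}
	c.reconcile(want) // steady state: all secrets exist

	c.deleting = true // 1. Delete(postgresCluster) comes first
	c.reconcile(want) // any reconcile now sees a deletion in progress
	c.gone = true     // 2. wait until the PostgresCluster is fully gone

	for _, name := range want { // 3. only now run delete-pvc / delete-ssl
		delete(c.secrets, name)
	}
	c.reconcile(want) // a late event cannot recreate anything

	var leftover []string
	for _, name := range want {
		if c.secrets[name] {
			leftover = append(leftover, name)
		}
	}
	return leftover
}

func main() {
	// Secret names are made up for illustration.
	fmt.Println(len(orderedDelete([]string{"example-pguser", "example-tls"})))
}
```

With this ordering the model ends with no leftover secrets regardless of how many reconciles are interleaved, since none of them can run while the cluster is both present and not being deleted.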
Steps to reproduce
# Deploy operator
kubectl apply --server-side -f https://raw.githubusercontent.com/percona/percona-postgresql-operator/v2.9.0/deploy/bundle.yaml
# Deploy PG cluster
kubectl apply -f https://raw.githubusercontent.com/percona/percona-postgresql-operator/v2.9.0/deploy/cr.yaml
# Set the delete-pvc, delete-ssl, and delete-backups finalizers
kubectl patch perconapgcluster cluster1 --type=merge -p '{
"metadata": {
"finalizers": [
"percona.com/delete-pvc",
"percona.com/delete-ssl",
"percona.com/delete-backups"
]
}
}'
# Wait for cluster to be ready
kubectl wait --for=jsonpath='{.status.state}'=ready pg cluster1
# Delete cluster
kubectl delete pg cluster1
# Assert that the secrets that were supposed to be deleted have been recreated
# Note that their age is just a few seconds old
kubectl get secret
Versions
- Kubernetes - 1.33.2
- Operator - 2.9.0
- Database - 18.3-1