What happened:
All requests failed with 403 for one replica, while the other replica had no such issue:
```html
<html>
<head><title>403 Forbidden</title></head>
<body>
<center><h1>403 Forbidden</h1></center>
<hr><center>nginx</center>
</body>
</html>
```
Found the following logs from the ingress-nginx controller:
```
I0906 15:10:14.218599 7 controller.go:196] "Configuration changes detected, backend reload required"
I0906 15:10:14.276895 7 controller.go:216] "Backend successfully reloaded"
I0906 15:10:14.277043 7 event.go:377] Event(v1.ObjectReference{Kind:"Pod", Namespace:"test", Name:"nginx-ingress-controller-f454d46db-fl5zk", UID:"c7e7ed06-de43-4330-b3ad-51dc290b9494", APIVersion:"v1", ResourceVersion:"1215888925", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
I0906 15:10:14.283794 7 store.go:645] "secret was updated and it is used in ingress annotations. Parsing" secret="test/client.ca.truststore"
2025/09/06 15:10:14 [emerg] 27#27: cannot load certificate "/etc/ingress-controller/ssl/test-client-certificate.pem": PEM_read_bio_X509_AUX() failed (SSL: error:0480006C:PEM routines::no start line:Expecting: TRUSTED CERTIFICATE)
```
The issue was resolved after deleting the pod; the new pod had no such issue.
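For reference, the `[emerg] ... no start line` error above is what nginx reports when a PEM file has no `-----BEGIN ...-----` header, which is consistent with reading an empty (truncated) file. As an illustration only (not part of the controller), here is a minimal Go sketch of the same check, using the path from the log:

```go
package main

import (
	"encoding/pem"
	"fmt"
	"os"
)

func main() {
	// Path taken from the emerg log line above; adjust for your environment.
	data, err := os.ReadFile("/etc/ingress-controller/ssl/test-client-certificate.pem")
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	block, _ := pem.Decode(data)
	if block == nil {
		// An empty or truncated file has no "-----BEGIN ...-----" line,
		// which is what nginx reports as "no start line".
		fmt.Printf("no PEM block found (%d bytes read)\n", len(data))
		return
	}
	fmt.Println("found PEM block of type:", block.Type)
}
```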
What you expected to happen:
NGINX should handle requests normally, the same as the other replica.
Our Ingress has the following annotations and spec:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/auth-tls-pass-certificate-to-upstream: "true"
    nginx.ingress.kubernetes.io/auth-tls-secret: test/client.ca.truststore
    nginx.ingress.kubernetes.io/auth-tls-verify-client: optional
  name: test-ingress
  namespace: test
spec:
  rules:
  - host: <host>
    http:
      paths:
      - backend:
          service:
            name: front
            port:
              number: 8080
        pathType: ImplementationSpecific
  tls:
  - hosts:
    - <host>
    secretName: client-certificate
```
From my analysis, it seems one of the certificate secrets changed, which caused all certificate files to be written again:
https://github.com/kubernetes/ingress-nginx/blob/main/internal/ingress/controller/store/store.go#L648
https://github.com/kubernetes/ingress-nginx/blob/main/internal/ingress/controller/store/store.go#L1027
https://github.com/kubernetes/ingress-nginx/blob/main/internal/ingress/controller/store/backend_ssl.go#L38
https://github.com/kubernetes/ingress-nginx/blob/main/internal/ingress/controller/store/backend_ssl.go#L76
https://github.com/kubernetes/ingress-nginx/blob/main/internal/ingress/controller/store/store.go#L958-L1000
while the reload happened due to a configuration change, from https://github.com/kubernetes/ingress-nginx/blob/main/internal/ingress/controller/controller.go#L213.
It looks like a race condition to me: nginx read an empty certificate file, truncated while being written from ingress-nginx/internal/net/ssl/ssl.go line 183 (at a031a08):

```go
err := os.WriteFile(pemFileName, []byte(sslCert.PemCertKey), file.ReadWriteByUser)
```

and that caused the issue.
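For context: `os.WriteFile` opens the target with `O_TRUNC` before writing, so there is a window in which the file on disk is empty. A common mitigation is to write to a temporary file in the same directory and atomically rename it over the target. A minimal sketch below; this is illustrative, not a proposal of the project's actual code, and `writeFileAtomic` is a hypothetical helper name:

```go
package main

import (
	"os"
	"path/filepath"
)

// writeFileAtomic is a hypothetical helper (not ingress-nginx code): it writes
// to a temp file in the same directory and renames it over the target, so a
// concurrent reader sees either the old or the new content, never a truncated file.
func writeFileAtomic(path string, data []byte, mode os.FileMode) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // best effort; fails harmlessly after a successful rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Chmod(mode); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	// rename(2) is atomic on POSIX filesystems.
	return os.Rename(tmp.Name(), path)
}

func main() {
	if err := writeFileAtomic("cert.pem", []byte("-----BEGIN CERTIFICATE-----\n"), 0o600); err != nil {
		panic(err)
	}
}
```

With the rename approach, nginx would observe either the complete old certificate or the complete new one during a reload, never an empty file.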
NGINX Ingress controller version (exec into the pod and run /nginx-ingress-controller --version):
```
/etc/nginx $ /nginx-ingress-controller --version
-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:       v1.13.1
  Build:         c8ce0d146a53fbdb94848548068001909767e2de
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.27.1
-------------------------------------------------------------------------------
```
Kubernetes version (use kubectl version):
```
Server Version: version.Info{Major:"1", Minor:"32", GitVersion:"v1.32.7", GitCommit:"158eee9fac884b429a92465edd0d88a43f81de34", GitTreeState:"clean", BuildDate:"2025-07-15T18:00:33Z", GoVersion:"go1.23.10", Compiler:"gc", Platform:"linux/amd64"}
```
How to reproduce this issue:
It's hard to reproduce. We applied the same change to many clusters (over 200), and only two had this issue.
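While the end-to-end controller race is timing-dependent, the underlying truncation window of `os.WriteFile` can be demonstrated in isolation. A minimal standalone sketch (not the controller's code; it may need several runs to hit the window):

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

func main() {
	const path = "cert.pem"
	payload := []byte("-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n")
	if err := os.WriteFile(path, payload, 0o600); err != nil {
		panic(err)
	}
	var wg sync.WaitGroup
	wg.Add(2)
	// Writer: rewrites the file in a loop, as the controller does on a secret update.
	go func() {
		defer wg.Done()
		for i := 0; i < 100000; i++ {
			_ = os.WriteFile(path, payload, 0o600) // O_TRUNC first, then write
		}
	}()
	// Reader: plays the role of nginx loading the certificate during a reload.
	go func() {
		defer wg.Done()
		for i := 0; i < 100000; i++ {
			b, err := os.ReadFile(path)
			if err == nil && len(b) < len(payload) {
				fmt.Printf("read truncated file: %d bytes\n", len(b))
			}
		}
	}()
	wg.Wait()
}
```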
Anything else we need to know: