Description
What is the issue?
We are evaluating Flagger (1.42) in combination with Linkerd and the Gateway API CRDs to do canary rollouts.
While experimenting, we saw that canaries were not progressing and were stuck on this line on the Flagger CRD:
Warning Synced flagger HTTPRoute echo-blue.tools parent echo-blue is not ready
(status: True, observed generation: 0, current generation: 1)
The HTTPRoute's status object looks like this:
status:
  parents:
  - controllerName: linkerd.io/policy-controller
    parentRef:
      group: core
      kind: Service
      name: echo-blue
      namespace: tools
      port: 80
    conditions:
    - type: Accepted
      status: "True"
      reason: Accepted
This looks like the Linkerd controller happily reconciled the HTTPRoute.
When I looked at the Flagger source, I found an if-statement that blocks progress:
if condition.ObservedGeneration < currentGeneration {... }
And yes, if you look back at the YAML snippet above, there is no observed generation!
Looking into the Linkerd code, I saw that the observed generation is always set to None.
The Gateway API spec states that controllers must set this field to the object's metadata.generation. This is how staleness can be detected.
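To make the failure mode concrete, here is a minimal sketch of the staleness comparison, assuming a missing observedGeneration is read as 0 (which matches the "observed generation: 0" in the warning). The parentCondition type and isStale helper are hypothetical names for illustration and are not taken from Flagger or the Gateway API packages.

```go
package main

import "fmt"

// parentCondition mirrors only the fields of a route parent condition that
// matter here. It is a hypothetical stand-in, not a Flagger or Gateway API type.
type parentCondition struct {
	Type               string
	Status             string
	ObservedGeneration int64 // stays 0 when the controller never sets it
}

// isStale reports whether the condition describes an older generation of the
// object than the one currently stored, i.e. the controller has not caught up.
func isStale(c parentCondition, currentGeneration int64) bool {
	return c.ObservedGeneration < currentGeneration
}

func main() {
	// Condition as reported in this issue: Accepted=True, no observedGeneration.
	unset := parentCondition{Type: "Accepted", Status: "True"}
	// The same condition if the controller had copied metadata.generation.
	set := parentCondition{Type: "Accepted", Status: "True", ObservedGeneration: 1}

	fmt.Println(isStale(unset, 1)) // true: treated as not ready
	fmt.Println(isStale(set, 1))   // false: treated as up to date
}
```

With the field unset, the comparison stays true for every generation of 1 or higher, so the route never looks ready no matter how often it is reconciled.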
Interestingly, version 1.41 of Flagger does not have that if-statement. It was added to alleviate some race conditions with cloud providers.
How can it be reproduced?
Let the Linkerd controller reconcile any HTTPRoute for any reason. The resulting status will not have an observedGeneration even though it should.
Logs, error output, etc
Not a Linkerd error, but an error from Flagger setting an expectation for what Linkerd should do:
Warning Synced flagger HTTPRoute echo-blue.tools parent echo-blue is not ready
(status: True, observed generation: 0, current generation: 1)
output of linkerd check -o short
▧ ❯ linkerd check -o short
linkerd-identity
‼ issuer cert is valid for at least 60 days
issuer certificate will expire on 2026-03-23T08:31:14Z
see https://linkerd.io/2.14/checks/#l5d-identity-issuer-cert-not-expiring-soon for hints
linkerd-webhooks-and-apisvc-tls
‼ proxy-injector cert is valid for at least 60 days
certificate will expire on 2026-03-17T16:01:04Z
see https://linkerd.io/2.14/checks/#l5d-proxy-injector-webhook-cert-not-expiring-soon for hints
‼ sp-validator cert is valid for at least 60 days
certificate will expire on 2026-03-17T16:01:04Z
see https://linkerd.io/2.14/checks/#l5d-sp-validator-webhook-cert-not-expiring-soon for hints
‼ policy-validator cert is valid for at least 60 days
certificate will expire on 2026-03-17T16:01:08Z
see https://linkerd.io/2.14/checks/#l5d-policy-validator-webhook-cert-not-expiring-soon for hints
linkerd-version
‼ cli is up-to-date
unsupported version channel: stable-2.14.10
see https://linkerd.io/2.14/checks/#l5d-version-cli for hints
control-plane-version
‼ control plane is up-to-date
unsupported version channel: enterprise-2.19.4
see https://linkerd.io/2.14/checks/#l5d-version-control for hints
‼ control plane and cli versions match
control plane running enterprise-2.19.4 but cli running stable-2.14.10
see https://linkerd.io/2.14/checks/#l5d-version-control for hints
linkerd-control-plane-proxy
× control plane proxies are healthy
pod "linkerd-destination-7fc4df9d8d-dpvgz" status is Running
see https://linkerd.io/2.14/checks/#l5d-cp-proxy-healthy for hints
/ Running viz extension check
(this hung after a while, we don't have viz installed, I believe)
Environment
kubectl version
Client Version: v1.34.1
Kustomize Version: v5.7.1
Server Version: v1.35.0-33+37970203ae1a44
Running on Amazon EKS.
Possible solution
I have looked at the code and I think it's a matter of threading metadata.generation through and patching it.
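As a rough illustration of the shape of that change (not the actual Linkerd code, which would look different), the sketch below passes the route's generation into the function that builds the Accepted condition. The route and condition types and the acceptedCondition helper are hypothetical and exist only for this example.

```go
package main

import "fmt"

// route is a hypothetical stand-in for the reconciled HTTPRoute. Only the
// fields needed for this sketch are modeled.
type route struct {
	Name       string
	Generation int64 // value of metadata.generation at reconcile time
}

// condition is a hypothetical stand-in for a status condition entry.
type condition struct {
	Type               string
	Status             string
	Reason             string
	ObservedGeneration int64
}

// acceptedCondition builds the Accepted condition and records which
// generation of the route was reconciled, instead of leaving the field unset.
func acceptedCondition(r route) condition {
	return condition{
		Type:               "Accepted",
		Status:             "True",
		Reason:             "Accepted",
		ObservedGeneration: r.Generation,
	}
}

func main() {
	c := acceptedCondition(route{Name: "echo-blue", Generation: 1})
	fmt.Printf("%s=%s observedGeneration=%d\n", c.Type, c.Status, c.ObservedGeneration)
}
```

The point of the design is that the generation travels with the route into the status builder, so a consumer such as Flagger can compare it against the current generation.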
Additional context
No response
Would you like to work on fixing this bug?
yes