Summary
When an external VirtualService with an invalid spec exists in the cluster, the net-istio-controller pod crashes during informer sync and enters CrashLoopBackOff. This happens because the VirtualService informer watches all VirtualServices cluster-wide, so a single malformed VirtualService can cause the informer's list/watch operation to fail during validation.
Problem
Root Cause
The VirtualService informer watches all VirtualServices in the cluster without filtering. When an invalid VS exists (e.g., created by another controller like Voyager), the informer fails to sync, causing the controller to crash.
Impact
- net-istio-controller pod enters CrashLoopBackOff
- Reconciliation of VirtualServices for Knative Services fails
- KServe InferenceServices and Knative Services fail to become ready
Example Scenario
- An external application (e.g., Voyager forum) creates a VirtualService with an invalid spec
- net-istio-controller starts and attempts to sync the VirtualService informer
- The informer fails to list/watch because of the invalid VirtualService spec
- The controller crashes before it can reconcile any Knative-owned VirtualService
Proposed Solution (not final, feedback welcome)
Add an opt-in environment variable ENABLE_VS_INFORMER_FILTERING_BY_LABEL that enables label-based filtering for the VirtualService informer.
When enabled, only VirtualServices with the networking.knative.dev/ingress label will be watched.
Key Points
- Opt-in approach: Default is false to maintain backward compatibility
- Follows existing pattern: Similar to ENABLE_SECRET_INFORMER_FILTERING_BY_CERT_UID for Secret filtering
- Label-based filtering: Uses the networking.knative.dev/ingress label, which is already present on Ingress-owned VirtualServices
Usage
Enable the feature by setting the environment variable:
env:
  - name: ENABLE_VS_INFORMER_FILTERING_BY_LABEL
    value: "true"
Verification
- Set ENABLE_VS_INFORMER_FILTERING_BY_LABEL=true
- Create an external VirtualService without the networking.knative.dev/ingress label
- Verify that net-istio-controller starts without crashing
- Deploy a Knative Service and verify that its VirtualService is created correctly
- Check that the controller logs show no errors related to the external VirtualService
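The expected outcome of these steps can be pictured with a small in-memory simulation. This is only a sketch: the virtualService struct is a hypothetical, trimmed-down stand-in for the Istio type, and the object names are made up. In a real cluster the selector would be evaluated by the API server during list/watch, not in the controller.

```go
package main

import "fmt"

// Label that is already present on Ingress-owned VirtualServices.
const ingressLabelKey = "networking.knative.dev/ingress"

// virtualService is a hypothetical stand-in that keeps only the fields
// needed to illustrate label-based filtering.
type virtualService struct {
	Name   string
	Labels map[string]string
}

// watched returns the objects that an "exists" selector on ingressLabelKey
// would match. The label value is irrelevant; only the key must be present.
func watched(all []virtualService) []virtualService {
	var out []virtualService
	for _, vs := range all {
		if _, ok := vs.Labels[ingressLabelKey]; ok {
			out = append(out, vs)
		}
	}
	return out
}

func main() {
	all := []virtualService{
		// External object without the Knative label: never reaches the informer.
		{Name: "external-vs", Labels: map[string]string{"app": "external"}},
		// Ingress-owned object: still watched and reconciled.
		{Name: "hello-ingress", Labels: map[string]string{ingressLabelKey: "hello"}},
	}
	for _, vs := range watched(all) {
		fmt.Println(vs.Name)
	}
}
```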