Add StorageVersionMigrator controller to clean up storedVersions#5011
Add StorageVersionMigrator controller to clean up storedVersions#5011ChrisJBurns wants to merge 11 commits into
Conversation
Prepares the operator manager for an upcoming controller that watches CustomResourceDefinition objects. No runtime behaviour change yet. Part of #4969. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the kubebuilder marker
+kubebuilder:metadata:labels=toolhive.stacklok.dev/auto-migrate-storage-version=true
to every v1beta1 root type and regenerates the CRD manifests. The label
is the opt-in signal for an upcoming StorageVersionMigrator controller
that will reconcile each CRD's status.storedVersions down to the current
storage version. List types are intentionally not labelled — the label
applies to the CRD itself, which is keyed on the root type.
Part of #4969.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces a new operator controller that watches opted-in toolhive.stacklok.dev CRDs and reconciles each one's status.storedVersions down to the current storage version, so a future release can drop deprecated versions (e.g. v1alpha1) from spec.versions without orphaning etcd objects. Mechanism: for each CR of a managed kind, issue a metadata-only Server-Side Apply against the /status subresource with a dedicated field manager (thv-storage-version-migrator). SSA on /status bypasses the admission chain (ToolHive has validating webhooks on several root types) while still forcing the API server to re-encode each object at the current storage version. Once every CR succeeds, the CRD's status.storedVersions is patched to [<currentStorageVersion>] with an optimistic-lock merge. Scoping: CRDs must carry both the auto-migrate label (set via kubebuilder marker in a previous commit) and a name under toolhive.stacklok.dev. Wired into main.go behind the ENABLE_STORAGE_VERSION_MIGRATOR feature flag (default on). Helm chart surface + envtest coverage land in subsequent commits. Part of #4969. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Seven table-driven scenarios exercising the reconciler against a real
kube-apiserver + etcd via envtest:
- noop when storedVersions is already [storageVersion]
- happy path: [v1alpha1,v1beta1] -> [v1beta1] with CRs re-stored
- foreign-group CRDs are skipped
- unlabelled toolhive CRDs are skipped
- fallback to main-resource SSA when /status is absent
- pagination across the continue-token loop (verified via a counting
APIReader wrapper since metadata-only SSAs leave no managedFields
fingerprint to assert on)
- partial failure leaves storedVersions untouched (failure injected
via a client wrapper that fails Apply for one specific CR key)
Each test builds its own CRD fixture and calls Reconcile directly; the
suite does not start a manager so tests are deterministic and free from
races with a background controller. Runs via `task operator-test-integration`
alongside the existing virtualmcp suite.
Part of #4969.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Guards against the main footgun of the opt-in-label design for storage-version migration: if a new CRD root type is added to cmd/thv-operator/api/v1beta1/ without the migrate-marker, the StorageVersionMigrator controller silently excludes it. The problem surfaces only when a future release tries to drop a deprecated version — at which point it is far too late to fix. The test scans every root type (marker block contains both +kubebuilder:object:root=true and +kubebuilder:storageversion) and fails unless the block also contains either the migrate marker or an explicit +thv:storage-version-migrator:exclude sibling marker. Part of #4969. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds operator.features.storageVersionMigrator (default true) plus the ENABLE_STORAGE_VERSION_MIGRATOR env var wiring in the deployment template, mirroring the existing server/registry/virtualMCP pattern. Helm docs regenerated. Admins who run kube-storage-version-migrator externally can disable the in-operator controller by setting this flag to false. Part of #4969. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers what the controller does, the opt-in label contract, how to disable the controller operator-wide, the per-CRD escape hatch and its interaction with GitOps, how the migrator interacts with future version-removal releases, and the required RBAC. Part of #4969. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jhrozek
left a comment
There was a problem hiding this comment.
A few things I'd like to flag. The two bigger ones are about the SSA body being empty and about swallowed conflicts before we trim storedVersions, curious whether those are intentional. The rest are smaller.
| if err := r.restoreOne(ctx, gvk, u, hasStatusSubresource); err != nil { | ||
| if apierrors.IsNotFound(err) || apierrors.IsConflict(err) { | ||
| logger.V(1).Info("skip CR — deleted or updated elsewhere", | ||
| "object", client.ObjectKeyFromObject(u), "err", err) |
There was a problem hiding this comment.
IsNotFound is fine to swallow, but I'm not sure IsConflict is. The comment says "storage is already fresh", but a concurrent writer using the deprecated API version (still served during the deprecation window) produces a write encoded at the old version. If our SSA races with that writer and comes back Conflict, the CR is still stored at the old version, we silently skip it, and then patchStoredVersions trims storedVersions to [storageVersion]. That leaves storedVersions claiming we're clean while there's still an old-version-encoded object in etcd.
Options: retry the apply on Conflict (bounded), or track a "swallowed-conflict" flag and refuse to trim storedVersions on the pass where any occurred. Either way I don't think we should treat Conflict the same as NotFound here.
There was a problem hiding this comment.
Agreed — fixed in dad30c9. restoreCRs now tracks per-pass conflict count in a function-local counter. IsConflict is logged at V(1) and skipped per-CR (still treated as transient), but if any conflicts occurred and no other errors did, the function returns a sentinel errMigrationRetriedDueToConflicts. Reconcile propagates it without calling patchStoredVersions, so storedVersions stays at [v1alpha1, v1beta1] until a conflict-free pass succeeds.
Added a new test ConflictDuringRestoreLeavesStoredVersionsUntouched that injects an apierrors.NewConflict(...) for one CR via a wrapping client, asserts the sentinel error is returned, asserts storedVersions is untouched, then drops the injection and verifies the next reconcile completes cleanly.
IsNotFound is still a clean swallow with no error — the object is gone, it can't be at the old version anymore.
|
|
||
| //+kubebuilder:rbac:groups=apiextensions.k8s.io,resources=customresourcedefinitions,verbs=get;list;watch | ||
| //+kubebuilder:rbac:groups=apiextensions.k8s.io,resources=customresourcedefinitions/status,verbs=update;patch | ||
| //+kubebuilder:rbac:groups=toolhive.stacklok.dev,resources=*,verbs=get;list;patch |
There was a problem hiding this comment.
Minor, just wanted to surface this for sign-off. The wildcard grants get/list/patch across every current and future kind in the toolhive group, not just the opted-in ones. I get that the opted-in set isn't known at codegen time, so the alternative (explicit per-kind rules) is awkward. Probably fine, but worth a brief comment on the marker noting the intentional wildcard plus the isManagedCRD gate as defense-in-depth.
There was a problem hiding this comment.
Added the comment in dad30c9 — explicit note above the kubebuilder marker block explaining the wildcard is intentional (opt-in set isn't known at codegen time), with isManagedCRD documented as the runtime gate forming defence in depth.
While I was there: the verbs themselves changed too — patch → update for both the main resource and /status to match the new write path (Update + annotation toggle, see your other comment). RBAC regenerated via task operator-manifests.
…n toggle Empirical verification (probe test against envtest) confirmed the reviewer's hypothesis: metadata-only Server-Side Apply on /status is short-circuited by the apiserver — resourceVersion does not bump and etcd is not re-encoded. The "force a re-encode" claim the original implementation rested on was hollow. This commit replaces the SSA path with the kube-storage-version-migrator pattern: Get the live object, toggle a timestamp annotation to force a real diff, then Update. Trade-off acknowledged in the controller docstring and user docs: the Update goes through the main resource so admission webhooks see it. Webhooks that need to ignore migration-only writes can branch on the presence of toolhive.stacklok.dev/storage-version-migrated-at in the diff. The original /status webhook-bypass benefit was always notional — the SSA wasn't writing etcd at all. Other fixes from the same review pass: - Track per-pass IsConflict count in restoreCRs; if any conflicts were swallowed and no other errors occurred, return a sentinel error so Reconcile leaves storedVersions untouched (the post-conflict state of the affected object is unverified). Next reconcile retries cleanly. New ConflictDuringRestoreLeavesStoredVersionsUntouched test covers this. - Bound migrationCache memory: register a manager.RunnableFunc that walks the cache every 10m and evicts expired entries. Prior code only evicted lazily on key re-query, which leaked entries for deleted CRs. - Simplify isMigrationNeeded to a single line. The servedCount > 1 branch was forcing list passes during deprecation windows for no actual benefit (None conversion + identical schemas means writers cannot reintroduce stale storedVersions). - Document the wildcard CR RBAC as defence-in-depth alongside the isManagedCRD runtime gate. - Fix the stale "(unused in this reconciler)" comment on the embedded Client field — it is used for CR Updates and the CRD status patch. - Drop a 100ms time.Sleep in AfterSuite that was dead code copied from virtualmcp's suite (which has a manager goroutine to drain; we don't). HappyPath now snapshots each CR's resourceVersion before reconcile and asserts it has bumped after — empirical proof the underlying re-encode actually happened. All 8 envtest cases pass; lint clean. Refs PR #5011 review by @jhrozek. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The annotation toggle was a workaround for a misdiagnosed problem.
Earlier in this branch we observed empty SSAs being elided and
generalised it to "all unmodified writes elide" — and added a
toolhive.stacklok.dev/storage-version-migrated-at annotation toggle
to force a real diff per write. A second-opinion investigation
pointed out the elision check happens AFTER storage encoding: the
apiserver re-encodes the request body at the current storage version,
then byte-compares against etcd. For the actual migration scenario
(CR stamped at an older apiVersion in etcd, storage version now newer)
the encoded apiVersion differs, the comparison fails, and the write
proceeds. Verified empirically: a CR stored at v1alpha1 with storage
flipped to v1beta1 has plain Get+Update bump RV from 202 to 204; a CR
already at v1beta1 has plain Get+Update elide cleanly. Both behaviours
are correct — the elision is exactly the no-migration-needed case.
Changes:
- restoreOne: plain Get + Update of the unmodified live object. No
annotation, no timestamp, no MigrationTimestampAnnotation constant.
- Controller docstring rewritten to describe the actual mechanism
(encode-then-compare elision) and cite ksvm#65 for posterity.
- HappyPath envtest loosened: per-CR resourceVersion bump assertions
are dropped (already-clean CRs correctly elide; checking RV bumps
on them was an artefact of the annotation-induced writes). The
storedVersions convergence assertion remains as the contract test.
- New cross-version envtest "re-encodes CRs that are stored at a
prior storage version" exercises the real migration scenario:
install CRD with v1alpha1 storage, create CR, flip storage to
v1beta1, run reconciler, assert storedVersions trims and the CR's
RV bumped (proving the apiserver actually rewrote etcd).
- User docs updated to drop annotation references and describe the
real mechanism. Reference implementation citation switched from
cluster-api/crdmigrator to kube-storage-version-migrator (the
actual upstream SIG tool we share semantics with).
- FallsBackWhenNoStatusSubresource test removed: the SSA path it
exercised no longer exists. Plain Update on a CRD without /status
is just a normal Update; covered by the happy-path spec.
8/8 envtest cases pass. Lint clean. No RBAC drift (verbs already
correct).
Refs PR #5011 review by @jhrozek.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Design correction — withdrawing the "upstream is broken" claimI owe a correction on the design pivot from The corrected understanding (verified against ksvm#65 and confirmed empirically against our envtest in
My earlier probes only tested the second case (CR already at storage version), saw the elision, and overgeneralised. The probe in Practical consequence for this PR: the annotation toggle ( Apologies to the upstream maintainers of |
…ion-controler # Conflicts: # cmd/thv-operator/main.go # deploy/charts/operator/README.md # deploy/charts/operator/templates/clusterrole/role.yaml # deploy/charts/operator/templates/deployment.yaml # deploy/charts/operator/values.yaml
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
End-to-end reproducible verification for the StorageVersionMigrator: deploy v0.21.0 (last v1alpha1-only release), create one CR of each of the 12 toolhive kinds at v1alpha1, upgrade to multi-version CRDs + this branch's operator with the migrator explicitly disabled, re-apply at v1beta1, confirm storedVersions is stuck at [v1alpha1, v1beta1] on all 12 CRDs. Then enable the migrator and watch all 12 converge to [v1beta1] within ~1s. Includes an optional step 11 that simulates the next release dropping v1alpha1 from spec.versions — proving the migrator's storedVersions trim is what unblocks that change. This is what was actually run to validate PR #5011 against a real kind cluster. Putting it in the docs tree (not the worktree root) so it's discoverable and reproducible after the PR merges. CR fixtures (crs-v1alpha1.yaml and crs-v1beta1.yaml) originate from the graduate-crds upgrade-test fixtures; copying them in keeps this guide self-contained. Part of #4969. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5011 +/- ##
==========================================
- Coverage 68.37% 68.18% -0.20%
==========================================
Files 624 625 +1
Lines 63442 63643 +201
==========================================
+ Hits 43381 43393 +12
- Misses 16820 17015 +195
+ Partials 3241 3235 -6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Summary
Adds a new
StorageVersionMigratorcontroller to the ToolHive operator that watches opted-intoolhive.stacklok.devCRDs and reconciles each one'sstatus.storedVersionsdown to the current storage version, so a future release can drop deprecated API versions (e.g.v1alpha1) fromspec.versionswithout orphaning etcd objects.Closes #4969
Medium level
cmd/thv-operator/controllers/storageversionmigrator_controller.go. Watchesapiextensions.k8s.io/v1/CustomResourceDefinition(metadata-only), filtered by opt-in label + group. Per-CRD: readsspec.versionsto find the storage version, lists every CR via paginated unstructured list, then for each CR does a plainGet+Updateof the unmodified live object. The apiserver re-encodes the request body at the current storage version and byte-compares against etcd; when the CR was stored at a differentapiVersion(the migration scenario) the comparison fails and the write proceeds, re-encoding the object. When the CR is already at the current storage version, the comparison matches and the apiserver harmlessly elides the write — there was nothing to migrate. This matches upstreamkube-storage-version-migrator's mechanism. Once every CR has been processed, patchesstatus.storedVersionsvia optimistic-lock merge. 1h migration cache (with periodic background GC) dedupes Get+Update RPCs across reconciles.+kubebuilder:metadata:labels=toolhive.stacklok.dev/auto-migrate-storage-version=truemarker, baked into the generated CRD YAML bytask operator-manifests. New CRDs opt in by adding the same marker; CRDs that shouldn't be migrated omit it. A CI test (TestV1beta1TypesMarkerCoverage) fails the build if a root type is added without either this marker or an explicit+thv:storage-version-migrator:excludesibling.operator.features.storageVersionMigratorin the helm chart (defaulttrue) wires toENABLE_STORAGE_VERSION_MIGRATORfollowing the existingENABLE_SERVER/ENABLE_REGISTRY/ENABLE_VMCPpattern. Admins who prefer to runkube-storage-version-migratorexternally can disable it.IsConflict(concurrent writer), the controller refuses to trimstoredVersionson that pass and returns a sentinel error so the next reconcile retries cleanly. The post-conflict state of the affected object is unverified, so trimming would be reasoning from an unproven premise.cmd/thv-operator/test-integration/storageversionmigrator/with 8 scenarios: noop-when-clean, happy-path-trims-storedVersions, re-encodes CRs that are stored at a prior storage version (the load-bearing test that proves the migration mechanism actually rewrites etcd — installs a CRD withv1alpha1storage, creates a CR, flips storage tov1beta1, asserts the CR'sresourceVersionbumped after reconcile), partial-failure leaves storedVersions untouched, conflict-during-restore leaves storedVersions untouched, skips-foreign-CRDs, skips-unlabelled-CRDs, pagination (continue-token loop verified via a counting APIReader wrapper). Each test installs its own CRD fixture and calls Reconcile directly — no background manager, no timing races.docs/operator/storage-version-migration.mdcovers the actual write mechanism, webhook interaction, the opt-in label contract, feature-flag opt-out, the admin per-CRD escape hatch, and coordination with the futurev1alpha1-removal release.Low level
cmd/thv-operator/controllers/storageversionmigrator_controller.gocmd/thv-operator/controllers/marker_coverage_test.goapi/v1beta1/*_types.goand fails if a root type lacks either the migrate marker or an explicit exclude markercmd/thv-operator/test-integration/storageversionmigrator/suite_test.gocmd/thv-operator/test-integration/storageversionmigrator/controller_test.gocmd/thv-operator/main.goapiextensions/v1scheme, addfeatureStorageVersionMigratorconst,setupStorageVersionMigrator()wiringcmd/thv-operator/api/v1beta1/*_types.go(12 files)+kubebuilder:metadata:labels=...marker to each root typedeploy/charts/operator-crds/files/crds/*.yaml+templates/*.yaml(12 × 2)metadata.labelsdeploy/charts/operator/templates/clusterrole/role.yamlcustomresourcedefinitionsget/list/watch +/statusupdate;patch +toolhive.stacklok.dev/*get/list/update +*/statusupdatedeploy/charts/operator/values.yamloperator.features.storageVersionMigrator: truedeploy/charts/operator/templates/deployment.yamlENABLE_STORAGE_VERSION_MIGRATORenv vardeploy/charts/operator/README.mdtask helm-docsdocs/operator/storage-version-migration.mdType of change
Test plan
task lint-fix— 0 issuestask build— cleango test $(go list ./cmd/thv-operator/... | grep -v /test-integration)) — all greentask operator-test-integrationpath, or directly withKUBEBUILDER_ASSETS=$(setup-envtest use 1.31.0 -p path) go test ./cmd/thv-operator/test-integration/storageversionmigrator/)task operator-manifests+task helm-docsproduce no drift against committed statehelm template deploy/charts/operatorrenders the new env var correctlySpecial notes for reviewers
The write mechanism in this PR has been through three iterations during review — final state described below; the journey is captured in the inline comment threads.
The current implementation is plain
Get+Updateof the unmodified CR — exactly what upstreamkube-storage-version-migratordoes. The Kubernetes apiserver's storage layer re-encodes the request body at the current storage version, then byte-compares against etcd. When the CR was originally stored at an olderapiVersion(the actual migration scenario), the encoded apiVersion stamp differs from etcd's, the comparison fails, and the write proceeds — re-encoding the object. When the CR is already at the current storage version, the bytes match and the apiserver elides the write harmlessly. This is the upstream maintainers' explicit reasoning, captured at kube-storage-version-migrator#65.A new envtest case (
re-encodes CRs that are stored at a prior storage version) proves the migration mechanism empirically: install a CRD withv1alpha1as storage, create a CR (etcd writesapiVersion: v1alpha1bytes), flip storage tov1beta1, run the reconciler, and assert the CR'sresourceVersionbumped. Observed bump 202→204; the elision sanity check on a same-storage-version CR shows RV stays at 204 as expected.PR size: ~450 production lines across 15 files, slightly over the 400-line / 10-file convention. 12 of those files are 1-line kubebuilder marker additions in
api/v1beta1/*_types.go; the substantive Go work is 3 files (controller, main.go wiring, helm values/deployment). I did not split because the controller + opt-in labels + marker-coverage CI test are one cohesive feature — splitting would mean landing unused labels or an unused controller mid-stack.Why this PR must ship at least one release before
v1alpha1is removed: the migrator needs time to walk every cluster'sstoredVersionsduring the deprecation window. Shipping it in the same release as the version removal provides zero benefit because clusters jumping straight to that release won't have run the migrator yet. Documented indocs/operator/storage-version-migration.md.Follow-ups not in scope for this PR:
🤖 Generated with Claude Code