
@nixpanic
Member

@nixpanic nixpanic commented Dec 4, 2025

The new hostname of the NFS-server is stored in a journal entry for the
volume. The new NFS-server will only be used the next time the NFS
Volume is mounted on a worker node.

It is not possible to update the VolumeContext of a Volume, so the old
hostname of the NFS-server will be stored there forever.
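
As a minimal sketch of the lookup order described above (not the actual Ceph-CSI code): at mount time the node-plugin could prefer a server name recorded in the journal and fall back to the one in the VolumeContext. The `journalEntry` type, the `resolveServer` helper and the `"server"` key are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// journalEntry is a hypothetical stand-in for the per-volume journal
// record; the real Ceph-CSI journal has a different shape.
type journalEntry struct {
	// Server stays empty until a new hostname has been stored.
	Server string
}

// resolveServer returns the NFS-server to mount from. The journal value
// wins because the VolumeContext is immutable and may hold a stale name.
// The "server" key is an assumption made for this sketch.
func resolveServer(volumeContext map[string]string, entry journalEntry) (string, error) {
	if entry.Server != "" {
		return entry.Server, nil
	}
	if s, ok := volumeContext["server"]; ok && s != "" {
		return s, nil
	}
	return "", errors.New("no NFS-server found in journal or VolumeContext")
}

func main() {
	volumeContext := map[string]string{"server": "old.example.net"}
	server, err := resolveServer(volumeContext, journalEntry{Server: "relocated.example.net"})
	fmt.Println(server, err)
}
```

Keeping the override outside the VolumeContext is what allows the stale hostname to remain there without affecting later mounts.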


@mergify mergify bot added the component/nfs Issues related to NFS label Dec 4, 2025
@nixpanic nixpanic force-pushed the nfs/ControllerModifyVolume branch from d006863 to 1a11979 Compare December 4, 2025 17:08
Rakshith-R previously approved these changes Dec 10, 2025
Collaborator

@Madhu-1 Madhu-1 left a comment

@nixpanic is this tested code? I don't see the ControllerServiceCapability_RPC_MODIFY_VOLUME capability exposed for NFS.

@nixpanic
Member Author

@nixpanic is this tested code? I don't see the ControllerServiceCapability_RPC_MODIFY_VOLUME capability exposed for NFS.

Still trying to add some e2e tests; the csi-provisioner and csi-resizer sidecars have only just gained the ability to pass credentials to the ControllerModifyVolume request.

@nixpanic nixpanic marked this pull request as draft December 10, 2025 14:00
@nixpanic nixpanic force-pushed the nfs/ControllerModifyVolume branch from 1a11979 to ca55fc1 Compare December 12, 2025 10:18
@mergify mergify bot dismissed Rakshith-R’s stale review December 12, 2025 10:19

Pull request has been modified.

@nixpanic
Member Author

/test ci/centos/mini-e2e-helm/k8s-1.34/nfs

@nixpanic
Member Author

/test ci/centos/mini-e2e/k8s-1.34/nfs

@nixpanic
Member Author

/test ci/centos/mini-e2e/k8s-1.34

@nixpanic
Member Author

@nixpanic is this tested code? I don't see the ControllerServiceCapability_RPC_MODIFY_VOLUME capability exposed for NFS.

Verified 😜 that it fails with this:

  I1212 12:26:38.595866 75590 pvc.go:85] PVC cephcsi-nfs-pvc Event: ProvisioningFailed - CSI driver does not support VolumeAttributesClass: controller MODIFY_VOLUME capability is not reported
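
The event above suggests that the sidecar inspects the controller capabilities the driver advertises and refuses to continue when MODIFY_VOLUME is missing. A minimal sketch of that kind of check follows; the `rpcType` enum and `supportsModifyVolume` helper are hypothetical local stand-ins, not the real CSI Go bindings or the external-provisioner code.

```go
package main

import "fmt"

// rpcType is a hypothetical stand-in for the CSI controller RPC
// capability enum; the values here are illustrative only.
type rpcType int

const (
	rpcCreateDeleteVolume rpcType = iota
	rpcExpandVolume
	rpcModifyVolume
)

// supportsModifyVolume reports whether the advertised capabilities
// include MODIFY_VOLUME.
func supportsModifyVolume(advertised []rpcType) bool {
	for _, c := range advertised {
		if c == rpcModifyVolume {
			return true
		}
	}
	return false
}

func main() {
	before := []rpcType{rpcCreateDeleteVolume, rpcExpandVolume}
	after := []rpcType{rpcCreateDeleteVolume, rpcExpandVolume, rpcModifyVolume}
	fmt.Println(supportsModifyVolume(before), supportsModifyVolume(after))
}
```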

@nixpanic nixpanic force-pushed the nfs/ControllerModifyVolume branch from ca55fc1 to fe5d071 Compare December 12, 2025 12:35
@nixpanic
Member Author

/test ci/centos/mini-e2e/k8s-1.34

@nixpanic
Member Author

/test ci/centos/mini-e2e/k8s-1.34

@nixpanic
Member Author

/test ci/centos/mini-e2e/k8s-1.34

@nixpanic
Member Author

/test ci/centos/mini-e2e/k8s-1.34/cephfs

@nixpanic
Member Author

/test ci/centos/mini-e2e/k8s-1.34/cephfs

@nixpanic nixpanic force-pushed the nfs/ControllerModifyVolume branch from fc84b94 to df7f5dd Compare December 17, 2025 09:08
@nixpanic
Member Author

/test ci/centos/mini-e2e/k8s-1.34/cephfs

@nixpanic nixpanic force-pushed the nfs/ControllerModifyVolume branch from df7f5dd to 0f12bea Compare December 17, 2025 12:19
@nixpanic
Member Author

/test ci/centos/mini-e2e/k8s-1.34/cephfs

@nixpanic
Member Author

The mount is still trying to use the old NFS-server:

  Warning  FailedMount  8s (x13 over 10m)  kubelet            MountVolume.SetUp failed for volume "pvc-c64d91d7-9097-46f3-9594-eb993e5851b7" : rpc error: code = Internal desc = nfs: failed to mount "relocated.example.net:/0001-0024-e044d94f-5dfb-4f7b-a7e6-44895f196074-0000000000000001-38cbb13b-1f15-4a9b-80ff-3017574691a1" to "/var/lib/kubelet/pods/4ebe7989-0889-4af1-b138-48c67a90debd/volumes/kubernetes.io~csi/pvc-c64d91d7-9097-46f3-9594-eb993e5851b7/mount" : mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs relocated.example.net:/0001-0024-e044d94f-5dfb-4f7b-a7e6-44895f196074-0000000000000001-38cbb13b-1f15-4a9b-80ff-3017574691a1 /var/lib/kubelet/pods/4ebe7989-0889-4af1-b138-48c67a90debd/volumes/kubernetes.io~csi/pvc-c64d91d7-9097-46f3-9594-eb993e5851b7/mount
Output: mount.nfs: Failed to resolve server relocated.example.net: Name or service not known
 stderr: ""

@nixpanic
Member Author

It seems I missed the handling of mutable_parameters in the CreateVolume CSI procedure.
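
A rough sketch of what taking mutable_parameters into account during CreateVolume could look like. The `createVolumeRequest` struct is a trimmed, hypothetical stand-in for the CSI request message, and letting mutable parameters override regular ones is an assumption of this sketch, not confirmed driver behaviour.

```go
package main

import "fmt"

// createVolumeRequest is a trimmed, hypothetical stand-in for the CSI
// CreateVolumeRequest message; only the two relevant fields are kept.
type createVolumeRequest struct {
	Parameters        map[string]string // from the StorageClass
	MutableParameters map[string]string // from the VolumeAttributesClass
}

// effectiveParameters merges both maps into a new map without modifying
// the request. Mutable parameters take precedence on key conflicts;
// that precedence is an assumption of this sketch.
func effectiveParameters(req createVolumeRequest) map[string]string {
	merged := make(map[string]string, len(req.Parameters)+len(req.MutableParameters))
	for k, v := range req.Parameters {
		merged[k] = v
	}
	for k, v := range req.MutableParameters {
		merged[k] = v
	}
	return merged
}

func main() {
	req := createVolumeRequest{
		Parameters:        map[string]string{"server": "old.example.net"},
		MutableParameters: map[string]string{"server": "relocated.example.net"},
	}
	fmt.Println(effectiveParameters(req)["server"])
}
```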

@nixpanic nixpanic force-pushed the nfs/ControllerModifyVolume branch from 0f12bea to dc81558 Compare December 19, 2025 15:55
@nixpanic
Member Author

/test ci/centos/mini-e2e/k8s-1.34/cephfs

@nixpanic
Member Author

nixpanic commented Jan 6, 2026

/test ci/centos/mini-e2e/k8s-1.34/cephfs

@nixpanic nixpanic force-pushed the nfs/ControllerModifyVolume branch from dc81558 to 8899dcc Compare January 6, 2026 13:24
@mergify mergify bot added dequeued and removed queued labels Feb 1, 2026
@nixpanic nixpanic force-pushed the nfs/ControllerModifyVolume branch from ea99e05 to 29196c5 Compare February 3, 2026 10:54
@mergify mergify bot dismissed stale reviews from Rakshith-R and Madhu-1 February 3, 2026 10:54

Pull request has been modified.

@mergify mergify bot removed the dequeued label Feb 3, 2026
@nixpanic
Member Author

nixpanic commented Feb 3, 2026

It seems there has been an update to the csi-resizer:canary image, and the watch verb is now required too:

  E0201 19:25:29.550775       1 reflector.go:205] "Failed to watch" err="volumeattributesclasses.storage.k8s.io is forbidden: User \"system:serviceaccount:cephcsi-e2e-ae4861ac:nfs-csi-provisioner\" cannot watch resource \"volumeattributesclasses\" in API group \"storage.k8s.io\" at the cluster scope" logger="UnhandledError" reflector="k8s.io/client-go/informers/factory.go:160" type="*v1.VolumeAttributesClass"

Updated the RBAC for the provisioner, please have a look again.

@nixpanic
Member Author

nixpanic commented Feb 3, 2026

/test ci/centos/mini-e2e/k8s-1.34

@nixpanic nixpanic requested review from a team, Madhu-1 and Rakshith-R February 3, 2026 10:56
@nixpanic
Member Author

nixpanic commented Feb 4, 2026

Strange, it looks like these StorageClass keys are not accepted, even though they were added to the external-provisioner 🤔

  • csi.storage.k8s.io/controller-modify-secret-name
  • csi.storage.k8s.io/controller-modify-secret-namespace

The logs show

  I0203 13:29:35.730236       1 csi-provisioner.go:170] "Version" version="v6.0.0"

Not sure that it is the canary image mentioned in build.env. Even stranger is that this has passed CI before, and inspection of the logs showed correct functioning.
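
For reference, a small sketch of how the two StorageClass keys named above could be read from a parameter map. The `secretRef` type and `modifySecretRef` helper are hypothetical; this is not how the external-provisioner implements it, and requiring both keys together is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	modifySecretNameKey      = "csi.storage.k8s.io/controller-modify-secret-name"
	modifySecretNamespaceKey = "csi.storage.k8s.io/controller-modify-secret-namespace"
)

// secretRef is a hypothetical holder for a Secret reference.
type secretRef struct {
	Name      string
	Namespace string
}

// modifySecretRef extracts the modify-secret reference from StorageClass
// parameters. The boolean is false when neither key is set. Setting only
// one of the two keys is treated as an error (an assumption of this sketch).
func modifySecretRef(params map[string]string) (secretRef, bool, error) {
	name, hasName := params[modifySecretNameKey]
	namespace, hasNamespace := params[modifySecretNamespaceKey]
	if !hasName && !hasNamespace {
		return secretRef{}, false, nil
	}
	if hasName != hasNamespace {
		return secretRef{}, false, errors.New("modify secret name and namespace must be set together")
	}
	return secretRef{Name: name, Namespace: namespace}, true, nil
}

func main() {
	params := map[string]string{
		modifySecretNameKey:      "example-secret",
		modifySecretNamespaceKey: "example-namespace",
	}
	ref, found, err := modifySecretRef(params)
	fmt.Println(ref.Namespace+"/"+ref.Name, found, err)
}
```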

All functions use a fail-early-continue-on-success flow,
NodeGetVolumeStats was not following this.

Signed-off-by: Niels de Vos <[email protected]>
@nixpanic nixpanic force-pushed the nfs/ControllerModifyVolume branch from 29196c5 to cd729f3 Compare February 4, 2026 09:57
@nixpanic
Member Author

nixpanic commented Feb 4, 2026

/test ci/centos/mini-e2e/k8s-1.34

@nixpanic
Member Author

nixpanic commented Feb 4, 2026

Not sure that it is the canary image mentioned in build.env. Even stranger is that this has passed CI before, and inspection of the logs showed correct functioning.

Added the canary image in commit "Testing: use canary images for csi-resizer/provisioner", not only in build.env, but also under deploy/..../nfs.

The new hostname of the NFS-server is stored in a journal entry for the
volume. The new NFS-server will only be used the next time the NFS
Volume is mounted on a worker node.

It is not possible to update the VolumeContext of a Volume, so the old
hostname of the NFS-server will be stored there forever.

Closes: ceph#5420
Signed-off-by: Niels de Vos <[email protected]>

Only the canary version supports the VolumeAttributesClass with secrets
in the ControllerModifyVolume gRPC.

Signed-off-by: Niels de Vos <[email protected]>

The NFS node-plugin does not support staging; it only uses publishing.
For VolumeAttributesClass / ControllerModifyVolume the mutable parameters
are stored in the Ceph backend. This means that during volume publishing
the node-plugin needs to get the updated parameters from the Ceph
cluster (hence the publish secret requirement).

/tmp/csi has been added to the node-plugin Pod so that the temporary
credentials file can be written.

Signed-off-by: Niels de Vos <[email protected]>

Only Kubernetes 1.34 and newer support the VolumeAttributesClass
functionality (GA). Ceph-CSI also depends on certain versions of the
Kubernetes CSI external-provisioner and external-resizer.

See-also: kubernetes-csi/docs#631
Signed-off-by: Niels de Vos <[email protected]>
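
To illustrate the version requirement from the commit message above, here is a minimal sketch of a gate on the Kubernetes version. The `supportsVolumeAttributesClass` helper is hypothetical and only looks at the major and minor numbers of a version string such as `v1.34.0`; it says nothing about the sidecar versions that are also required.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// supportsVolumeAttributesClass reports whether a Kubernetes version
// string (for example "v1.34.0") is 1.34 or newer. Hypothetical helper
// for illustration only.
func supportsVolumeAttributesClass(version string) (bool, error) {
	parts := strings.SplitN(strings.TrimPrefix(version, "v"), ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("unexpected version format %q", version)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, fmt.Errorf("invalid major version in %q: %w", version, err)
	}
	// Some distributions report a minor version with a trailing "+".
	minor, err := strconv.Atoi(strings.TrimRight(parts[1], "+"))
	if err != nil {
		return false, fmt.Errorf("invalid minor version in %q: %w", version, err)
	}
	return major > 1 || (major == 1 && minor >= 34), nil
}

func main() {
	for _, v := range []string{"v1.33.4", "v1.34.0"} {
		ok, err := supportsVolumeAttributesClass(v)
		fmt.Println(v, ok, err)
	}
}
```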
@nixpanic nixpanic force-pushed the nfs/ControllerModifyVolume branch from cd729f3 to a38a8cc Compare February 4, 2026 11:40
@nixpanic
Member Author

nixpanic commented Feb 4, 2026

/test ci/centos/mini-e2e/k8s-1.34
