kubectl rook-ceph rook purge-osd command fails #371

@tasgon

Description

A node in our cluster went down and had to be wiped, and I'm now trying to remove the OSDs that were on it. Unfortunately, I keep running into this error:

% kubectl rook-ceph rook purge-osd 109
Info: Running purge osd command
2025/04/23 00:59:31 maxprocs: Updating GOMAXPROCS=16: determined from CPU quota
2025-04-23 00:59:31.433189 W | cephcmd: loaded admin secret from env var ROOK_CEPH_SECRET instead of from file
2025-04-23 00:59:31.433239 I | rookcmd: starting Rook v1.16.4 with arguments 'rook ceph osd remove --osd-ids=109 --force-osd-removal=false'
2025-04-23 00:59:31.433245 I | rookcmd: flag values: --force-osd-removal=false, --help=false, --log-level=INFO, --osd-ids=109, --preserve-pvc=false
2025-04-23 00:59:31.433249 I | ceph-spec: parsing mon endpoints: e=<ip1>:6789,j=<ip2>:6789,h=<ip3>:6789,a=<ip4>:6789,b=<ip5>:6789
2025-04-23 00:59:31.443860 I | cephclient: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2025-04-23 00:59:31.444004 I | cephclient: generated admin config in /var/lib/rook/rook-ceph
2025-04-23 00:59:31.932175 C | rookcmd: failed to get osd dump: failed to unmarshal osd dump response: invalid character 'i' looking for beginning of value
Error: . failed to run command. command terminated with exit code 1%!(EXTRA string=failed to remove osd %s, string=109)

(I've omitted the mon IPs.)

This is happening with v0.9.3 of the Krew plugin, and I've tried with and without --force. Could this be fixed by upgrading our Rook version? I looked through the changelog and didn't see anything obvious about this.
