Add hci-ironic VA variant for Ironic-provisioned HCI deployments #746
rebtoor wants to merge 1 commit into openstack-k8s-operators:main
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: rebtoor.
Merge Failed. This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
/recheck
Merge Failed. This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
recheck
fultonj left a comment:
Hey Roberto,
This looks really good. Would you please consider two changes?
- Add Markdown files
Please add markdown files which describe how to build this VA.
For example someone without ci-framework can browse to this URL:
https://github.com/openstack-k8s-operators/architecture/blob/main/examples/va/hci/README.md
and then build the CRs by following the directions and running kustomize.
The new markdown files should make it clear what this alternative to the hci VA is.
I suggest also updating the README for va/hci to state something along the lines of:
The steps in [Configuring and deploying the pre-Ceph dataplane](examples/va/hci/dataplane-pre-ceph.md)
assume that the compute nodes have been pre-provisioned. If you wish to pre-provision these nodes with
Ironic see ...
Consider one set of directions with two possible methods of deploying HCI.
One with pre-provisioned EDPM nodes and one without.
- Naming: Baremetal vs Ironic
I have a small concern that calling this "baremetal" implies that the original HCI is only for running in VMs, which is not the case.
Would it be difficult to replace "baremetal" with "ironic" in the naming, since the difference here is really about how the EDPM nodes were deployed?
Other than that this looks really good to me.
I built the CRs for both hci and hci-baremetal, and all CRs are the same except nodeset-pre-ceph.yaml, which is exactly what I would expect.
--- hci/nodeset-pre-ceph.yaml 2026-04-22 13:58:44.208210525 -0400
+++ hci-baremetal/nodeset-pre-ceph.yaml 2026-04-22 13:57:26.240705158 -0400
@@ -34,6 +34,13 @@
   name: openstack-edpm
   namespace: openstack
 spec:
+  baremetalSetTemplate:
+    automatedCleaningMode: disabled
+    bmhLabelSelector:
+      app: openstack
+    bmhNamespace: openshift-machine-api
+    cloudUserName: cloud-admin
+    osImage: edpm-hardened-uefi.qcow2
   env:
   - name: ANSIBLE_FORCE_COLOR
     value: "True"
@@ -175,7 +182,7 @@
         subnetName: subnet1
       - name: tenant
         subnetName: subnet1
-  preProvisioned: true
+  preProvisioned: false
   services:
   - bootstrap
   - configure-network
Hey John! Thanks for your careful review! I've addressed both of your requests. I agree that "ironic" makes much more sense than "baremetal" in this context, since the latter would have been misleading, so I renamed the scenario (and the downstream job as well).
Add a new validated architecture variant (hci-ironic) that deploys the VA-HCI scenario with all compute nodes provisioned via Ironic using a configurable baremetalSetTemplate.osImage, enabling validation of edpm-hardened-uefi qcow2 images through a complete deployment cycle.

**New reusable component:**
- `lib/dataplane/nodeset-baremetal` -- kustomize Component that maps `preProvisioned` and `baremetalSetTemplate` from the values ConfigMap into the `OpenStackDataPlaneNodeSet` spec.

**New VA variant (`va/hci-ironic`):**
- Identical to `va/hci` except the edpm-pre-ceph nodeset stage includes `nodeset-baremetal` alongside the standard `nodeset` component.
- Pre-ceph nodeset values include `preProvisioned: false` and `baremetalSetTemplate` with a configurable `osImage`.
- `SetupReady` timeout increased to 30m to account for Ironic provisioning time.
- Post-ceph stage generates `edpm-nodeset-values` at the `hci` path so that SSH keys are available during `kustomize build` (the pre-ceph stage writes to the `hci-ironic` path, but the post-ceph kustomization references the `hci` path for shared resources).
- All other stages (NNCP, networking, control-plane, deployments, Ceph bootstrap hook) reuse the existing `va/hci` paths.

**Documentation:**
- Added README and `dataplane-pre-ceph.md` for the hci-ironic variant.
- Updated the `va/hci` README with a Variants section pointing to hci-ironic for Ironic-provisioned deployments.

The standard `va/hci` is untouched -- existing pre-provisioned deployments are not affected.

Closes: ANVIL-108
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Roberto Alfieri <ralfieri@redhat.com>
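
To illustrate what the `nodeset-baremetal` Component described in the commit message might look like, here is a minimal sketch using kustomize `replacements`. It is not the file from this PR: the ConfigMap name `edpm-nodeset-values` comes from the commit message, but the source field paths (`data.nodeset.preProvisioned`, `data.nodeset.baremetalSetTemplate`) are assumptions about how the values file is laid out.

```yaml
# Hypothetical sketch of lib/dataplane/nodeset-baremetal/kustomization.yaml.
# Source field paths under "data.nodeset" are assumed, not taken from the PR.
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component

replacements:
  # Copy preProvisioned from the values ConfigMap into the nodeset spec.
  - source:
      kind: ConfigMap
      name: edpm-nodeset-values
      fieldPath: data.nodeset.preProvisioned
    targets:
      - select:
          kind: OpenStackDataPlaneNodeSet
        fieldPaths:
          - spec.preProvisioned
        options:
          create: true
  # Copy the whole baremetalSetTemplate mapping (osImage, bmhNamespace, ...).
  - source:
      kind: ConfigMap
      name: edpm-nodeset-values
      fieldPath: data.nodeset.baremetalSetTemplate
    targets:
      - select:
          kind: OpenStackDataPlaneNodeSet
        fieldPaths:
          - spec.baremetalSetTemplate
        options:
          create: true
```

With `create: true`, kustomize adds `spec.baremetalSetTemplate` when the base nodeset does not define it. That matches the commit message's statement that `va/hci` stays untouched while only the hci-ironic pre-ceph stage lists this Component next to the standard `nodeset` one.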
@@ -0,0 +1 @@
../hci/control-plane
\ No newline at end of file
I assume this was an accidental symlink. Want to remove it in the next patchset?
[johfulto@laptop control-plane{baremetal}]$ ll
total 12K
-rw-r--r--. 1 johfulto johfulto 419 Jan 13 16:31 kustomization.yaml
drwxr-xr-x. 1 johfulto johfulto 44 Jan 13 16:31 networking/
-rw-r--r--. 1 johfulto johfulto 345 Apr 13 11:42 service-values.yaml
lrwxrwxrwx. 1 johfulto johfulto 20 Apr 23 14:47 control-plane -> ../hci/control-plane
[johfulto@laptop control-plane{baremetal}]$
@@ -0,0 +1,14 @@
# This is the kustomization for the FINAL step, edpm-post-ceph
# (hci-ironic variant: references hci-ironic pre-ceph values)
---
examples/va/hci-ironic/kustomization.yaml as it is right now is broken [1].
The current README doesn't direct users to use it and automation/vars/hci-ironic.yaml does not use it either. Thus, I think it could just be removed.
[1]
[johfulto@laptop hci-ironic{baremetal}]$ kustomize build .
Error: accumulating resources: accumulation err='accumulating resources from 'control-plane/networking/nncp/values.yaml': evalsymlink failure on '/home/johfulto/claude/review/architecture/examples/va/hci-ironic/control-plane/networking/nncp/values.yaml' : lstat /home/johfulto/claude/review/architecture/examples/va/hci-ironic/control-plane: no such file or directory': must build at directory: not a valid directory: evalsymlink failure on '/home/johfulto/claude/review/architecture/examples/va/hci-ironic/control-plane/networking/nncp/values.yaml' : lstat /home/johfulto/claude/review/architecture/examples/va/hci-ironic/control-plane: no such file or directory
[johfulto@laptop hci-ironic{baremetal}]$
5. Between stages 3 and 4, _it is assumed that the user installs Ceph on the 3 OSP compute nodes._ OpenStack K8S CRDs do not provide a way to install Ceph via any sort of combination of CRs.

## Variants
baremetalSetTemplate:
  # CHANGEME - qcow2 image name from the edpm-hardened-uefi container
  osImage: edpm-hardened-uefi.qcow2
  automatedCleaningMode: disabled
Would you please add the following comment above automatedCleaningMode?
# set to 'metadata' if redeploying Ceph to ensure clean disks for OSDs
Or perhaps change disabled to metadata?
That would be consistent with what we used to recommend with TripleO:
When deploying Ceph, OSDs will not be created unless the disk is factory clean. This subtlety presents itself when people redeploy (since it's hard to get everything right the first time) and the Ceph install fails. Support used to get lots of calls about OSDs not getting created because the reader didn't know this.
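
For illustration, the first option (keeping `disabled` and adding the comment) could look like the fragment below. This is only a sketch: the keys and the comment text are taken from this thread, while the position of `baremetalSetTemplate` inside the real values file is not shown here and is assumed.

```yaml
baremetalSetTemplate:
  # CHANGEME - qcow2 image name from the edpm-hardened-uefi container
  osImage: edpm-hardened-uefi.qcow2
  # set to 'metadata' if redeploying Ceph to ensure clean disks for OSDs
  automatedCleaningMode: disabled
```

The second option would instead set `automatedCleaningMode: metadata` by default, so a redeploy starts from clean disks without the user having to notice the comment.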
I confirmed metadata is the correct setting as per: