Skip to content

Implement node group TemplateNodeInfo method#22

Open
visdmin wants to merge 13 commits intoUpCloudLtd:feat/cluster-autoscaler-cloudprovider-upcloudfrom
visdmin:feat/node-group-template-node-info
Open

Implement node group TemplateNodeInfo method#22
visdmin wants to merge 13 commits intoUpCloudLtd:feat/cluster-autoscaler-cloudprovider-upcloudfrom
visdmin:feat/node-group-template-node-info

Conversation

@visdmin
Copy link

@visdmin visdmin commented Feb 12, 2025

What type of PR is this?

/kind feature

What this PR does:

Add support for node groups with MinSize: 0

  • Adds additional upcloud-go-api components required for fetching server plans:
    • service/cloud.go
    • plan.go price.go timezone.go zone.go (required by service/cloud.go)
  • Fetch and store node group CPU count and memory amount from upc api
    Fetch all plans (api does not seem to have filtering support) and find the matching plan details for the group plan.
  • Store labels and taints for upCloudNodeGroup there are used by the node group TemplateNodeInfo method
  • Implements node group TemplateNodeInfo method
  • Changes to constants
    • change nodeGroupMinSize 1 -> 0
    • add nodeMaxPods 110
  • Unit tests:
    • Add test case that validates the NodeInfo returned by TemplateNodeInfo of empty node group
    • Add test case that validates that in case only single node group is defined it cannot have MinSize: 0
    • Change tests to include node group plan
    • Change tests to allow SupportScaleToZero: true
    • Change test to verify that node group with MinSize: 0 is valid and allowed when at least one other node group exists with MinSize >= 1.

Why we need it:

Enables great cost and environmental saving opportunities for UKS users.

Special notes for your reviewer:

  • PR author has very limited experience with Go lang, so please do not get frustrated with the coding and naming convention mistakes, just let me know what changes would be needed.
  • Node group with MinSize: 0 is only allowed when at least one node group with MinSize >= 1 exists
    This is because the UpC cluster auto-scaler is not running inside control plane.
  • PR author does not have access to information about node templates used by UKS clusters
    NodeInfo returned by TemplateNodeInfo is very likely incomplete.
  • Node group TemplateNodeInfo currently returns cloudprovider.ErrNotImplemented when group size is > 0, this may be against conventions ? (but preserves old behaviour for node groups that have node)
  • PR author is not sure how NodeGroupSpec.SupportScaleToZero should treated:
    This property was previously set to false by using constant nodeGroupMinSize (1) this constant has been changed to 0 and therefore the NodeGroupSpec.SupportScaleToZero is always set to true. Technically UKS node groups support scaling to 0. Author is just bit confused if the NodeGroupSpec.SupportScaleToZero should be set to false for node groups that have MinSize >= 1
  • New upcloud-go-api components have been added running: vendor.sh github.com/upcloudltd/upcloud-go-api/v6/ v6.5.0

Does this PR introduce a user-facing change?

### Enables scaling down node groups to zero
Setting node group minimum size to 0 requires that least one node group has to have minimum size >= 1 

@visdmin
Copy link
Author

visdmin commented Feb 12, 2025

Hi, First time contributor here !

I was told that you are very busy right now and pull request review times could be very long, but I don't mind.
Unfortunately i have very limited experience with Go lang, but I have tried my best to follow the style of the existing code base.

@Pionerd
Copy link

Pionerd commented Mar 2, 2025

Would love to see this merged

@visdmin
Copy link
Author

visdmin commented Mar 13, 2025

Would love to see this merged

Hi @Pionerd, if you are interested in testing these features please let me know.
I can help you to build container image from this branch 👍🏼

Of course this version of the UpC cluster auto-scaler should not be used in any production environment, only after the changes have been audited and merged.

I have been using this version in my development and testing environments with stable results.

@Pionerd
Copy link

Pionerd commented Mar 14, 2025

Sure, if you supply me with an image or instructions how to build it, I'll test it out. In prod ;-)

@visdmin
Copy link
Author

visdmin commented Apr 28, 2025

Sure, if you supply me with an image or instructions how to build it, I'll test it out. In prod ;-)

Hi @Pionerd, sorry for the huge delay.
Let me reiterate I do not recommend this version for production use, but you do you 😅 👍🏼

I built the image for you:
https://github.com/visdmin/autoscaler/pkgs/container/autoscaler

docker pull ghcr.io/visdmin/autoscaler:v1.31.0-alpha1

Tips for usage:

You need to have at least 1 node group with minimum node group > 0 (1), this is to ensure that autoscaler does not scale down all your worker nodes (and you do not end up with cluster that has 0 worker nodes).

Configuration example ca-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
      ....
      containers:
        - name: cluster-autoscaler
          image: ghcr.io/visdmin/autoscaler:v1.31.0-alpha1
          imagePullPolicy: "Always"
          command:
            - /cluster-autoscaler
            - --cloud-provider=upcloud
            - --stderrthreshold=info
            - --scale-down-enabled=true
            - --v=4
            - --nodes=3:3:nodegroup-1
            - --nodes=0:3:nodegroup-2
            .....

You can also utilized UpCloud worker node group labels and taints to control what workloads are scheduled on the worker node groups that are controller by the cluster autoscaler.

Currently you cannot create UpCloud worker node group with taints using the hub UI, but you can create one with taints via the API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants