Metric-agent shows wrong metrics in case of multiple virtual-nodes #3209

@fra98

Description

Is there an existing issue for this?

  • I have searched the existing issues

Version

>= v0.9.0 (since multiple virtual nodes support)

What happened?

Right now, metric-agent aggregates (sums) the resource usage metrics (CPU, memory) of all nodes in the remote cluster.

This is a problem when the consumer cluster has multiple virtual nodes targeting the same remote cluster. In that case, the metrics reported for each virtual node are the sum over all nodes of that cluster, which makes the reported resource usage meaningless.

Fixing this is not trivial, since in Liqo there is no explicit mapping between virtual nodes and remote nodes. However, if a virtualnode specifies a nodeSelector in its offloadingPatch, there is an implicit virtual node -> remote node(s) mapping. In that case we can scrape metrics only from the remote nodes targeted by the selector. This makes sense especially for use cases with a 1-1 mapping, since the resource usage of the virtual node will then match the actual usage of the remote node.
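To make the idea concrete, here is a minimal sketch of the filtering, not the actual metric-agent code. It assumes the nodeSelector can be treated as a plain label-equality map; the types `Usage` and `RemoteNode`, the helpers `matches`, `aggregateUsage` and `sampleNodes`, and the `pool` label are all hypothetical names made up for illustration. With an empty selector it reproduces the current behaviour (sum over every remote node), while with a selector it sums only the targeted nodes.

```go
package main

import "fmt"

// Usage is a simplified stand-in for the resource usage of a node.
type Usage struct {
	CPUMilli int64 // CPU usage in millicores
	MemBytes int64 // memory usage in bytes
}

// RemoteNode is a hypothetical view of a node in the remote cluster.
type RemoteNode struct {
	Name   string
	Labels map[string]string
	Usage  Usage
}

// matches reports whether labels satisfy every key/value pair of selector.
// An empty (or nil) selector matches every node.
func matches(labels, selector map[string]string) bool {
	for k, want := range selector {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// aggregateUsage sums the usage of the remote nodes matched by selector.
func aggregateUsage(nodes []RemoteNode, selector map[string]string) Usage {
	var total Usage
	for _, n := range nodes {
		if matches(n.Labels, selector) {
			total.CPUMilli += n.Usage.CPUMilli
			total.MemBytes += n.Usage.MemBytes
		}
	}
	return total
}

// sampleNodes returns two made-up remote nodes in different pools.
func sampleNodes() []RemoteNode {
	const gib = int64(1) << 30
	return []RemoteNode{
		{Name: "remote-a", Labels: map[string]string{"pool": "a"}, Usage: Usage{CPUMilli: 500, MemBytes: 1 * gib}},
		{Name: "remote-b", Labels: map[string]string{"pool": "b"}, Usage: Usage{CPUMilli: 1500, MemBytes: 2 * gib}},
	}
}

func main() {
	nodes := sampleNodes()

	// Current behaviour: no filtering, every virtual node gets the same sum.
	fmt.Printf("unfiltered: %+v\n", aggregateUsage(nodes, nil))

	// Proposed behaviour: a virtual node whose selector targets pool "a"
	// only accounts for the usage of the matching remote node.
	fmt.Printf("pool=a:     %+v\n", aggregateUsage(nodes, map[string]string{"pool": "a"}))
}
```

In the 1-1 case (one virtual node per remote node, each with a selector matching exactly that node), the filtered sum is simply the usage of that single remote node. Virtual nodes without a nodeSelector would keep the current aggregated value.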

Relevant log output

How can we reproduce the issue?

  1. Peer two clusters
  2. Create two virtual nodes targeting the same remote cluster
  3. Run kubectl top nodes
  4. The metrics of the virtual nodes are all equal and correspond to the sum over all remote nodes, even if a virtualnode has a nodeSelector targeting only a subset of the remote nodes

Provider or distribution

Any

CNI version

Any

Kernel Version

Any

Kubernetes Version

Any

Code of Conduct

  • I agree to follow this project's Code of Conduct

Labels

  • bug: Report a bug encountered while operating Liqo
