Add Kubernetes support to unified benchmark runner #3537

@andygrove

What is the problem the feature request solves?

Description

Add support for running benchmarks on a Kubernetes cluster using spark-submit --master k8s://... in client mode. This was explored during #3534 but removed as out of scope for the initial PR.

Motivation

The current benchmark runner supports local and standalone Spark clusters via docker-compose. Adding K8s support would enable:

  • Running benchmarks on multi-node clusters with realistic resource constraints
  • Leveraging existing K8s infrastructure (e.g., K3s, EKS, GKE) without managing standalone Spark clusters
  • Better reproducibility via containerized executor pods with defined resource limits

Proposed Scope

  • K8s profile config (conf/profiles/k8s.conf) with spark.master=k8s://..., executor pod templates, and container image settings
  • RBAC manifests (namespace, service account, role, role binding) for the comet-bench namespace
  • PV/PVC definitions for mounting benchmark data and engine JARs into executor pods
  • Documentation for pushing the comet-bench image to a cluster-accessible registry and running benchmarks
  • Validation with at least one TPC-H query on a multi-node cluster (e.g., K3s)
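
As a rough illustration of the profile and PVC items above, the sketch below renders a key=value profile from a handful of Spark-on-Kubernetes properties. The property keys are standard Spark settings; the API server URL, image name, claim name, mount path, and the helpers `pvc_mount_conf` and `render_k8s_profile` are hypothetical, and the actual format of conf/profiles/k8s.conf may differ.

```python
from typing import Dict, List


def pvc_mount_conf(volume: str, claim: str, mount_path: str,
                   read_only: bool = True) -> Dict[str, str]:
    """Spark properties that mount an existing PVC into every executor pod."""
    prefix = f"spark.kubernetes.executor.volumes.persistentVolumeClaim.{volume}"
    return {
        f"{prefix}.options.claimName": claim,
        f"{prefix}.mount.path": mount_path,
        f"{prefix}.mount.readOnly": str(read_only).lower(),
    }


def render_k8s_profile(api_server: str, image: str, namespace: str,
                       executors: int, mounts: List[Dict[str, str]]) -> str:
    """Render a hypothetical k8s profile as key=value lines."""
    conf = {
        "spark.master": f"k8s://{api_server}",
        "spark.submit.deployMode": "client",
        "spark.kubernetes.namespace": namespace,
        "spark.kubernetes.container.image": image,
        "spark.executor.instances": str(executors),
    }
    for mount in mounts:
        conf.update(mount)
    return "\n".join(f"{key}={value}" for key, value in conf.items())


if __name__ == "__main__":
    # All values below are placeholders, not settings taken from the project.
    print(render_k8s_profile(
        api_server="https://k8s.example:6443",
        image="registry.example/comet-bench:latest",
        namespace="comet-bench",
        executors=2,
        mounts=[pvc_mount_conf("data", "bench-data", "/data")],
    ))
```

Executor pod templates are not covered here; Spark accepts one through spark.kubernetes.executor.podTemplateFile, which could be added to the same dictionary.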

Key Considerations

  • The comet-bench Docker image already includes both Java 8 and Java 17 runtimes and the TPC query files, so it can serve as the executor image
  • In Spark client mode, the driver (whether it runs in a pod or on a host) must be reachable from the executor pods; the network configuration this requires may vary by cluster
  • Engine JARs (Comet, Gluten) need to be accessible to executors, either baked into the image or mounted via PVCs
  • Gluten requires JAVA_HOME to be overridden to point at Java 8 on all executor pods
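
One way the considerations above might translate into spark-submit arguments is sketched below. `spark.driver.host`, `spark.executor.extraClassPath`, and `spark.executorEnv.JAVA_HOME` are standard Spark properties; the Java 8 path, the JAR location, the driver address, and the helper names are assumptions for illustration, not details taken from the comet-bench image or the runner.

```python
from typing import Dict, List, Optional

# Assumed location of the Java 8 runtime inside the image (hypothetical path).
JAVA8_HOME = "/usr/lib/jvm/java-8-openjdk-amd64"


def engine_conf(engine: str, driver_host: str,
                jar_path: Optional[str] = None) -> Dict[str, str]:
    """Per-engine Spark properties for a client-mode run on Kubernetes."""
    # Executors connect back to the driver at this address in client mode.
    conf = {"spark.driver.host": driver_host}
    if jar_path:
        # Path as seen inside the executor pod (baked into the image or PVC-mounted).
        conf["spark.executor.extraClassPath"] = jar_path
    if engine == "gluten":
        # Point executors at Java 8, as Gluten requires.
        conf["spark.executorEnv.JAVA_HOME"] = JAVA8_HOME
    return conf


def spark_submit_args(api_server: str, conf: Dict[str, str], app: str) -> List[str]:
    """Build a spark-submit argument list for client mode against a K8s API server."""
    args = ["spark-submit", "--master", f"k8s://{api_server}", "--deploy-mode", "client"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args + [app]


if __name__ == "__main__":
    # Placeholder values only.
    conf = engine_conf("gluten", "10.0.0.5", "/opt/engines/gluten.jar")
    print(" ".join(spark_submit_args("https://k8s.example:6443", conf, "bench.py")))
```

Keeping the engine-specific settings in a separate dictionary means a Comet run and a Gluten run can share the same cluster profile and differ only in these overrides.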
