[FEATURE] Support pluggable application state source for Spark application status detection #7394

@ruanwenjun

Description

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the feature

Hi community,

We are currently using Kyuubi to submit Spark applications to Kubernetes.

In our environment, a Spark application pod may contain multiple containers, for example:

init-container
spark-container
log-collect-container

After the task finishes, the pod does not exit automatically in our scenario. Because of this, we are using:

kyuubi.kubernetes.application.state.source=CONTAINER

to let Kyuubi determine the application state by monitoring container status.

However, we are facing a problem with the current behavior:

If the init-container fails, the spark-container will never start. In this case, Kyuubi cannot correctly detect the application state change, because the current container-based state detection does not cover this scenario.

From our perspective, this is an important gap for workloads that rely on multiple containers inside the Spark driver pod.

We hope Kyuubi could provide a more extensible way to determine Kubernetes application state, for example by making the application state monitoring logic pluggable.

This would allow users to implement or choose different strategies based on their pod/container layout.

For example, in our scenario, we would like to determine the application state based on a combination of multiple container states:

if init-container fails, the application should be considered FAILED
if spark-container fails, the application should be considered FAILED
only when the relevant containers succeed or reach the expected terminal/running states should the application be considered successful/running

In other words, we need a flexible mechanism to evaluate application state from multiple containers, rather than relying on a single container or a fixed built-in rule.
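As a sketch of what such a multi-container evaluation could look like (all class, method, and state names below are illustrative, not existing Kyuubi API; only the container names come from the scenario above):

```java
import java.util.List;

// Hypothetical sketch: derive one application state from the statuses of
// several containers in the driver pod, following the rules listed above.
public class MultiContainerStateEvaluator {
    enum ApplicationState { PENDING, RUNNING, FINISHED, FAILED }

    // A simplified view of a single container's status.
    record ContainerSnapshot(String name, boolean terminated, int exitCode) {}

    static ApplicationState evaluate(List<ContainerSnapshot> containers) {
        // Any terminated container with a non-zero exit code fails the
        // application, whether it is the init container or the Spark one.
        for (ContainerSnapshot c : containers) {
            if (c.terminated() && c.exitCode() != 0) {
                return ApplicationState.FAILED;
            }
        }
        // Otherwise the application is finished only once the main Spark
        // container has terminated successfully; until then it is running.
        return containers.stream()
                .filter(c -> c.name().equals("spark-container"))
                .findFirst()
                .map(c -> c.terminated() ? ApplicationState.FINISHED
                                         : ApplicationState.RUNNING)
                .orElse(ApplicationState.PENDING);
    }
}
```

The point of the sketch is that the decision is a function over all container statuses, not over a single configured container.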

Motivation

This is especially useful for scenarios where:

a Spark application pod contains multiple business-related containers
init containers are critical for application startup
sidecar or log collection containers affect lifecycle behavior
pod termination behavior does not directly reflect real application completion/failure

We would be happy to help further clarify the use case if needed. Thanks!

Describe the solution

Would it be possible to make the application state monitoring logic pluggable, for example by extracting the toApplicationStateAndError method into an interface?
This would also make Kyuubi more adaptable to complex Kubernetes deployment patterns.
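One possible shape for such an SPI, assuming Kyuubi exposed an interface that user implementations could be selected into via configuration (the interface name, signature, and the FailFastStateSource strategy below are all illustrative assumptions, not the current Kyuubi implementation):

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical SPI sketch: the internal state-mapping helper becomes an
// interface that deployments can implement for their own pod layout.
interface ApplicationStateSource {
    enum State { PENDING, RUNNING, FINISHED, FAILED }
    record StateAndError(State state, Optional<String> error) {}

    // terminatedExitCodes maps container name -> exit code, for containers
    // in the driver pod that have already terminated.
    StateAndError toApplicationStateAndError(Map<String, Integer> terminatedExitCodes);
}

// One user-supplied strategy: fail fast if any container failed, and treat
// the application as finished once the Spark container has terminated.
class FailFastStateSource implements ApplicationStateSource {
    @Override
    public StateAndError toApplicationStateAndError(Map<String, Integer> exits) {
        for (Map.Entry<String, Integer> e : exits.entrySet()) {
            if (e.getValue() != 0) {
                return new StateAndError(State.FAILED,
                        Optional.of("container " + e.getKey()
                                + " exited with code " + e.getValue()));
            }
        }
        boolean sparkDone = exits.containsKey("spark-container");
        return new StateAndError(sparkDone ? State.FINISHED : State.RUNNING,
                Optional.empty());
    }
}
```

With an interface like this, the existing built-in behavior could remain the default implementation, while deployments with init containers or sidecars plug in their own strategy.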

Additional context

No response

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to improve.
  • No. I cannot submit a PR at this time.
