[Bug] build-scheduler.sh may hang forever waiting for an exact Pending-pods count that may never be reached #13

@Mapalmeira

Description

When running the emulation, build-scheduler.sh starts the custom scheduler and then waits until the number of Pending pods in the cluster equals an exact target (DESIRED_PENDING_PODS). In practice, this condition may never become true because of normal scheduling variation and delays in measurement. When that happens, the script loops indefinitely and never proceeds to the next steps, stalling the whole emulation.

Sometimes (seemingly by chance), the number of Pending pods drops below the expected value between two checks and never equals it again, so the loop keeps waiting while the count continues to fall. Example:

[SCHEDULER] [INFO] There are still 1613 pending pods. Expected 1614 pending pods. Checking again in 30 seconds...
[SCHEDULER] [INFO] There are still 1608 pending pods. Expected 1614 pending pods. Checking again in 30 seconds...
[SCHEDULER] [INFO] There are still 1601 pending pods. Expected 1614 pending pods. Checking again in 30 seconds...
...
[SCHEDULER] [INFO] There are still 1445 pending pods. Expected 1614 pending pods. Checking again in 30 seconds...
[SCHEDULER] [INFO] There are still 924 pending pods. Expected 1614 pending pods. Checking again in 30 seconds...
[SCHEDULER] [INFO] There are still 567 pending pods. Expected 1614 pending pods. Checking again in 30 seconds...
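
One possible way to avoid the hang is sketched below. It is not taken from build-scheduler.sh: the function names `count_pending_pods` and `wait_for_pending_at_most`, the timeout, and the log messages are all hypothetical. It assumes the intent of the wait is "the Pending count has dropped to the target or lower", so it compares with `-le` instead of testing for equality, and it gives up after a timeout instead of looping forever.

```shell
#!/usr/bin/env bash
# Sketch only: the names below are hypothetical, not from build-scheduler.sh.

# Print the number of Pending pods across all namespaces.
count_pending_pods() {
  kubectl get pods --all-namespaces \
    --field-selector=status.phase=Pending --no-headers 2>/dev/null \
    | wc -l | tr -d ' '
}

# Wait until the Pending count is at or below the target, or time out.
# Args: target, poll interval in seconds (default 30), timeout in seconds (default 1800).
# Returns 0 once the condition holds, 1 on timeout.
wait_for_pending_at_most() {
  local target=$1 interval=${2:-30} timeout=${3:-1800}
  local elapsed=0 pending
  while :; do
    pending=$(count_pending_pods)
    # Assumption: "at or below" is acceptable, so a poll that skips
    # past the exact target still ends the wait.
    if [ "$pending" -le "$target" ]; then
      echo "Pending pods: $pending (target: at most $target). Continuing."
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timed out after ${elapsed}s with $pending pending pods (target: $target)." >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}
```

In the script this might be called as `wait_for_pending_at_most "$DESIRED_PENDING_PODS" 30 1800 || exit 1`. If the emulation really does need a count close to the target rather than merely below it, a tolerance window around DESIRED_PENDING_PODS would be an alternative, but the timeout is still needed to guarantee the loop terminates.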
