Trigger unit tests for docker images upload #2924

Open

xibinliu wants to merge 1 commit into main from xibin/ci

+264 −133

Collaborator

xibinliu commented Jan 9, 2026

Description

Trigger the CI with the stable and nightly images

Tests

Image Build and Test workflows run by this PR: here

Checklist

Before submitting this PR, please make sure (put X in square brackets):

I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
I have necessary comments in my code, particularly in hard-to-understand areas.
I have run end-to-end tests and provided workload links above if applicable.
I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

xibinliu force-pushed the xibin/ci branch 5 times, most recently from 76827fc to d932d98

January 9, 2026 03:34

xibinliu marked this pull request as ready for review

January 9, 2026 04:04

xibinliu requested review from bvandermoon, gobbleturk, khatwanimohit, parambole, richjames0 and shralex as code owners

January 9, 2026 04:04

xibinliu force-pushed the xibin/ci branch 3 times, most recently from 0f982ee to 6e5c3de

January 9, 2026 04:45

codecov bot commented Jan 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

xibinliu force-pushed the xibin/ci branch from 6e5c3de to 650d10b

January 9, 2026 21:12

SurbhiJainUSC reviewed

.github/workflows/UploadDockerImages.yml (outdated, resolved)

SurbhiJainUSC reviewed

.github/workflows/build_and_test_maxtext.yml (outdated, resolved)

SurbhiJainUSC reviewed

.github/workflows/UploadDockerImages.yml (outdated, resolved)

xibinliu force-pushed the xibin/ci branch 4 times, most recently from 86f6416 to 3909d48

January 22, 2026 21:08

SurbhiJainUSC reviewed

.github/workflows/UploadDockerImages.yml (outdated, resolved)

SurbhiJainUSC reviewed

.github/workflows/UploadDockerImages.yml (outdated, resolved)

xibinliu force-pushed the xibin/ci branch from 3909d48 to 0e378d2

January 22, 2026 23:24

SurbhiJainUSC approved these changes

shralex reviewed

.github/workflows/UploadDockerImages.yml (outdated, resolved)

.github/workflows/run_tests_against_package.yml (outdated, resolved)

.github/workflows/UploadDockerImages.yml (resolved)

xibinliu force-pushed the xibin/ci branch from 0e378d2 to acafa4d

January 23, 2026 19:53

xibinliu force-pushed the xibin/ci branch from acafa4d to 472aa64

January 23, 2026 20:21

github-advanced-security bot found potential problems

.github/workflows/run_tests_coordinator.yml (fixed)

xibinliu force-pushed the xibin/ci branch 2 times, most recently from a0b8391 to f237ce3

January 23, 2026 20:29

github-advanced-security bot found potential problems

.github/workflows/run_tests_coordinator.yml (fixed)

.github/workflows/run_tests_coordinator.yml (fixed)

xibinliu force-pushed the xibin/ci branch 2 times, most recently from 1db9c4e to acdad52

January 23, 2026 20:57

github-advanced-security bot found potential problems

.github/workflows/run_tests_coordinator.yml (fixed)

xibinliu force-pushed the xibin/ci branch 3 times, most recently from 4c71073 to 6f617e1

January 24, 2026 05:56

github-advanced-security bot found potential problems

.github/workflows/run_tests_coordinator.yml (fixed)

xibinliu force-pushed the xibin/ci branch 7 times, most recently from 47a7f7d to 2f286d7

January 24, 2026 06:35

shralex reviewed

.github/workflows/UploadDockerImages.yml (outdated, resolved)

.github/workflows/UploadDockerImages.yml

    
      # It runs automatically daily at 12am UTC, on Pull Requests, or manually via Workflow Dispatch.
    - name: Build Images
    + name: Build and Test Images

Collaborator

shralex Jan 24, 2026

Should we split this file, or add an option to distinguish stable image building and testing from nightly building and testing? Assuming that nightly might be frequently broken, my concern is that bundling the two would prevent us from publishing stable images to PyPI.

Collaborator Author

xibinliu Jan 27, 2026

The build and test jobs are independent for the stable and nightly images, so they won't impact each other. Whether we keep them in one file depends on how the PyPI publishing process uses the output from this workflow. @SurbhiJainUSC could you comment here?

Collaborator

SurbhiJainUSC Jan 27, 2026

This workflow uploads the Docker images to GCR and is independent of the PyPI release process. We will also be able to release the stable Docker image to GCR even if the nightly Docker image fails.

.github/workflows/UploadDockerImages.yml Outdated

    
    with:
      flavor: ${{ matrix.flavor }}
      base_image: ${{ matrix.image }}:${{ needs.setup.outputs.image_date }}
      is_scheduled_run: ${{ github.event_name == 'schedule' }}

Collaborator

shralex Jan 24, 2026

This should probably just be true throughout this file. The idea behind this flag is to run more tests when the run isn't affecting a PR.

Collaborator Author

xibinliu Jan 27, 2026

The trigger could be scheduled or manual for this workflow, but it is probably okay to run the full tests even with a manual trigger. I have now changed is_scheduled_run to true throughout the file.

.github/workflows/UploadDockerImages.yml (outdated, resolved)

.github/workflows/run_tests_coordinator.yml Outdated

    
      required: false
      type: boolean
      default: false
    worker_group:

Collaborator

shralex Jan 24, 2026

Why do we need to make worker_group and total_workers parameters? Do these change across invocations from UploadDockerImages and build_and_test_maxtext?

Collaborator Author

xibinliu Jan 27, 2026

Good point. I removed the "worker_group" and "total_workers" as parameters from the coordinator, and set them based on the other input (cpu or not).

.github/workflows/run_tests_coordinator.yml

    
          || (contains(inputs.flavor, 'gpu') && 'a100-40gb-4' || 'X64') }}
    cloud_runner: >-
      ${{ contains(inputs.flavor, 'tpu') && 'linux-x86-ct6e-180-4tpu'

Collaborator

shralex Jan 24, 2026

Instead of using complex selection operators here consider creating a setup job that maps the flavor to specific outputs that would then be used as parameters. This centralizes the logic and provides implicit validation. Something like this:

    configure:
      outputs:
        device_type: ${{ steps.map.outputs.device_type }}
        device_name: ${{ steps.map.outputs.device_name }}
        # ... other outputs
      steps:
        - id: map
          run: |
            case "${{ inputs.flavor }}" in
              "tpu-unit")
                echo "device_type=tpu" >> $GITHUB_OUTPUT
                echo "device_name=v6e-4" >> $GITHUB_OUTPUT
                # also for pytest markers etc
                ;;
              "gpu-unit")
                echo "device_type=cuda12" >> $GITHUB_OUTPUT
                echo "device_name=a100-40gb-4" >> $GITHUB_OUTPUT
                ;;
              *)
                echo "::error::Unsupported flavor: ${{ inputs.flavor }}"
                exit 1
                ;;
            esac

    execute-test-package:
      needs: configure
      uses: ./.github/workflows/run_tests_against_package.yml
      with:
        device_type: ${{ needs.configure.outputs.device_type }}
        # ...
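
The mapping step in the suggestion above can also be tried outside of CI. The following is a minimal sketch, not code from this PR: `map_flavor` is a hypothetical helper name, and it covers only the two flavors and values quoted in the comment. It illustrates the "implicit validation" point, since any flavor without an explicit branch fails instead of silently falling through to a default.

```shell
#!/usr/bin/env bash
# Sketch only: map_flavor is a hypothetical helper, not part of the PR.
# The flavors and values below are copied from the review suggestion.
map_flavor() {
  local flavor="$1"
  case "$flavor" in
    tpu-unit)
      echo "device_type=tpu"
      echo "device_name=v6e-4"
      ;;
    gpu-unit)
      echo "device_type=cuda12"
      echo "device_name=a100-40gb-4"
      ;;
    *)
      # Unknown flavors fail loudly. The message goes to stderr so it does
      # not mix with the key=value lines printed on stdout.
      echo "::error::Unsupported flavor: ${flavor}" >&2
      return 1
      ;;
  esac
}

# Inside a workflow step, the key=value lines would be appended to the
# step output file, for example:
#   map_flavor "tpu-unit" >> "$GITHUB_OUTPUT"
```

Passing the flavor as a function argument, rather than interpolating the workflow expression directly into the script text, also keeps the input out of the shell source itself.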

Collaborator Author

xibinliu Jan 27, 2026

The "configure" job has to run on a VM. Isn't this heavier than just using selection operators?

xibinliu force-pushed the xibin/ci branch from 2f286d7 to fd9a385

January 27, 2026 17:50


Trigger unit tests for docker images upload workflow

7880fc4

xibinliu force-pushed the xibin/ci branch from fd9a385 to 7880fc4

January 27, 2026 18:02

SurbhiJainUSC self-requested a review

January 27, 2026 23:06

Reviewers

shralex left review comments

gobbleturk: awaiting requested review (code owner)

khatwanimohit: awaiting requested review (code owner)

parambole: awaiting requested review (code owner)

bvandermoon: awaiting requested review (code owner)

richjames0: awaiting requested review (code owner)

SurbhiJainUSC: awaiting requested review

At least 2 approving reviews are required to merge this pull request.

Labels

None yet