[CI][XPU] enable unit test for XPU device#2814

Closed
DiweiSun wants to merge 25 commits into pytorch:main from DiweiSun:molly/enable_xpu_ci

Conversation

@DiweiSun
Contributor

This PR enables CI testing for torchao on the Intel XPU (GPU) platform to ensure functional correctness, performance consistency, and long-term compatibility as both torchao and XPU support evolve.

@pytorch-bot

pytorch-bot bot commented Aug 20, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2814

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 10 Cancelled Jobs

As of commit 030121f with merge base 2db4c76:

NEW FAILURE - The following job has failed:

CANCELLED JOBS - The following jobs were cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Aug 20, 2025
Comment on lines +38 to +66
- name: Clean all stopped docker containers
  if: always()
  shell: bash
  run: |
    # Prune all stopped containers.
    # If other runner is pruning on this node, will skip.
    nprune=$(ps -ef | grep -c "docker container prune")
    if [[ $nprune -eq 1 ]]; then
      docker container prune -f
    fi

- name: Runner health check GPU count
  if: always()
  shell: bash
  run: |
    ngpu=$(timeout 30 clinfo -l | grep -c -E 'Device' || true)
    msg="Please file an issue on pytorch/ao reporting the faulty runner. Include a link to the runner logs so the runner can be identified"
    if [[ $ngpu -eq 0 ]]; then
      echo "Error: Failed to detect any GPUs on the runner"
      echo "$msg"
      exit 1
    fi

- name: Use following to pull public copy of the image
  id: print-ghcr-mirror
  shell: bash
  run: |
    echo "docker pull ${DOCKER_IMAGE}"
    docker pull ${DOCKER_IMAGE}
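
A minimal sketch of why the health check above ends its pipeline with `|| true`: `grep -c` prints `0` but exits with status 1 when nothing matches, and a step run with `set -e` would then abort before it could print its own error message. The `count_devices` helper and the sample listing are hypothetical and only illustrate the counting logic; real `clinfo -l` output may look different.

```shell
set -eu

# Hypothetical helper: count the lines on stdin that mention "Device".
# grep -c exits non-zero when the count is 0, so `|| true` keeps `set -e`
# from aborting before the caller can report the problem itself.
count_devices() {
  grep -c -E 'Device' || true
}

# Assumed sample listing; not captured from a real runner.
ngpu=$(printf '%s\n' \
  'Platform #0: Example OpenCL platform' \
  ' +-- Device #0: Example GPU A' \
  ' +-- Device #1: Example GPU B' | count_devices)
echo "detected: $ngpu"   # prints "detected: 2"

# With no matching line the helper still succeeds and reports 0.
none=$(printf '%s\n' 'no platforms found' | count_devices)
echo "detected: $none"   # prints "detected: 0"
```

Steps that set `shell: bash` in GitHub Actions normally run with `-e -o pipefail`, which is why the guard matters in the snippet under review.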
Contributor

Contributor Author

Porting is done. Please help review.

    if-no-files-found: ignore
    path: ./**/core.[1-9]*

- name: Teardown XPU
Contributor

Can reuse the action in pytorch directly

Contributor Author

yes, this is literally ported from pytorch

DiweiSun and others added 2 commits August 22, 2025 14:30
Co-authored-by: Wang, Chuanqi <chuanqi.wang@intel.com>
Co-authored-by: Wang, Chuanqi <chuanqi.wang@intel.com>
@liangan1 liangan1 mentioned this pull request Sep 2, 2025
9 tasks
@liangan1 liangan1 added the topic: for developers and ci labels Sep 4, 2025
DiweiSun and others added 2 commits September 4, 2025 14:43
Co-authored-by: Wang, Chuanqi <chuanqi.wang@intel.com>
@DiweiSun DiweiSun changed the title from "Molly/enable xpu ci" to "[CI][XPU] enable unit test for XPU device" Sep 8, 2025
      - ciflow/xpu/*
  pull_request:
    branches:
      - main
Contributor Author

Remove the pull_request trigger after review.

@chuanqi129
Contributor

@pytorchbot label "ciflow/xpu"

@pytorch-bot

pytorch-bot bot commented Sep 10, 2025

Didn't find following labels among repository labels: ciflow/xpu

@liangan1
Collaborator

@pytorchbot label "ciflow/xpu"

@pytorch-bot pytorch-bot bot added the ciflow/xpu label Sep 15, 2025
@pytorch-bot

pytorch-bot bot commented Sep 15, 2025

Unknown label ciflow/xpu.
Currently recognized labels are

  • ciflow/benchmark
  • ciflow/tutorials
  • ciflow/rocm
  • ciflow/4xh100

@pytorch-bot pytorch-bot bot removed the ciflow/xpu label Sep 16, 2025
@liangan1
Collaborator

@pytorchbot label "ciflow/xpu"

@pytorch-bot pytorch-bot bot added the ciflow/xpu label Sep 16, 2025
@pytorch-bot

pytorch-bot bot commented Sep 16, 2025

Warning: Unknown label ciflow/xpu.
Currently recognized labels are

  • ciflow/benchmark
  • ciflow/tutorials
  • ciflow/rocm
  • ciflow/4xh100

Please add the new label to .github/pytorch-probot.yml
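
The bot's request can be sketched as follows. This is a minimal illustration that assumes the recognized labels are kept as a list under a `ciflow_push_tags` key in `.github/pytorch-probot.yml`; the key name and file layout are assumptions not shown in this thread, and the script edits a temporary stand-in rather than the real file.

```shell
set -eu

# Temporary stand-in for .github/pytorch-probot.yml (layout is assumed).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
ciflow_push_tags:
- ciflow/benchmark
- ciflow/tutorials
- ciflow/rocm
- ciflow/4xh100
EOF

# Hypothetical helper: succeeds only if the label is an exact list entry.
has_label() {
  grep -qxF -e "- $1" "$2"
}

label="ciflow/xpu"
if has_label "$label" "$cfg"; then before="known"; else before="unknown"; fi
echo "before: $before"   # prints "before: unknown"

# Appending is enough here only because the list is the last block in the file.
printf '%s\n' "- $label" >> "$cfg"

if has_label "$label" "$cfg"; then after="known"; else after="unknown"; fi
echo "after: $after"     # prints "after: known"

rm -f "$cfg"
```

In the real repository the equivalent change would be a one-line addition to the label list, committed alongside the workflow that the label triggers.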

@pytorch-bot pytorch-bot bot removed the ciflow/xpu label Sep 17, 2025
@liangan1 liangan1 added the ciflow/xpu label Sep 17, 2025
@pytorch-bot

pytorch-bot bot commented Sep 17, 2025

Unknown label ciflow/xpu.
Currently recognized labels are

  • ciflow/benchmark
  • ciflow/tutorials
  • ciflow/rocm
  • ciflow/4xh100

@DiweiSun
Contributor Author

@pytorchbot label "ciflow/xpu"

@liangan1
Collaborator

@pytorchbot label "ciflow/xpu"

@liangan1 liangan1 removed the ciflow/xpu label Sep 17, 2025
@liangan1
Collaborator

@pytorchbot label "ciflow/xpu"

@pytorch-bot pytorch-bot bot added the ciflow/xpu label Sep 17, 2025
@pytorch-bot

pytorch-bot bot commented Sep 17, 2025

Unknown label ciflow/xpu.
Currently recognized labels are

  • ciflow/benchmark
  • ciflow/tutorials
  • ciflow/rocm
  • ciflow/4xh100

@DiweiSun DiweiSun closed this Sep 24, 2025

Labels

  • ci
  • ciflow/xpu (label used to trigger xpu CI jobs)
  • CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
  • topic: for developers (Use this tag if this PR is mainly developer facing)


3 participants