
Enable GPTQ on Intel GPU #4191

Open
xiaowangintel wants to merge 2 commits into pytorch:main from xiaowangintel:xw/prototype_gptq

Conversation

@xiaowangintel
Collaborator

Summary
This PR enables GPTQ support on Intel GPU.

Previously, GPTQ workflows in torchao were primarily validated on CUDA. This PR extends support to XPU.

@pytorch-bot

pytorch-bot bot commented Mar 27, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/4191

Note: Links to docs will display an error until the docs builds have been completed.

❌ 8 New Failures

As of commit 950b4fb with merge base 4611835:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the CLA Signed label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Mar 27, 2026
@xiaowangintel requested review from jcaip and liangan1 on Mar 27, 2026 08:16
@xiaowangintel self-assigned this on Mar 27, 2026
@xiaowangintel added the module: inference, quantize_ api inference flow, and ciflow/xpu (label used to trigger xpu CI jobs) labels on Mar 27, 2026
if device == "cuda":
torch.cuda.empty_cache()
elif device == "xpu":
torch.xpu.empty_cache()
Collaborator

Suggest using the torch.accelerator API here instead of per-backend branches.
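For reference, the device-agnostic cleanup the reviewer is suggesting can be sketched as a dispatch on the backend name. In recent PyTorch this is roughly `torch.get_device_module(device).empty_cache()` (that helper is an assumption about the available torch version); the sketch below models the lookup with a plain dict so the pattern is self-contained:

```python
# Sketch only: a dict stands in for PyTorch's backend-module lookup
# (torch.get_device_module), so the pattern runs without torch installed.

def make_empty_cache(backends):
    """Build a device-agnostic empty_cache(device) from a backend table.

    `backends` maps a backend name ("cuda", "xpu", ...) to its cache-release
    callable. Devices with no entry (e.g. "cpu") are a no-op.
    """
    def empty_cache(device: str) -> bool:
        backend = backends.get(device.split(":")[0])  # "xpu:0" -> "xpu"
        if backend is None:
            return False  # nothing to release for this device
        backend()
        return True
    return empty_cache

# With real torch, the table would be
# {"cuda": torch.cuda.empty_cache, "xpu": torch.xpu.empty_cache}.
```

This collapses the `if device == "cuda": ... elif device == "xpu": ...` chain into a single call site that new backends extend without touching the caller.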

# Save model to generated output directory
print(f"Saving model to {output_dir}...")
tokenizer.save_pretrained(output_dir)
print("model:", model)
Collaborator

Suggested change (remove the debug print):
print("model:", model)


def main():
args = parse_args()
device = _get_device(args.device)
Collaborator

Suggest using torch.acc.get_current_xxx to get the current device information instead of adding a new --device parameter, since the extra flag changes user behavior.


@skip_if_lt_x_gpu(2)
@unittest.skipIf(not torch.accelerator.is_available(), "Need GPU available")
@unittest.skipIf(torch.xpu.is_available(), "XPU enablement in progress")
Collaborator

Please move the unrelated unit-test skips to a separate PR.

@jerryzh168
Contributor

This feature is not complete yet. @xiaowangintel @liangan1, can you restrict the contribution to stable features only for now? They are in https://github.com/pytorch/ao/tree/main/torchao/quantization/quantize_

@liangan1
Collaborator

> This feature is not complete yet. @xiaowangintel @liangan1, can you restrict the contribution to stable features only for now? They are in https://github.com/pytorch/ao/tree/main/torchao/quantization/quantize_

Sure. Thanks for your suggestion. We will focus on these features now.

