Open
Description
This issue tracks the vLLM-XPU roadmap and feature development plan. We will try to move this to the vLLM main repo.
- kernel migration. [RFC]: XPU kernel migration to vllm-xpu-kernels vllm#33214
- torch accelerator API replacement. [RFC]: Replace torch.cuda API with torch.accelerator for better hardware compatibility. vllm#30679
- (WIP) XPU graph functionality support. [Feature][XPU]: XPU graph support vllm#26970
- MLA support. [XPU] support MLA model on Intel GPU vllm#37143
- sparse MLA support @wuxun-zhang
- WoQ compressed tensor support on BMG (Wint4A16/Wfp8A16, gemm/moe_gemm)
- XPU CI pipeline optimization. [RFC][XPU]: Enable Intel XPU CI for vLLM vllm#37305
- XPU Dockerfile refinement. [XPU]Replace pip in docker.xpu with uv pip vllm#31112