
Support swiglustep and mul #199

Open

Dboyqiao wants to merge 3 commits into vllm-project:main from Dboyqiao:dev/zhefeng/swiglustep_and_mul

Conversation

@Dboyqiao

Essential Elements of an Effective PR Description Checklist

  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing a test command.
  • The test results, such as pasting a before/after comparison or e2e results.
  • (Optional) Necessary documentation updates, such as updating supported_models.md and examples for a new model.


Purpose

Support swiglustep and mul

Test Plan

python -m pytest tests/test_swiglustep_and_mul.py -v

Test Result

Pass

(Optional) Documentation Update


Copilot AI review requested due to automatic review settings, March 18, 2026 03:19
Contributor

Copilot AI left a comment


Pull request overview

Adds support for a new fused activation (swiglustep_and_mul) across the XPU extension stack (C++/SYCL kernel → Torch binding → Python dispatch), plus accompanying unit test and benchmark.

Changes:

  • Add swiglustep activation option to the fused MoE Python interface.
  • Register and bind a new torch.ops._C.swiglustep_and_mul XPU operator and implement its SYCL kernel.
  • Add unit test coverage and a benchmark script for the new op.
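The authoritative semantics are the SYCL kernel in csrc/activation.cpp, but as a rough sketch of what a fused "swiglustep and mul" op plausibly does: a SwiGLU-style gated activation whose extra limit argument (7.0 at the dispatch site) caps the gate before SiLU is applied. The reference below is a hypothetical illustration, not the PR's actual implementation; the clamp behavior and the function name are assumptions.

```python
import torch
import torch.nn.functional as F

def swiglustep_and_mul_ref(x: torch.Tensor, limit: float = 7.0) -> torch.Tensor:
    """Hypothetical native sketch: split the last dim in half, clamp the
    gate at `limit` (the assumed "step"), apply SiLU, and multiply by the
    second half. The real semantics live in csrc/activation.cpp."""
    d = x.shape[-1] // 2
    gate, up = x[..., :d], x[..., d:]
    return F.silu(gate.clamp(max=limit)) * up

x = torch.randn(4, 2 * 8)          # input shaped [..., 2 * d]
out = swiglustep_and_mul_ref(x)    # output shaped [4, 8]
```

A reference like this is what the PR's test harness (tests/ops/swiglustep_and_mul_op.py) would compare the fused XPU kernel against.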

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

Summary per file:

  • vllm_xpu_kernels/fused_moe_interface.py: routes activation="swiglustep" to the new fused op.
  • csrc/activation.cpp: implements the swiglustep_and_mul device function, kernel, and launcher.
  • csrc/torch_bindings.cpp: registers the new op schema and XPU implementation.
  • csrc/ops.h: declares the new C++ op entrypoint.
  • tests/register_ops.py: adds a Python test wrapper for the new op.
  • tests/ops/swiglustep_and_mul_op.py: adds a CustomOp test harness and native reference implementation.
  • tests/test_swiglustep_and_mul.py: adds pytest coverage and opcheck for the new op.
  • benchmark/benchmark_swiglustep_and_mul.py: adds performance benchmarking for the op vs native/compile.


Comment on lines +12 to +15
XPU_DEVICES = [
f"xpu:{i}" for i in range(1 if torch.xpu.device_count() == 1 else 2)
]
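The conditional in the snippet above, range(1 if torch.xpu.device_count() == 1 else 2), can be written more directly with min(). The helper below is a hypothetical rewrite, not code from the PR; note it also yields an empty list when no XPU is present, rather than falling through to two device names.

```python
def xpu_device_list(device_count: int) -> list[str]:
    # Cap the test matrix at two devices; min() covers the single-device,
    # multi-device, and zero-device cases uniformly.
    return [f"xpu:{i}" for i in range(min(device_count, 2))]

# e.g. XPU_DEVICES = xpu_device_list(torch.xpu.device_count())
```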

Comment on lines +54 to +57
torch.set_default_device(device)
x = torch.randn(num_tokens, 2 * d, dtype=dtype)

layer = SwigluStepAndMul()

Comment on lines +76 to +77
d = x.shape[-1] // 2
output_shape = (x.shape[:-1] + (d, ))
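The shape arithmetic in the two lines above relies on tuple slicing and concatenation: every leading dimension is preserved and the last is halved. A plain-Python illustration (the concrete dims are hypothetical):

```python
shape = (2, 3, 16)                 # stands in for x.shape, with 2 * d == 16
d = shape[-1] // 2                 # d = 8
output_shape = shape[:-1] + (d,)   # keep leading dims, halve the last one
assert output_shape == (2, 3, 8)
```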

Comment on lines +298 to +299
elif activation == "swiglustep":
torch.ops._C.swiglustep_and_mul(act_output, gemm1_output, 7.0)

Comment on lines +11 to +13
from tests.ops.swiglustep_and_mul_op import SwigluStepAndMul


Signed-off-by: Qiao, Zhefeng <zhefeng.qiao@intel.com>
@Dboyqiao force-pushed the dev/zhefeng/swiglustep_and_mul branch from 21a7552 to bef820a on March 18, 2026 07:11
torch::Tensor& input, // [..., 2 * d]
double limit) {
LAUNCH_SWIGLUSTEP_AND_MUL(vllm::swiglustep_and_mul, limit);
}
\ No newline at end of file
Collaborator


add blank line

Author


fixed

Signed-off-by: Qiao, Zhefeng <zhefeng.qiao@intel.com>

3 participants