Qwen3.5 support and optimization plan #172

We would like to support the Qwen3.5 model functionally first, and then optimize kernel performance to improve end-to-end (E2E) performance.

## Functionality

- [ ] add XPU GDN op support to vLLM https://github.com/vllm-project/vllm/pull/33657
  - this should be updated and merged after the refactor PR https://github.com/vllm-project/vllm/pull/37975
- [ ] support fp32 ssm_state in the chunk_fwd_o kernel https://github.com/vllm-project/vllm-xpu-kernels/pull/220

## Performance optimizations

### GDN attention

Base kernel version: https://github.com/vllm-project/vllm-xpu-kernels/pull/156

- [ ] optimize l2norm kernel https://github.com/vllm-project/vllm-xpu-kernels/pull/222
- [ ] optimize chunk_fwd_o kernel
- [ ] optimize grouped gemm kernel
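
For checking an optimized l2norm kernel against known-good values, a plain reference can help. The sketch below is only an assumption about the intended semantics (per-row scaling to unit L2 norm, with a small `eps` inside the square root); the name `l2norm_ref` and the default `eps` are hypothetical and not taken from the linked PRs.

```python
import math


def l2norm_ref(row, eps=1e-6):
    """Hypothetical reference: scale one row to (approximately) unit L2 norm.

    eps is added to the sum of squares so an all-zero row does not divide by zero.
    """
    inv_norm = 1.0 / math.sqrt(sum(v * v for v in row) + eps)
    return [v * inv_norm for v in row]
```

A kernel test could compare its output row by row against this function within a tolerance suited to the dtype under test.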

### Layer Norm

- [ ] add SYCL kernels for GemmaRMSNorm and RMSNormGated https://github.com/vllm-project/vllm-xpu-kernels/pull/214
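
As a rough numerical reference for these two norms, the sketch below shows the math a SYCL kernel could be validated against. It rests on assumptions that are not confirmed by this issue: that GemmaRMSNorm scales by `(1 + weight)`, and that RMSNormGated normalizes first and then multiplies by `silu(gate)`. All function names here are hypothetical.

```python
import math


def _inv_rms(x, eps):
    # Reciprocal root-mean-square of one row.
    return 1.0 / math.sqrt(sum(v * v for v in x) / len(x) + eps)


def gemma_rms_norm_ref(x, weight, eps=1e-6):
    """Assumed Gemma-style RMSNorm: x / rms(x) * (1 + weight)."""
    s = _inv_rms(x, eps)
    return [v * s * (1.0 + w) for v, w in zip(x, weight)]


def rms_norm_gated_ref(x, gate, weight, eps=1e-6):
    """Assumed norm-before-gate variant: (x / rms(x) * weight) * silu(gate)."""
    s = _inv_rms(x, eps)

    def silu(g):
        return g / (1.0 + math.exp(-g))

    return [v * s * w * silu(g) for v, w, g in zip(x, weight, gate)]
```

If the model uses a gate-before-norm variant or a grouped norm instead, the reference would need to change accordingly before being used as a correctness baseline.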

### Qwen3 VisionTransformer

(placeholder)
