Port "[Bugfix] Fix bucketing of query + num_blocks neighbor expansion" #350 #482

Closed
iboiko-habana wants to merge 2 commits into vllm-project:v0.10.2_next from iboiko-habana:port_pr350_pr355

Conversation

@iboiko-habana
Collaborator

No description provided.

@github-actions

✅ CI Passed

All checks passed successfully against the following vllm commit:
01efc7ef781391e744ed08c3292817a773d654e6

@github-actions

github-actions bot commented Nov 6, 2025

✅ CI Passed

All checks passed successfully against the following vllm commit:
01efc7ef781391e744ed08c3292817a773d654e6

kamil-kaczor added a commit that referenced this pull request Mar 18, 2026
Rebundled with 37min target based on build #482 actual timings:
- Merged unit_tests + eagle3 + calibration into one g3.l bundle (~34min)
- Merged awq_gptq + load_generate_llama70b (~37min)
- Merged qwen3_moe + deepseek_ernie_scaling (~36min)
- Merged perf + qwen3_fp8 + preemption + mistral3 + dsv2_blockfp8 (~34min)
- Kept sleep_granite (~36min), qwen25_vl_offloading (~31min) unchanged

Cards: 94 (original) -> 62 (now), pods: 23 -> 9

Signed-off-by: Kamil Kaczor <kamil.kaczor@intel.com>