Add batch support for speculative decoding and its pruning #38

Merged
HaibaraAiChan merged 1 commit into main from spec_dec
Feb 3, 2026
@xiongxu1998
Collaborator

  1. Add batch support for speculative decoding. Padding tokens are used in the inputs to align the different samples in a batch.
  2. Add batch support for speculative decoding pruning. When hidden states are transferred between servers, the batch samples are flattened into a single sequence.
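
The two steps above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code from this PR: the helper names (`pad_batch`, `flatten_batch`, `unflatten_batch`), the `PAD_ID` value, the choice of right-padding, and the decision to drop padded positions when flattening are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

PAD_ID = 0  # hypothetical padding token id, chosen only for this sketch


def pad_batch(
    samples: Sequence[Sequence[int]], pad_id: int = PAD_ID
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad token-id lists to a common length and return (padded, mask)."""
    max_len = max((len(s) for s in samples), default=0)
    padded = [list(s) + [pad_id] * (max_len - len(s)) for s in samples]
    # mask is 1 for real tokens and 0 for padding positions
    mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in samples]
    return padded, mask


def flatten_batch(
    hidden: Sequence[Sequence[List[float]]], mask: Sequence[Sequence[int]]
) -> Tuple[List[List[float]], List[int]]:
    """Concatenate the non-padding hidden states of every sample into one sequence.

    Returns the flat sequence plus per-sample lengths so the receiver can split it again.
    """
    flat: List[List[float]] = []
    lengths: List[int] = []
    for rows, row_mask in zip(hidden, mask):
        kept = [row for row, keep in zip(rows, row_mask) if keep]
        flat.extend(kept)
        lengths.append(len(kept))
    return flat, lengths


def unflatten_batch(
    flat: Sequence[List[float]], lengths: Sequence[int]
) -> List[List[List[float]]]:
    """Split a flat sequence back into per-sample sequences using the lengths."""
    out: List[List[List[float]]] = []
    start = 0
    for n in lengths:
        out.append(list(flat[start:start + n]))
        start += n
    return out
```

Sending the per-sample lengths alongside the flattened sequence is one simple way to let the receiving side restore the batch layout; the actual transfer format used in the PR may differ.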

@HaibaraAiChan merged commit 8879eaa into main on Feb 3, 2026
1 check passed
@HaibaraAiChan deleted the spec_dec branch on February 3, 2026 04:51
JiuChen0 pushed a commit to JiuChen0/BloomBee that referenced this pull request Mar 22, 2026