Qwen 3.5 MoE Metal: Use max-sized prefill example for dynamic inputs by manuelcandales · Pull Request #18956 · pytorch/executorch

manuelcandales · 2026-04-16T22:04:22Z

With alloc_graph_input=False, ExecuTorch sets the input tensor's
numel_bound_ from the serialized example size. A small example (T=2)
prevents runtime inputs larger than 2 tokens. Use max_seq_len-1 as
the prefill example size so any prompt length is accepted at runtime.
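
To make the failure mode concrete, here is a minimal pure-Python sketch of the capacity check described above. It is not ExecuTorch code: `BoundedInput`, `prefill_example_shape`, and the `MAX_SEQ_LEN` value are hypothetical names and numbers chosen for illustration, and the only behavior modeled is that an input's element-count bound comes from the example shape used at export time.

```python
from math import prod


class BoundedInput:
    """Toy stand-in for an input tensor whose capacity is fixed at export time."""

    def __init__(self, example_shape):
        # Capacity is taken from the example shape, mirroring the description
        # of numel_bound_ being set from the serialized example size.
        self.numel_bound = prod(example_shape)

    def accepts(self, runtime_shape):
        # A runtime input fits only if its element count is within the bound.
        return prod(runtime_shape) <= self.numel_bound


def prefill_example_shape(max_seq_len, batch=1):
    # Hypothetical helper: size the prefill example at max_seq_len - 1 tokens.
    return (batch, max_seq_len - 1)


MAX_SEQ_LEN = 4096  # assumed value, for illustration only

small = BoundedInput((1, 2))                              # T=2 example
large = BoundedInput(prefill_example_shape(MAX_SEQ_LEN))  # max-sized example

print(small.accepts((1, 3)))                # a 3-token prompt vs. the T=2 bound
print(large.accepts((1, MAX_SEQ_LEN - 1)))  # the longest prompt vs. the max-sized bound
```

Under these assumptions, the T=2 example rejects any prompt longer than 2 tokens, while the max-sized example admits every prompt length up to `max_seq_len - 1`.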

Authored with Claude.

[ghstack-poisoned]

manuelcandales · 2026-04-16T22:04:23Z

Stack from ghstack (oldest at bottom):

pytorch-bot · 2026-04-16T22:04:27Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18956

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

[ghstack-poisoned]

With alloc_graph_input=False, ExecuTorch sets the input tensor's numel_bound_ from the serialized example size. A small example (T=2) prevents runtime inputs larger than 2 tokens. Use max_seq_len-1 as the prefill example size so any prompt length is accepted at runtime.

Authored with Claude.

ghstack-source-id: 0efb42f
ghstack-comment-id: 4263712315
Pull-Request: #18956

[ghstack-poisoned]

With alloc_graph_input=False, ExecuTorch sets the input tensor's numel_bound_ from the serialized example size. A small example (T=2) prevents runtime inputs larger than 2 tokens. Use max_seq_len-1 as the prefill example size so any prompt length is accepted at runtime.

Authored with Claude.

ghstack-source-id: 6dc014e
ghstack-comment-id: 4263712315
Pull-Request: #18956

[ghstack-poisoned]

With alloc_graph_input=False, ExecuTorch sets the input tensor's numel_bound_ from the serialized example size. A small example (T=2) prevents runtime inputs larger than 2 tokens. Use max_seq_len-1 as the prefill example size so any prompt length is accepted at runtime.

Authored with Claude.

ghstack-source-id: 601c7ed
ghstack-comment-id: 4263712315
Pull-Request: #18956


github-actions · 2026-04-21T18:50:37Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

manuelcandales added 24 commits April 14, 2026 12:25

Update

a3a42e4

[ghstack-poisoned]

Update

1c965c6

[ghstack-poisoned]

Update

1be53ab

[ghstack-poisoned]

Update

47cbe76

[ghstack-poisoned]

Update

805a09d

[ghstack-poisoned]

Update

5306c5a

[ghstack-poisoned]

Update

638edaa

[ghstack-poisoned]

Update

ca524a8

[ghstack-poisoned]

Update

958712e

[ghstack-poisoned]

Update

eba74c4

[ghstack-poisoned]

Update

c9ecdde

[ghstack-poisoned]

Update

c222005

[ghstack-poisoned]

Update

982d0d9

[ghstack-poisoned]

Update

e7a7acc

[ghstack-poisoned]

Update

5530242

[ghstack-poisoned]

Update

59f88db

[ghstack-poisoned]

Update

1fbb94f

[ghstack-poisoned]

Update

60ca500

[ghstack-poisoned]

Update

d70d646

[ghstack-poisoned]

Update

d80da37

[ghstack-poisoned]

Update

598c58f

[ghstack-poisoned]

Update

f8ff857

[ghstack-poisoned]

Update

ae7a13e

[ghstack-poisoned]

Update

58fe35f

[ghstack-poisoned]

manuelcandales requested review from larryliu0820, lucylq and mergennachin as code owners April 16, 2026 22:04

meta-cla Bot added the CLA Signed label Apr 16, 2026

manuelcandales added 4 commits April 20, 2026 16:03

Update

ff92256

[ghstack-poisoned]

Update

b9b75e3

[ghstack-poisoned]

Update

d761fdb

[ghstack-poisoned]

Update

fd0eb6e

[ghstack-poisoned]

manuelcandales added 5 commits April 21, 2026 13:55

Update

f8ebcfb

[ghstack-poisoned]

Update

4cf31c8

[ghstack-poisoned]

Update

ba0e56e

[ghstack-poisoned]

Update

3285bb2

[ghstack-poisoned]

Update

24dd7b5

[ghstack-poisoned]

manuelcandales added 4 commits April 21, 2026 14:01

Update

187e4f5

[ghstack-poisoned]

Update

23bec62

[ghstack-poisoned]

Update

f031916

[ghstack-poisoned]

Update

267342c

[ghstack-poisoned]

manuelcandales added 3 commits April 21, 2026 14:13

Update

c53ecc6

[ghstack-poisoned]

Update

3b7f7ce

[ghstack-poisoned]

Update

4f2353c

[ghstack-poisoned]

manuelcandales added 2 commits April 21, 2026 14:43

Update

f697e84

[ghstack-poisoned]

Update

6f251f1

[ghstack-poisoned]

Base automatically changed from gh/manuelcandales/176/head to main April 21, 2026 18:49

manuelcandales requested a review from kirklandsign as a code owner April 21, 2026 18:49

Update

987b7bb

[ghstack-poisoned]

manuelcandales merged commit f13b783 into main Apr 21, 2026
182 of 188 checks passed

manuelcandales deleted the gh/manuelcandales/177/head branch April 21, 2026 18:59

manuelcandales merged 79 commits into main from gh/manuelcandales/177/head

manuelcandales commented Apr 16, 2026

Uh oh!

manuelcandales commented Apr 16, 2026 •

edited

Loading

Uh oh!

pytorch-bot Bot commented Apr 16, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Apr 21, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

manuelcandales commented Apr 16, 2026

Uh oh!

manuelcandales commented Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pytorch-bot Bot commented Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18956

Uh oh!

github-actions Bot commented Apr 21, 2026

This PR needs a release notes: label

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

manuelcandales commented Apr 16, 2026 •

edited

Loading

pytorch-bot Bot commented Apr 16, 2026 •

edited

Loading

This PR needs a `release notes:` label