[Torch] Handle dynamic head dimensions for attention #23636
Merged
IanWood1 merged 6 commits into iree-org:main on Apr 2, 2026
2566450 to d556726
keshavvinayak01 (Contributor) approved these changes on Mar 6, 2026:
Ah, I mistakenly opened up a duplicate #23680 as well. This implementation LGTM!
Contributor:
nit though, would it make sense to add this?
Compute the attention scale as rsqrt(head_dim) using a single code path for both static and dynamic head dimensions. For static dims, createOrFold constant-folds the dim/cast/sitofp chain; for dynamic dims, the full runtime computation is emitted. Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>
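The single-code-path idea in this commit message can be illustrated with a small toy model. This is only a sketch, not the MLIR API: `Const`, `Op`, `create_or_fold`, and `attention_scale` are hypothetical names, a head dimension of `None` stands in for a dynamic dimension, and the op names are used purely as labels.

```python
import math
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Const:
    """A value known at compile time."""
    value: float


@dataclass(frozen=True)
class Op:
    """An operation that must be evaluated at runtime."""
    name: str
    operands: tuple


Value = Union[Const, Op]


def create_or_fold(name, fn, *operands) -> Value:
    # Hypothetical stand-in for a create-or-fold builder call:
    # fold when every operand is constant, otherwise emit the op.
    if all(isinstance(o, Const) for o in operands):
        return Const(fn(*(o.value for o in operands)))
    return Op(name, operands)


def attention_scale(head_dim: Optional[int]) -> Value:
    # One code path for both cases; None models a dynamic head dimension.
    dim = Const(head_dim) if head_dim is not None else Op("tensor.dim", ())
    as_int = create_or_fold("arith.index_cast", int, dim)
    as_float = create_or_fold("arith.sitofp", float, as_int)
    return create_or_fold("math.rsqrt", lambda x: 1.0 / math.sqrt(x), as_float)


static_scale = attention_scale(4)      # folds to a single constant
dynamic_scale = attention_scale(None)  # stays as a chain of runtime ops
```

In this model the static case collapses to one `Const`, while the dynamic case keeps the whole dim/cast/sitofp/rsqrt chain, which is the behavior the commit message describes.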
math::RsqrtOp lacks a constant folder, so the static head dim case was no longer fully folded to a single constant. Switch to math::SqrtOp + arith::DivFOp which both have folders, restoring the original folded output (e.g. arith.constant 5.000000e-01). Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>
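Why a missing folder matters can be sketched with a toy folder table. The tables and the `create_or_fold` helper below are hypothetical illustrations, not MLIR code: an op folds only if its name has an entry in the table and all of its operands are already constants. The last line models the situation mentioned later in this thread, where a folder for math.rsqrt exists.

```python
import math

# Hypothetical folder tables: op name -> function applied to constant operands.
FOLDERS_WITHOUT_RSQRT = {
    "math.sqrt": math.sqrt,
    "arith.divf": lambda a, b: a / b,
}
FOLDERS_WITH_RSQRT = {
    **FOLDERS_WITHOUT_RSQRT,
    "math.rsqrt": lambda x: 1.0 / math.sqrt(x),
}


def create_or_fold(folders, name, *operands):
    # Returns a float when the op folds, otherwise a (name, operands)
    # tuple standing in for an op that is emitted unfolded.
    fold = folders.get(name)
    if fold is not None and all(isinstance(o, float) for o in operands):
        return fold(*operands)
    return (name, operands)


head_dim = 4.0

# No rsqrt folder: the static case is left as an unfolded op.
unfolded = create_or_fold(FOLDERS_WITHOUT_RSQRT, "math.rsqrt", head_dim)

# sqrt followed by divf: both fold, so a single constant remains.
root = create_or_fold(FOLDERS_WITHOUT_RSQRT, "math.sqrt", head_dim)
workaround = create_or_fold(FOLDERS_WITHOUT_RSQRT, "arith.divf", 1.0, root)

# With an rsqrt folder available, the direct form folds as well.
folded = create_or_fold(FOLDERS_WITH_RSQRT, "math.rsqrt", head_dim)
```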
This reverts commit d556726. Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>
d556726 to 837958e
Member, Author:
I added a folder for
Member, Author:
My thought was that using
Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>
keshavvinayak01 (Contributor) approved these changes on Apr 1, 2026:
Shouldn't we be merging this?
Contributor:
Seems like a genuine test failure:
createOrFold folds the entire rsqrt(head_dim) computation at compile time when the head dimension is static, producing a single constant (e.g. 0.5 for head_dim=4) rather than the unfolded arith.constant + math.rsqrt pattern. Update CHECK lines accordingly. Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>
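As a quick check of the constant quoted in this commit message, taking the scale to be rsqrt(head_dim) as stated above (the symbol $d_{\text{head}}$ is introduced here only for illustration):

$$
\text{scale} = \frac{1}{\sqrt{d_{\text{head}}}} = \frac{1}{\sqrt{4}} = \frac{1}{2} = 0.5
$$

This matches the folded value 5.000000e-01 mentioned earlier in the thread.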
0ea142b to f9bda38
Member, Author:
Fixed, I forgot about my changes that added a folder for math.rsqrt.
Uses createOrFold to handle the dynamic case by creating a tensor.dim + math.rsqrt op.