[TMTensor][LinalgExt] Fuse causal mask into attention decomposition#23999

Draft
keshavvinayak01 wants to merge 1 commit into iree-org:main from keshavvinayak01:users/keshavvinayak01/causal-mask-fusing

@keshavvinayak01
Contributor

Add an is_causal attribute to AttentionOp and OnlineAttentionOp. When it is set, the causal condition (k2 > m, where k2 indexes the key sequence and m the query sequence) is computed inline from loop indices during decomposition and fused into the max-reduction and exp steps. This eliminates the need for a materialized N×N mask tensor.
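
To illustrate the idea outside the compiler, here is a minimal pure-Python sketch, not the actual LinalgExt decomposition. It assumes that k2 > m marks positions to be masked out (as in standard causal attention), a scale of 1/sqrt(d), and a hypothetical tile size; the function names are invented for this example. The first function materializes the full mask, the second derives the same condition from the loop indices inside the max-reduction and exp steps of an online softmax.

```python
import math


def _scores(q_row, k_rows, scale):
    # Scaled dot products of one query row against a list of key rows.
    return [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k_rows]


def attention_materialized_mask(q, k, v):
    """Reference: build the whole mask first, then apply a plain softmax."""
    scale = 1.0 / math.sqrt(len(q[0]))  # assumed scale, not taken from the PR
    # Full M x K2 boolean mask held in memory: True means "masked out".
    mask = [[k2 > m for k2 in range(len(k))] for m in range(len(q))]
    out = []
    for m, q_row in enumerate(q):
        s = [-math.inf if mask[m][k2] else x
             for k2, x in enumerate(_scores(q_row, k, scale))]
        row_max = max(s)
        p = [math.exp(x - row_max) for x in s]
        total = sum(p)
        out.append([sum(p[k2] * v[k2][n] for k2 in range(len(k))) / total
                    for n in range(len(v[0]))])
    return out


def attention_fused_causal(q, k, v, tile=2):
    """Online softmax over key tiles; the causal test uses loop indices only."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for m, q_row in enumerate(q):
        running_max, running_sum = -math.inf, 0.0
        acc = [0.0] * len(v[0])
        for start in range(0, len(k), tile):
            k2s = range(start, min(start + tile, len(k)))
            s = _scores(q_row, [k[k2] for k2 in k2s], scale)
            # Max-reduction step: masked positions (k2 > m) contribute -inf.
            new_max = max([running_max] +
                          [-math.inf if k2 > m else x for k2, x in zip(k2s, s)])
            if new_max == -math.inf:
                continue  # no visible key seen so far
            # Exp step: masked positions contribute exactly 0.
            p = [0.0 if k2 > m else math.exp(x - new_max)
                 for k2, x in zip(k2s, s)]
            # Rescale earlier partial results to the new running max.
            correction = math.exp(running_max - new_max)
            running_sum = running_sum * correction + sum(p)
            acc = [a * correction + sum(pi * v[k2][n] for pi, k2 in zip(p, k2s))
                   for n, a in enumerate(acc)]
            running_max = new_max
        out.append([a / running_sum for a in acc])
    return out
```

Both functions should agree up to floating-point rounding; the fused variant never allocates the M x K2 mask, which is the saving the description refers to.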

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Keshav Vinayak Jha <keshavvinayakjha@gmail.com>
@keshavvinayak01 changed the title from "[LinalgExt] Fuse causal mask into attention decomposition" to "[TMTensor][LinalgExt] Fuse causal mask into attention decomposition" on Apr 2, 2026