[TMTensor][LinalgExt] Fuse causal mask into attention decomposition by keshavvinayak01 · Pull Request #23999 · iree-org/iree

keshavvinayak01 · 2026-04-02T19:39:58Z

Add is_causal attribute to AttentionOp and OnlineAttentionOp. When set, the causal condition (k2 > m) is computed inline from loop indices during decomposition, fused into the max-reduction and exp steps. This eliminates the need for a materialized N×N mask tensor.
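To illustrate the idea, here is a minimal pure-Python sketch, not the IREE decomposition itself. It assumes `m` indexes queries and `k2` indexes keys, that positions with `k2 > m` are masked out, and it omits the softmax scale factor. The function names `attention_materialized_mask` and `attention_fused_causal` are hypothetical and exist only for this comparison.

```python
import math


def _dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def attention_materialized_mask(q, k, v):
    """Reference path: build the full mask tensor, then softmax(Q K^T + mask) V."""
    m_len, k2_len, n_len = len(q), len(k), len(v[0])
    # The explicit mask that the is_causal attribute is meant to avoid.
    mask = [[-math.inf if k2 > m else 0.0 for k2 in range(k2_len)]
            for m in range(m_len)]
    out = []
    for m in range(m_len):
        scores = [_dot(q[m], k[k2]) + mask[m][k2] for k2 in range(k2_len)]
        row_max = max(scores)
        weights = [math.exp(s - row_max) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[k2][n] for k2, w in enumerate(weights)) / total
                    for n in range(n_len)])
    return out


def attention_fused_causal(q, k, v):
    """Fused path: the k2 > m test is evaluated from the loop indices."""
    n_len = len(v[0])
    out = []
    for m in range(len(q)):
        # Max-reduction step: masked positions never contribute to the max.
        row_max = -math.inf
        for k2 in range(len(k)):
            if k2 > m:
                continue
            row_max = max(row_max, _dot(q[m], k[k2]))
        # Exp step: masked positions contribute a weight of exactly zero.
        total = 0.0
        acc = [0.0] * n_len
        for k2 in range(len(k)):
            w = 0.0 if k2 > m else math.exp(_dot(q[m], k[k2]) - row_max)
            total += w
            for n in range(n_len):
                acc[n] += w * v[k2][n]
        out.append([a / total for a in acc])
    return out
```

Both functions should agree on the same inputs. The fused version never allocates the `mask` list, which is the saving this PR describes.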

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Keshav Vinayak Jha <keshavvinayakjha@gmail.com>

keshavvinayak01 mentioned this pull request Apr 2, 2026

[TMTensor] Remove Materialization of causal_masks; add attribute instead llvm/torch-mlir#4520

Draft

keshavvinayak01 changed the title ~~[LinalgExt] Fuse causal mask into attention decomposition~~ [TMTensor][LinalgExt] Fuse causal mask into attention decomposition Apr 2, 2026


keshavvinayak01 wants to merge 1 commit into iree-org:main from keshavvinayak01:users/keshavvinayak01/causal-mask-fusing
