Pull requests: Dao-AILab/flash-attention
Open pull requests:
[Cute,Flex,Fwd] Allow vectorized score_mod definitions (#2236), opened Feb 5, 2026 by reubenconducts
[ROCM] Add Infinity Cache (LLC) awareness for performance improvement [PR #2147 rebased on PR #2178] (#2217), opened Jan 29, 2026 by tianwyan
Add shift scheduler for deterministic full-mask FA3 bwd on Hopper (sm90) (#2207), opened Jan 23, 2026 by tie-pilot-qxw
Fix compute_block_sparsity import in benchmark_mask_mod (#2190), opened Jan 17, 2026 by blueberrycongee
[Cute,Fwd,Sm100] Support irregular qhead / kvhead ratios (#2186, draft), opened Jan 16, 2026 by timmy-feng
Update mha_fwd.cpp: normalize the commented-out parameters (#2160), opened Jan 9, 2026 by breakfei
Add FLASH_ATTENTION_FORCE_NON_STABLE_API option to allow building on the NVIDIA PyTorch 25.09 image (#2140), opened Jan 5, 2026 by jp-gr
[ROCM] Fix AMD Triton backend crash when dropout != 0 and return_attn_probs = False (#2111), opened Dec 30, 2025 by Logiquo