Commit 47455bd
Fix Flash Attention 3 interface for new FA3 return format (#13173)
* Fix Flash Attention 3 interface compatibility for new FA3 versions
Newer versions of flash-attn (after Dao-AILab/flash-attention@ed20940)
no longer return the LSE tensor by default from flash_attn_3_func. The
function now returns just the output tensor unless return_attn_probs=True
is passed.
Updated _wrapped_flash_attn_3 and _flash_varlen_attention_3 to pass
return_attn_probs and handle both old (always tuple) and new (tensor
or tuple) return formats gracefully.
Fixes #12022
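The graceful handling described above can be sketched as follows. This is a hedged illustration, not the actual diffusers code: `flash_attn_3_stub` is a hypothetical stand-in for `flash_attn_3_func`, and only the unpacking logic mirrors the commit message.

```python
def flash_attn_3_stub(new_format):
    """Hypothetical stand-in for flash_attn_3_func: old builds always
    return a tuple (out, lse); newer builds return only the output
    tensor unless attention probs are requested."""
    out, lse = "out", "lse"
    return out if new_format else (out, lse)

def unpack_fa3(result):
    # Handle both old (always tuple) and new (bare tensor) return formats.
    if isinstance(result, tuple):
        return result[0]
    return result

# Both the old and the new return shape yield the same output tensor.
assert unpack_fa3(flash_attn_3_stub(new_format=False)) == "out"
assert unpack_fa3(flash_attn_3_stub(new_format=True)) == "out"
```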
* Simplify _wrapped_flash_attn_3 return unpacking
Since return_attn_probs=True is always passed, the result is
guaranteed to be a tuple. Remove the unnecessary isinstance guard.

1 parent 97c2c6e
1 file changed: +10 −2 lines changed
|---|---|---|---|
| |||
733 | 733 | | |
734 | 734 | | |
735 | 735 | | |
736 | | - | |
| 736 | + | |
737 | 737 | | |
738 | 738 | | |
739 | 739 | | |
| |||
750 | 750 | | |
751 | 751 | | |
752 | 752 | | |
| 753 | + | |
753 | 754 | | |
| 755 | + | |
754 | 756 | | |
755 | 757 | | |
756 | 758 | | |
| |||
2701 | 2703 | | |
2702 | 2704 | | |
2703 | 2705 | | |
2704 | | - | |
| 2706 | + | |
2705 | 2707 | | |
2706 | 2708 | | |
2707 | 2709 | | |
| |||
2711 | 2713 | | |
2712 | 2714 | | |
2713 | 2715 | | |
| 2716 | + | |
2714 | 2717 | | |
| 2718 | + | |
| 2719 | + | |
| 2720 | + | |
| 2721 | + | |
| 2722 | + | |
2715 | 2723 | | |
2716 | 2724 | | |
2717 | 2725 | | |
| |||