Commit 16beb34
authored
[feature] arm: speed up exp_ps floor step on aarch64 (#6657)
Summary:
Use vrndmq_f32 for floor computation in exp_ps on aarch64 while keeping the legacy fallback path for non-aarch64 targets. This reduces the exp_ps hot-path cost on ARM without changing approximation behavior.1 parent e366f48 commit 16beb34
1 file changed
+4
-0
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
141 | 141 | | |
142 | 142 | | |
143 | 143 | | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
144 | 147 | | |
145 | 148 | | |
146 | 149 | | |
147 | 150 | | |
148 | 151 | | |
149 | 152 | | |
150 | 153 | | |
| 154 | + | |
151 | 155 | | |
152 | 156 | | |
153 | 157 | | |
| |||
0 commit comments