Add GPU-side Gumbel-max sampling for CUDA graph compatibility #500
| Job | Run time |
|---|---|
| | 12m 48s |
| | 34m 59s |
| | 10m 9s |
| | 29m 38s |
| | 35m 5s |
| | 8m 43s |
| | 1h 3m 21s |
| | 12m 19s |
| | 10m 0s |
| | 10m 26s |
| | 11m 45s |
| | 11m 7s |
| | 10m 32s |
| | 10m 33s |
| | 10m 19s |
| | 10m 4s |
| | 10m 19s |
| | 10m 1s |
| | 10m 39s |
| | 11m 50s |
| | 9m 50s |
| Total | 5h 44m 27s |