Johannes Gäßler
|
e57e95eb0d
|
CUDA: add FP32 FlashAttention vector kernel (llama/7188)
* CUDA: add FP32 FlashAttention vector kernel
* fixup! CUDA: add FP32 FlashAttention vector kernel
* fixup! fixup! CUDA: add FP32 FlashAttention vector kernel
* fixup! fixup! fixup! CUDA: add FP32 FlashAttention vector kernel
|
2024-05-14 19:16:29 +03:00 |
|