Johannes Gäßler e57e95eb0d
CUDA: add FP32 FlashAttention vector kernel (llama/7188)
* CUDA: add FP32 FlashAttention vector kernel

* fixup! CUDA: add FP32 FlashAttention vector kernel

* fixup! fixup! CUDA: add FP32 FlashAttention vector kernel

* fixup! fixup! fixup! CUDA: add FP32 FlashAttention vector kernel
2024-05-14 19:16:29 +03:00
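
The commit only names the feature, so as context: a "FlashAttention vector kernel" computes attention for a single query vector against all keys/values using the online-softmax streaming update, never materializing the full score row. Below is a minimal, hypothetical CUDA FP32 sketch of that idea; it is not the ggml/llama.cpp kernel from this commit, and the kernel name, HEAD_DIM, and launch shape are assumptions for illustration only.

```cuda
// Hypothetical sketch (NOT the actual llama.cpp/ggml kernel): one thread block
// computes attention for a single FP32 query vector against N keys/values,
// using the FlashAttention-style online-softmax update so scores are never
// written to global memory.
#include <cuda_runtime.h>
#include <cmath>

#define HEAD_DIM 128  // assumed head size; launch with one thread per dimension

__global__ void flash_attn_vec_f32(
        const float * __restrict__ q,   // [HEAD_DIM]
        const float * __restrict__ K,   // [N, HEAD_DIM]
        const float * __restrict__ V,   // [N, HEAD_DIM]
        float       * __restrict__ o,   // [HEAD_DIM]
        int N, float scale) {
    const int d = threadIdx.x;          // this thread's dimension index

    __shared__ float s_partial[HEAD_DIM];

    float acc = 0.0f;                   // running weighted sum of V[:, d]
    float m   = -INFINITY;              // running max of scaled scores
    float l   = 0.0f;                   // running sum of exp(score - m)

    const float qd = q[d];

    for (int i = 0; i < N; ++i) {
        // dot(q, K[i]) via a shared-memory tree reduction over the head dim
        s_partial[d] = qd * K[i*HEAD_DIM + d];
        __syncthreads();
        for (int stride = HEAD_DIM/2; stride > 0; stride >>= 1) {
            if (d < stride) s_partial[d] += s_partial[d + stride];
            __syncthreads();
        }
        const float s = s_partial[0] * scale;
        __syncthreads();                // reuse s_partial safely next iteration

        // online-softmax update: rescale old terms whenever the max grows
        const float m_new = fmaxf(m, s);
        const float alpha = expf(m - m_new);   // rescale factor for old terms
        const float p     = expf(s - m_new);   // weight of the current key

        acc = acc*alpha + p * V[i*HEAD_DIM + d];
        l   = l  *alpha + p;
        m   = m_new;
    }

    o[d] = acc / l;                     // final softmax normalization
}
```

A launch of the form `flash_attn_vec_f32<<<1, HEAD_DIM>>>(q, K, V, o, N, 1.0f/sqrtf(HEAD_DIM))` would cover one query of one head; a production kernel batches heads and sequence positions across blocks and tiles the key/value loads, which this sketch omits for brevity.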