Johannes Gäßler 4be936b88b CUDA: generalize FP16 fattn vec kernel (llama/7061)
* CUDA: generalize FP16 fattn vec kernel

* disable unsupported head sizes for AMD in test

* try AMD fix

* fix batch size 2-8

* partially revert changes
2024-05-13 11:02:26 +03:00