Johannes Gäßler
5582039d0a
CUDA: quantized KV support for FA vec (llama/7527)
* CUDA: quantized KV support for FA vec
* try CI fix
* fix commented-out kernel variants
* add q8_0 q4_0 tests
* fix nwarps > batch size
* split fattn compile via extern templates
* fix flake8
* fix metal tests
* fix cmake
* make generate_cu_files.py executable
* add autogenerated .cu files
* fix AMD
* error if type_v != FP16 and not flash_attn
* remove obsolete code
2024-06-16 18:19:48 +03:00