Johannes Gäßler b17ba2815b CUDA: faster q2_K, q3_K MMQ + int8 tensor cores (llama/7921)
* CUDA: faster q2_K, q3_K MMQ + int8 tensor cores

* try CI fix

* try CI fix

* try CI fix

* fix data race

* revert q2_K precision related changes
2024-06-16 18:19:48 +03:00