whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-01-11 07:23:12 +00:00

Author	SHA1	Message	Date
Johannes Gäßler	7483d2b61c	CUDA: use tensor cores for MMQ (llama/7676) * CUDA: int8 tensor cores for MMQ (legacy quants) * fix out-of-bounds writes * __builtin_assume -> GGML_CUDA_ASSUME * fix writeback returning too early	2024-06-16 18:19:48 +03:00
Johannes Gäßler	760497e1ab	CUDA: revise q8_1 data layout for mul_mat_q (llama/7824)	2024-06-16 18:19:48 +03:00
Johannes Gäßler	e08c62149b	CUDA: refactor mmq, dmmv, mmvq (llama/7716) * CUDA: refactor mmq, dmmv, mmvq * fix out-of-bounds write * struct for qk, qr, qi * fix cmake build * mmq_type_traits	2024-06-16 18:19:48 +03:00
Georgi Gerganov	2948c740a2	sync : ggml (#2001) * sync : update scripts * sync : ggml * talk-llama : sync llama.cpp * make : WHISPER_CUBLAS -> WHISPER_CUDA * ci : try to fix sycl build * talk-llama : fix make build	2024-03-27 18:55:10 +02:00