whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-06-01 23:10:46 +00:00

Eve 164f13c6a9 vulkan: scale caching for k quants + misc fixes (llama/11081)

* q6_k scale caching

* 16 bit unpack

* q4_k test (slow)

* revert it

* q3_k

* q2_k

* little stuff

* try precalculating products of a and q2_k scales

* Revert "try precalculating products of a and q2_k scales"

This reverts commit 65110b81f23f66331a50c6e889a7c1ab9470a86b.

* unpack should be u16, add vim swap to gitignore (about time)

* better q4_k scales

* q5_k

* better q6_k with separate paths for all threads and partial threads in use, plus some more optimizations

* q2_k better dequant

* q3_k optimizations

* q3_k use hmask simd from cpu avx version

* make the caches happy

* q3_k separate out calculation

* q2_k separate out

* little stuff

* use calc_superblock everywhere

* q2_k optimize scale calculation

* more barriers

2025-02-03 22:00:57 +02:00

include

RoPE: fix back, CUDA support for back + noncont. (llama/11240)

2025-02-03 22:00:57 +02:00

src

vulkan: scale caching for k quants + misc fixes (llama/11081)

2025-02-03 22:00:57 +02:00

.gitignore

whisper : reorganize source code + improve CMake (#2256)

2024-06-26 19:34:09 +03:00

CMakeLists.txt

fix: ggml: fix vulkan-shaders-gen build (llama/10448)

2025-02-03 22:00:57 +02:00