whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-06-22 08:30:07 +00:00

Files

Jeff Bolz a753a82462 vulkan: get the first command buffer submitted sooner (llama/10499)

This is an incremental improvement over #9118 to get work to the GPU a bit
sooner. The first part is to start with a smaller number of nodes before
the first submit, and ramp it up to the current 100 nodes/submit. The
second part is to reduce the dryrun overhead for all the nodes that just
need to request descriptor space.

With these changes I get around 1-2% speedup on RTX 4070 combined with my
old Haswell-era CPU.

2024-12-08 20:14:35 +02:00
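
The ramp-up described in the commit message above (a small first submit, growing to the steady 100 nodes/submit) could be sketched roughly as follows. This is only an illustration, not the code from the commit: the function name `plan_submits`, the initial batch size of 10, and the doubling schedule are all assumptions. Only the 100-node cap is taken from the message.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (not part of ggml): returns, for a graph of n_nodes,
// the node counts at which a command buffer would be submitted.
// ASSUMPTIONS: initial_batch = 10 and the doubling ramp are illustrative;
// only max_batch = 100 comes from the commit message.
std::vector<int> plan_submits(int n_nodes, int initial_batch = 10, int max_batch = 100) {
    std::vector<int> submit_points;
    int nodes_per_submit = initial_batch;  // small first batch so the GPU starts early
    int pending = 0;                       // nodes recorded since the last submit

    for (int i = 0; i < n_nodes; ++i) {
        ++pending;
        const bool is_last = (i == n_nodes - 1);
        if (pending >= nodes_per_submit || is_last) {
            submit_points.push_back(i + 1);  // submit after node i (1-based count)
            pending = 0;
            // Ramp up toward the steady-state batch size.
            nodes_per_submit = std::min(nodes_per_submit * 2, max_batch);
        }
    }
    return submit_points;
}
```

Under these assumptions, a 250-node graph would be submitted after nodes 10, 30, 70, 150 and 250, so the GPU receives its first work after 10 nodes instead of 100, while later batches still amortize submit overhead.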

include

ggml-cpu: support IQ4_NL_4_4 by runtime repack (llama/10541)

2024-12-08 20:14:35 +02:00

src

vulkan: get the first command buffer submitted sooner (llama/10499)

2024-12-08 20:14:35 +02:00

.gitignore

whisper : reorganize source code + improve CMake (#2256)

2024-06-26 19:34:09 +03:00

CMakeLists.txt

ggml : add support for dynamic loading of backends (llama/10469)

2024-12-08 20:14:35 +02:00