whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-05-05 02:02:52 +00:00

Georgi Gerganov ab36d02560 metal : support permuted matrix multiplications (llama/10033)

* metal : support permuted matrix multiplications

ggml-ci

* cont : use nb01 directly for row steps

ggml-ci

* cont : add comments [no ci]

* metal : minor refactor

* metal : minor

2024-11-01 10:19:05 +02:00

ggml-cann

cann: fix crash when llama-bench is running on multiple cann devices (llama/9627)

2024-10-03 12:22:17 +03:00

ggml-cuda

CUDA: fix MMQ for non-contiguous src0, add tests (llama/10021)

2024-11-01 10:19:05 +02:00

ggml-sycl

fix mul_mat_vec_q and *_vec_q error (llama/9939)

2024-11-01 10:19:05 +02:00

kompute-shaders

whisper : reorganize source code + improve CMake (#2256)

2024-06-26 19:34:09 +03:00

vulkan-shaders

vulkan : argsort barriers must be under uniform control flow (ggml/951)

2024-10-03 12:22:17 +03:00

CMakeLists.txt

add amx kernel for gemm (llama/8998)

2024-11-01 10:19:05 +02:00

ggml-aarch64.c

ggml : add run-time detection of neon, i8mm and sve (llama/9331)

2024-10-03 12:22:17 +03:00

ggml-aarch64.h

ggml : add ggml-aarch64 (ggml/0)

2024-08-08 22:48:46 +03:00

ggml-alloc.c

ggml : move more prints to the ggml log system (llama/9839)

2024-11-01 10:19:05 +02:00

ggml-backend-impl.h

ggml : add backend registry / device interfaces to BLAS backend (llama/9752)

2024-11-01 10:19:05 +02:00

ggml-backend.cpp

Adapt to dynamically loadable backends mechanism (llama/9970)

2024-11-01 10:19:05 +02:00

ggml-blas.cpp

ggml : move more prints to the ggml log system (llama/9839)

2024-11-01 10:19:05 +02:00

ggml-cann.cpp

Adapt to dynamically loadable backends mechanism (llama/9970)

2024-11-01 10:19:05 +02:00

ggml-common.h

ggml-quants : ternary packing for TriLMs and BitNet b1.58 (llama/8151)

2024-09-24 19:45:08 +03:00

ggml-cpu-impl.h

ggml : add ggml-cpu-impl.h (skip) (#0)

2024-09-24 19:45:08 +03:00

ggml-cuda.cu

CUDA: fix insufficient buffer clearing for MMQ (llama/10032)

2024-11-01 10:19:05 +02:00

ggml-impl.h

fix: use vm_allocate to allocate CPU backend buffer on macOS (llama/9875)

2024-11-01 10:19:05 +02:00

ggml-kompute.cpp

ggml-backend : add device and backend reg interfaces (llama/9707)

2024-10-05 15:23:51 +03:00

ggml-metal.m

metal : support permuted matrix multiplications (llama/10033)

2024-11-01 10:19:05 +02:00

ggml-metal.metal

metal : support permuted matrix multiplications (llama/10033)

2024-11-01 10:19:05 +02:00

ggml-quants.c

ggml : add run-time detection of neon, i8mm and sve (llama/9331)

2024-10-03 12:22:17 +03:00

ggml-quants.h

ggml : add run-time detection of neon, i8mm and sve (llama/9331)

2024-10-03 12:22:17 +03:00

ggml-rpc.cpp

rpc : pack only RPC structs (llama/9959)

2024-11-01 10:19:05 +02:00

ggml-sycl.cpp

Add SYCL Backend registry, device and Event Interfaces (llama/9705)

2024-11-01 10:19:05 +02:00

ggml-vulkan.cpp

vulkan : add backend registry / device interfaces (llama/9721)

2024-11-01 10:19:05 +02:00

ggml.c

CUDA: fix MMQ for non-contiguous src0, add tests (llama/10021)

2024-11-01 10:19:05 +02:00

sgemm.cpp

whisper : reorganize source code + improve CMake (#2256)

2024-06-26 19:34:09 +03:00

sgemm.h

whisper : reorganize source code + improve CMake (#2256)

2024-06-26 19:34:09 +03:00