2948c740a2 | 2024-03-27 18:55:10 +02:00 | Georgi Gerganov
sync : ggml (#2001)
* sync : update scripts
* sync : ggml
* talk-llama : sync llama.cpp
* make : WHISPER_CUBLAS -> WHISPER_CUDA
* ci : try to fix sycl build
* talk-llama : fix make build

f60ccfd83b | 2024-03-15 14:01:14 +02:00 | slaren
update examples and tests

1711bb3881 | 2024-02-28 13:00:30 +02:00 | Georgi Gerganov
sync : llama.cpp (ggml/0)

578e47e70c | 2024-02-25 19:58:46 +02:00 | Georgi Gerganov
sync : llama.cpp (ggml/0)

ce411498f6 | 2024-02-22 15:12:36 +02:00 | Georgi Gerganov
sync : llama.cpp (ggml/0)
ggml-ci

83afebe872 | 2024-02-19 15:53:25 +02:00 | Georgi Gerganov
common : add IQ1_S (ggml/0)
ggml-ci

7a74e929c8 | 2024-01-30 21:30:26 +02:00 | Georgi Gerganov
sync : ggml (#0)

d08445c9ad | 2024-01-14 10:55:18 +02:00 | Georgi Gerganov
sync : ggml

32e71a1861 | 2024-01-11 21:54:17 +02:00 | Georgi Gerganov
sync : ggml

9c857cf280 | 2024-01-11 21:50:01 +02:00 | Georgi Gerganov
sync : llama.cpp

bebf0da983 | 2023-11-16 16:18:24 +02:00 | Georgi Gerganov
quantize : add support for K-quant types

5feb0dffba | 2023-06-25 14:30:44 +03:00 | Georgi Gerganov
ggml : sync latest ggml lib

e693074aa6 | 2023-05-14 18:04:23 +03:00 | Georgi Gerganov
ggml : sync latest ggml
- New Q4 and Q5 formats
- Various improvements

c94c469592 | 2023-04-30 22:50:04 +03:00 | Georgi Gerganov
whisper : fix quantize bug (#842)
* whisper : debug
* whisper : fix bug during quantization

794b162a46 | 2023-04-30 18:51:57 +03:00 | Georgi Gerganov
whisper : add integer quantization support (#540)
* whisper : add integer quantization support
* examples : add common-ggml + prepare to add "quantize" tool
* whisper : quantization tool ready
* whisper : fix F32 support
* whisper : try to fix shared lib linkage
* wasm : update quantized models to Q5
* bench.wasm : remove "medium" button
* bench.wasm : fix custom model button
* ggml : add Q5_0 and Q5_1 WASM SIMD
* wasm : add quantized models to all WASM examples
* wasm : bump DB version number to 2
* talk-llama : update example to latest llama.cpp
* node : increase test timeout to 10s
* readme : add information for model quantization
* wasm : add links to other examples