Commit Graph

41 Commits

Author SHA1 Message Date
Georgi Gerganov
59a3d0cb57
ggml : sync (ggml-alloc, GPU, eps, etc.) (#1220)
* ggml : sync (ggml-alloc, GPU, eps, etc.)

* ggml : fix build

* wasm : fix build
2023-09-05 13:54:40 +03:00
Georgi Gerganov
429b9785c0
ggml : update WASM SIMD 2023-05-20 20:00:06 +03:00
Georgi Gerganov
a5defbc1b9
release : v1.4.2 2023-05-14 19:06:45 +03:00
Georgi Gerganov
9c61f5f585
release : v1.4.1 2023-04-30 22:57:42 +03:00
Georgi Gerganov
fa8dbdc888
release : v1.4.0 2023-04-30 19:23:37 +03:00
Georgi Gerganov
794b162a46
whisper : add integer quantization support (#540)
* whisper : add integer quantization support

* examples : add common-ggml + prepare to add "quantize" tool

* whisper : quantization tool ready

* whisper : fix F32 support

* whisper : try to fix shared lib linkage

* wasm : update quantized models to Q5

* bench.wasm : remove "medium" button

* bench.wasm : fix custom model button

* ggml : add Q5_0 and Q5_1 WASM SIMD

* wasm : add quantized models to all WASM examples

* wasm : bump DB version number to 2

* talk-llama : update example to latest llama.cpp

* node : increase test timeout to 10s

* readme : add information for model quantization

* wasm : add links to other examples
2023-04-30 18:51:57 +03:00
Georgi Gerganov
1f30b99208
ggml : fix WASM build 2023-04-29 20:21:25 +03:00
Georgi Gerganov
c23588cc4b
release : v1.3.0 2023-04-15 17:30:44 +03:00
Georgi Gerganov
ebef1e8620
ggml : fix WASM build 2023-04-10 23:18:29 +03:00
Georgi Gerganov
ad1389003d
release : v1.2.1 2023-02-28 22:29:12 +02:00
Georgi Gerganov
09d7d2b68e
examples : refactor in order to reuse code and reduce duplication (#482)
* examples : refactor common code into a library

* examples : refactor common SDL code into a library

* make : update Makefile to use common libs

* common : fix MSVC M_PI ..

* addon.node : link common lib
2023-02-15 19:28:10 +02:00
Georgi Gerganov
b2083c5d02
release : v1.2.0 2023-02-04 09:49:49 +02:00
Georgi Gerganov
f3ee4a9673
whisper : reduce memory usage during inference (#431)
* ggml : add "scratch" buffer support

* ggml : support for scratch ring-buffer

* ggml : bug fix in ggml_repeat()

* ggml : error on scratch buffer overflow

* whisper : use scratch buffers during inference (base model only)

* whisper : update memory usage for all models

* whisper : fix encoder memory usage

* whisper : use whisper_context functions instead of macros

* whisper : fix FF + remove it from README

* ggml : reuse ggml_new_i32

* ggml : refactor the scratch buffer storage

* whisper : reorder scratch buffers in the decoder

* main : add option to disable temp fallback

* Update README.md
2023-02-04 09:45:52 +02:00
Georgi Gerganov
60337f5306
wasm : check if navigator.storage.estimate() is available
Safari does not support it
2023-01-25 20:00:59 +02:00
Georgi Gerganov
2c3f50a021
release : v1.1.1 2023-01-23 20:23:44 +02:00
Georgi Gerganov
206fc93396
whisper.wasm : add small and small.en models 2023-01-18 21:58:55 +02:00
Georgi Gerganov
8738427dd6
cmake : bump version to 1.1.0 2023-01-15 14:33:13 +02:00
Georgi Gerganov
fafd78945d
bench.wasm : print system info 2023-01-15 11:34:03 +02:00
Syahmi Azhar
1512545149
whisper : add loader class to allow loading from buffer and others (#353)
* whisper : add loader to allow loading from other than file

* whisper : rename whisper_init to whisper_init_from_file

* whisper : add whisper_init_from_buffer

* android : Delete local.properties

* android : load models directly from assets

* whisper : adding <stddef.h> needed for size_t + code style

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-01-08 13:03:33 +02:00
Georgi Gerganov
44efbf7ff1
cmake : add -Wno-unused-function + update whisper.js 2023-01-07 20:18:34 +02:00
Georgi Gerganov
87dd4a3081
talk.wasm : bump memory usage + update whisper.js 2023-01-06 21:13:44 +02:00
Georgi Gerganov
4a214d2f07
cmake : add CMAKE_RUNTIME_OUTPUT_DIRECTORY
Currently needed by the wasm examples
2023-01-05 21:40:59 +02:00
Georgi Gerganov
1d716d6e34
release : v1.0.4 2022-12-17 19:52:42 +02:00
Georgi Gerganov
930c693989
release : v1.0.3
Fixed whisper.spm tests
2022-12-12 20:36:52 +02:00
Georgi Gerganov
d8a0dde31a
Update README.md 2022-12-12 20:33:09 +02:00
Georgi Gerganov
9e3e6f253a
release : v1.0.2 2022-12-12 20:29:30 +02:00
Georgi Gerganov
57ccd7cc4f
Update README.md 2022-12-12 20:23:10 +02:00
Georgi Gerganov
f309f97df6
Node.js package (#260)
* npm : preparing infra for node package

* npm : package infra ready

* npm : initial version ready

* npm : change name to whisper.cpp

whisper.js is taken
2022-12-12 20:17:27 +02:00
Georgi Gerganov
fcf515de60
bench.wasm : same as "bench" but runs in the browser (#89) 2022-12-11 11:09:10 +02:00
Georgi Gerganov
be16dfa038
whisper.wasm : do not block page while processing (close #86) 2022-11-25 23:07:42 +02:00
Georgi Gerganov
b8ce25dec1
refactoring : more readable code 2022-11-25 19:28:04 +02:00
Georgi Gerganov
abce28ea99
talk.wasm : move to https://whisper.ggerganov.com/talk
This way, we can share the same models across different WASM examples
and not have to download them for each page
2022-11-24 18:24:06 +02:00
Georgi Gerganov
be3b720f96
talk.wasm : refactoring + update README.md 2022-11-24 00:08:57 +02:00
Georgi Gerganov
9aea96f774
talk.wasm : polishing + adding many AI personalities 2022-11-22 20:10:20 +02:00
Georgi Gerganov
a4dfbeecf9
talk.wasm : GPT-2 meets Whisper in WebAssembly (#155)
* talk : initial real-time transcription in the browser

* talk : polishing the UI

* talk : ready for beta testing

* talk.wasm : rename example
2022-11-21 22:20:42 +02:00
Georgi Gerganov
b21213c23e
js : update whipser.js to latest 2022-11-09 19:33:10 +02:00
Georgi Gerganov
69bdb6624a
minor : update whisper.js 2022-10-29 21:28:21 +03:00
Georgi Gerganov
12fb303d9d
whisper.wasm : update system info print 2022-10-29 20:32:41 +03:00
Georgi Gerganov
491ecd7056 wip : polishing WASM example 2022-10-22 18:54:01 +03:00
Georgi Gerganov
db460b78ff wip : WASM 128-bit SIMD support 2022-10-22 18:54:01 +03:00
Georgi Gerganov
e905c6f827 wip : initial WASM port
Works but it is very slow because no SIMD is used.
For example, jfk.wav is processed in ~23 seconds using "tiny.en" model
2022-10-22 18:54:01 +03:00