Georgi Gerganov
0f619b52ce
main : add stereo-channel-based diarization ( #64 )
...
Not tested - I don't have stereo dialog audio
2022-11-25 22:08:58 +02:00
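Note on 0f619b52ce: one way stereo-channel-based diarization can work, as a minimal sketch, is to compare the energy of the two channels over a segment and attribute the segment to the louder channel. The commit message does not describe the actual method, so the function name, the 1.1 margin, and the "?" fallback below are all assumptions for illustration.

```python
def guess_speaker(left, right, margin=1.1):
    """Guess which stereo channel is speaking during one segment.

    left, right: sequences of float samples for the same time span.
    margin: hypothetical factor by which one channel must dominate.
    Returns "0" (left), "1" (right), or "?" when neither clearly dominates.
    """
    # Use the sum of absolute sample values as a cheap energy estimate.
    energy_left = sum(abs(s) for s in left)
    energy_right = sum(abs(s) for s in right)

    if energy_left > margin * energy_right:
        return "0"
    if energy_right > margin * energy_left:
        return "1"
    return "?"


if __name__ == "__main__":
    # Left channel is clearly louder in this made-up segment.
    print(guess_speaker([0.5, -0.4, 0.3], [0.01, -0.02, 0.01]))
```

The margin keeps near-equal channels (cross-talk, silence) from being assigned to either speaker.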
Georgi Gerganov
1246dd023e
command : add demonstration video
2022-11-25 20:23:58 +02:00
Georgi Gerganov
0be27bbd92
command : fix build + fix README + add bold printing
2022-11-25 19:53:50 +02:00
Georgi Gerganov
bc88eb13c6
examples : add "command" tool ( #171 )
2022-11-25 19:36:57 +02:00
Georgi Gerganov
b8ce25dec1
refactoring : more readable code
2022-11-25 19:28:04 +02:00
vicalloy
fd113687aa
correct model name display on running samples
2022-11-25 07:17:02 +02:00
Georgi Gerganov
e4805d9601
wasm : refactor wasm example + reuse fetch mechanism
2022-11-24 23:13:26 +02:00
Georgi Gerganov
ff36415a86
talk.wasm : update video link + some minor fixes
2022-11-24 20:15:24 +02:00
Georgi Gerganov
025ff465b6
Update README.md
...
Use a less cringy video to demo talk.wasm lol
2022-11-24 20:09:45 +02:00
Georgi Gerganov
2c0501b38a
Update README.md
2022-11-24 20:06:51 +02:00
Georgi Gerganov
abce28ea99
talk.wasm : move to https://whisper.ggerganov.com/talk
...
This way, we can share the same models across different WASM examples
and not have to download them for each page
2022-11-24 18:24:06 +02:00
Georgi Gerganov
a2ecd54455
models : add instructions for using HF fine-tuned models
2022-11-24 17:54:41 +02:00
Georgi Gerganov
128aaadb93
whisper : improve printfs
2022-11-24 17:54:16 +02:00
Georgi Gerganov
454b91de16
main : fix dangling pointer when using stdin for input ( #65 )
2022-11-24 17:53:51 +02:00
Georgi Gerganov
d7024cf9dc
main, stream : remove --verbose flag ( #178 )
2022-11-24 17:52:04 +02:00
Georgi Gerganov
37422ed733
talk.wasm : add audio pre-processing + bump memory
2022-11-24 00:34:00 +02:00
Georgi Gerganov
be3b720f96
talk.wasm : refactoring + update README.md
2022-11-24 00:08:57 +02:00
Georgi Gerganov
00f46dbc1d
models : add usage comments to the HF convert script ( #157 )
2022-11-23 23:22:40 +02:00
Georgi Gerganov
5698bddbc9
models : fix HF fine-tuned model conversion script ( #157 )
...
It works now
2022-11-23 23:14:11 +02:00
Georgi Gerganov
388e9f79ad
ggml : fix the fix
2022-11-23 22:40:06 +02:00
Georgi Gerganov
35cd29ce1f
ggml : fix cross-compile Linux -> Windows with mingw ( #168 )
2022-11-23 22:28:41 +02:00
Georgi Gerganov
a156a358ca
Revert "update README.md"
...
This reverts commit 6a84147113.
2022-11-23 22:16:50 +02:00
katsu560
6a84147113
update README.md
2022-11-23 22:16:33 +02:00
katsu560
804f36aa2c
ggml: change inline ggml_fp16_to_fp32, ggml_fp16_t ggml_fp32_to_fp16
2022-11-23 22:16:33 +02:00
katsu560
4b2f51b479
add gprof option
2022-11-23 22:16:33 +02:00
katsu560
800ae5b808
fix AVX,AVX2,FMA,F16C detection on Linux and add flags for OpenBLAS
2022-11-23 22:16:33 +02:00
katsu560
83456076f0
add AVX support
2022-11-23 22:16:33 +02:00
Tamotsu Takahashi
3df6c14fca
Build with OpenBLAS and SDL2 on windows
2022-11-23 22:09:54 +02:00
Georgi Gerganov
d64d6ca3fd
models : minor changes to the HF convert script ( #157 )
2022-11-23 22:07:20 +02:00
Georgi Gerganov
93482d0373
models : add "convert-h5-to-ggml.py" script ( #157 )
...
Converts transformers models to ggml.
Although the conversion is successful, it does not work for some reason.
Not sure why
2022-11-23 17:19:22 +02:00
Georgi Gerganov
49706a658a
minor : update a few prints + fix buttons in whisper.wasm
2022-11-23 17:19:21 +02:00
Georgi Gerganov
363a2dadec
Update README.md
2022-11-23 09:53:55 +02:00
Georgi Gerganov
623a486056
Update README.md
2022-11-23 09:52:36 +02:00
Tamotsu Takahashi
2f596f5b33
Find libopenblas.dll.a on windows
...
"lib" is needed for windows.
With this change, you can build whisper.cpp with OpenBLAS's prebuilt DLL.
1. extract a zip from https://github.com/xianyi/OpenBLAS/releases
2. copy the headers in (openblas)/include to the root directory of whisper.cpp
3. invoke cmake with -DCMAKE_LIBRARY_PATH=(openblas)\lib -DWHISPER_SUPPORT_OPENBLAS=ON
4. copy (openblas)/bin/libopenblas.dll to the same directory as whisper.dll after msbuild
https://github.com/ggerganov/whisper.cpp/issues/89#issuecomment-1324391258
2022-11-23 08:26:45 +02:00
Georgi Gerganov
e5dcdabbb8
unicode : fix character replacement (thanks to @tamo)
2022-11-23 08:24:29 +02:00
Georgi Gerganov
dad109c3f1
close #109 : add fetching of the model over HTTP (whisper.wasm)
2022-11-22 22:48:56 +02:00
Georgi Gerganov
326573de9a
talk.wasm : final touches
2022-11-22 22:22:17 +02:00
Georgi Gerganov
9aea96f774
talk.wasm : polishing + adding many AI personalities
2022-11-22 20:10:20 +02:00
Georgi Gerganov
385236d1d3
stream : "-kc" now enables context keeping from previous segment ( #90 )
...
By default, the context keeping is disabled
2022-11-22 18:21:15 +02:00
M. Eren Akbiyik
63ae03b8e0
Prompt previous tokens for streaming ( #163 )
...
* feat: prompt previous tokens for streaming
I used a vector pointer instead of vector itself because it gave weird errors, and why not
* convert vector to use with C api
* feat: remove old refs, check for prompt size
* feat: use better way of getting the pointer
2022-11-22 18:10:35 +02:00
Georgi Gerganov
78116f8eda
talk.wasm : update README.md
2022-11-21 22:42:29 +02:00
Georgi Gerganov
a4dfbeecf9
talk.wasm : GPT-2 meets Whisper in WebAssembly ( #155 )
...
* talk : initial real-time transcription in the browser
* talk : polishing the UI
* talk : ready for beta testing
* talk.wasm : rename example
2022-11-21 22:20:42 +02:00
Georgi Gerganov
2e311a2917
Update README.md
2022-11-21 18:52:20 +02:00
Georgi Gerganov
2065572a11
ggml : fix Windows build
2022-11-20 22:47:03 +02:00
Georgi Gerganov
5c2176e314
ci : add Windows build
2022-11-20 22:47:03 +02:00
Georgi Gerganov
f2df9bd768
stream : add "max_tokens" cli arg
...
Controls the max tokens per segment for the stream example
2022-11-20 21:22:41 +02:00
Georgi Gerganov
fb8d77f760
stream : add "audio_ctx" parameter
...
Used to overwrite the audio context size of the Encoder.
For example, setting "audio_ctx = 512" will make it run about 3 times
faster, processing about 10s of audio, instead of 30s.
The transcription quality drops, but this can be used for real-time
streaming purposes where performance is important.
2022-11-20 21:22:41 +02:00
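Note on fb8d77f760: the "about 10s of audio, instead of 30s" figure can be checked with a short calculation. This sketch assumes the stock Whisper encoder layout of 1500 audio-context positions covering a 30 s window (about 20 ms per position); the constants and the helper name are assumptions, not taken from the commit. The roughly 3x speedup is the commit's own claim and is not measured here.

```python
# Assumed constants: the stock Whisper encoder uses 1500 positions for 30 s of audio.
FULL_AUDIO_CTX = 1500
FULL_WINDOW_SECONDS = 30.0


def seconds_covered(audio_ctx: int) -> float:
    """Approximate seconds of audio covered by a reduced encoder context."""
    if not 0 < audio_ctx <= FULL_AUDIO_CTX:
        raise ValueError(f"audio_ctx must be in 1..{FULL_AUDIO_CTX}")
    return audio_ctx * FULL_WINDOW_SECONDS / FULL_AUDIO_CTX


if __name__ == "__main__":
    # 512 positions out of 1500 is roughly a third of the full window.
    print(f"{seconds_covered(512):.2f} s")
```

Under these assumptions, 512 positions cover roughly a third of the full window, which is consistent with the "about 10s" in the commit message.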
Georgi Gerganov
62b5ff875c
stream : add "max_tokens" parameter
...
Used to limit the number of tokens in a segment.
Useful for combating word repetition when using a partial encoder context
2022-11-20 21:22:41 +02:00
Georgi Gerganov
d351771a4b
stream : add "single_segment" option
...
Force the entire audio chunk to be transcribed into a single segment
2022-11-20 21:22:41 +02:00
Georgi Gerganov
c058aaf22e
stream : partial encoder experiments
2022-11-20 21:22:41 +02:00