775 Commits

Author SHA1 Message Date
78d13257be Try to improve the token sampling strategy (#193)
* whisper : try to improve the token sampling strategy

- Add the "max_initial_timestamp" token logic from OpenAI
- Disallow sampling timestamps that are in the past

* whisper : fix the max initial timestamp logic + fallback decoding
2022-12-02 21:51:50 +02:00
9b7df68753 tests : adding transcription tests 2022-12-02 21:40:02 +02:00
061fc81bd6 ggml : remove inline specifier from fp16 <-> fp32 converters 2022-12-01 22:15:12 +02:00
57e0e6b700 livestream : handle ffmpeg errors gracefully and stabilize transcript 2022-12-01 20:49:09 +02:00
4f7363077f livestream : minor changes 2022-12-01 19:47:58 +02:00
093c840dee livestream : fix losing words across audio chunk (#195)
* improve livestream script

* Update examples/livestream.sh

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

Co-authored-by: Paul Edwards <paul.edwards@semiformal.net>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2022-12-01 19:18:22 +02:00
e7f09a0a61 Fix Darwin flags - was incorrectly always using the Linux else clause 2022-12-01 19:17:04 +02:00
4698dcdb52 whisper : add mechanism for aborting the whisper_full() computation 2022-11-27 20:42:45 +02:00
6fd5358dd0 Update README.md 2022-11-27 11:30:32 +02:00
164df0d447 whisper.objc : fix context + broken readme links 2022-11-27 10:52:27 +02:00
e266cb0723 whisper.objc : add real-time processing (#97)
Similar to the "stream" app
2022-11-26 18:32:46 +02:00
c207eed431 whisper.objc : fix build warnings 2022-11-26 16:27:04 +02:00
67e819baf4 minor : remove "examples/" prefix from the README 2022-11-26 13:07:54 +02:00
a425365b82 yt-wsp.sh : script to easily transcribe VODs
Thanks to @DaniruKun
ref: https://gist.github.com/DaniruKun/96f763ec1a037cc92fe1a059b643b818

Usage:

  cd whisper.cpp
  make

  ./examples/yt-wsp.sh <video-url>
2022-11-26 12:54:42 +02:00
e0e864d9ca Update README.md 2022-11-26 11:56:55 +02:00
68ecadbbc9 command.wasm : add voice assistant example for the Web (#171)
Same as the command-line tool "command", but runs in the browser

Also, added helper script "extra/deploy-wasm.sh" and fixed some timing
constants for the WASM examples.
2022-11-26 11:40:06 +02:00
c536ff4005 minor : add comment for using "generate_karaoke.sh" 2022-11-26 10:22:42 +02:00
cb70b07db5 livestream.sh : simple tool to transcribe audio livestreams (#185) 2022-11-26 10:05:37 +02:00
3c390ffe38 stream.wasm : add web-based real-time transcription (#112) 2022-11-25 23:57:46 +02:00
be16dfa038 whisper.wasm : do not block page while processing (close #86) 2022-11-25 23:07:42 +02:00
0f619b52ce main : add stereo-channel-based diarization (#64)
Not tested - I don't have stereo dialog audio
2022-11-25 22:08:58 +02:00
1246dd023e command : add demonstration video 2022-11-25 20:23:58 +02:00
0be27bbd92 command : fix build + fix README + add bold printing 2022-11-25 19:53:50 +02:00
bc88eb13c6 examples : add "command" tool (#171) 2022-11-25 19:36:57 +02:00
b8ce25dec1 refactoring : more readable code 2022-11-25 19:28:04 +02:00
fd113687aa correct model name display on running samples 2022-11-25 07:17:02 +02:00
e4805d9601 wasm : refactor wasm example + reuse fetch mechanism 2022-11-24 23:13:26 +02:00
ff36415a86 talk.wasm : update video link + some minor fixes 2022-11-24 20:15:24 +02:00
025ff465b6 Update README.md
Use a less cringy video to demo talk.wasm lol
2022-11-24 20:09:45 +02:00
2c0501b38a Update README.md 2022-11-24 20:06:51 +02:00
abce28ea99 talk.wasm : move to https://whisper.ggerganov.com/talk
This way, we can share the same models across different WASM examples
and not have to download them for each page
2022-11-24 18:24:06 +02:00
a2ecd54455 models : add instructions for using HF fine-tuned models 2022-11-24 17:54:41 +02:00
128aaadb93 whisper : improve printfs 2022-11-24 17:54:16 +02:00
454b91de16 main : fix dangling pointer when using stdin for input (#65) 2022-11-24 17:53:51 +02:00
d7024cf9dc main, stream : remove --verbose flag (#178) 2022-11-24 17:52:04 +02:00
37422ed733 talk.wasm : add audio pre-processing + bump memory 2022-11-24 00:34:00 +02:00
be3b720f96 talk.wasm : refactoring + update README.md 2022-11-24 00:08:57 +02:00
00f46dbc1d models : add usage comments to the HF convert script (#157) 2022-11-23 23:22:40 +02:00
5698bddbc9 models : fix HF fine-tuned model conversion script (#157)
It works now
2022-11-23 23:14:11 +02:00
388e9f79ad ggml : fix the fix 2022-11-23 22:40:06 +02:00
35cd29ce1f ggml : fix cross-compile Linux -> Windows with mingw (#168) 2022-11-23 22:28:41 +02:00
a156a358ca Revert "update README.md"
This reverts commit 6a84147113.
2022-11-23 22:16:50 +02:00
6a84147113 update README.md 2022-11-23 22:16:33 +02:00
804f36aa2c ggml: change inline ggml_fp16_to_fp32, ggml_fp16_t ggml_fp32_to_fp16 2022-11-23 22:16:33 +02:00
4b2f51b479 add gprof option 2022-11-23 22:16:33 +02:00
800ae5b808 fix AVX,AVX2,FMA,F16C detection on Linux and add flags for OpenBLAS 2022-11-23 22:16:33 +02:00
83456076f0 add AVX support 2022-11-23 22:16:33 +02:00
3df6c14fca Build with OpenBLAS and SDL2 on windows 2022-11-23 22:09:54 +02:00
d64d6ca3fd models : minor changes to the HF convert script (#157) 2022-11-23 22:07:20 +02:00
93482d0373 models : add "convert-h5-to-ggml.py" script (#157)
Converts transformers models to ggml.
Although the conversion is successful, it does not work for some reason.
Not sure why
2022-11-23 17:19:22 +02:00