Georgi Gerganov
603f97ba11
whisper : minor improvement in decoding strategy ( #244 )
...
Do not allow text segments to extend beyond the end of the audio.
This partially mitigates some issues when the last audio window starts
1-2 seconds before the end of the audio file and the decoding spirals
into a repetition of the last transcribed phrase.
2022-12-10 13:38:26 +02:00
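To illustrate the idea in the commit above, here is a minimal sketch of clamping segment timestamps to the audio length. It is not the code from the commit: the `Segment` struct, the `clamp_segments` function, and the choice of time unit are all hypothetical and only show the general technique.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical segment type: start/end times in arbitrary integer units
// (for example, hundredths of a second).
struct Segment {
    int64_t t0;
    int64_t t1;
};

// Returns a copy of `in` where no segment extends past `audio_len`.
// Segments that start at or after the end of the audio are dropped.
std::vector<Segment> clamp_segments(const std::vector<Segment>& in, int64_t audio_len) {
    assert(audio_len >= 0);
    std::vector<Segment> out;
    for (const Segment& s : in) {
        if (s.t0 >= audio_len) {
            continue; // entirely beyond the end of the audio
        }
        out.push_back({s.t0, std::min(s.t1, audio_len)});
    }
    return out;
}
```

Dropping segments that begin past the end is an assumption of this sketch; an implementation could equally stop decoding at that point.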
Georgi Gerganov
50a061b313
ggml : add alternative cblas_sgemm call
2022-12-08 23:48:04 +02:00
Georgi Gerganov
832b4f34c9
make : indentation + .gitignore
2022-12-08 19:42:06 +02:00
Reinis Muiznieks
0f98755fc5
Flag for Position Independent Code
2022-12-08 19:41:01 +02:00
Georgi Gerganov
56822621a8
twitch.sh : various fixes and polishing
...
- check if streamlink is installed
- fix audio chunking
- change default threads to 4
2022-12-08 19:20:04 +02:00
keyehzy
9e5f3ddc16
Allow for Twitch.tv live transcription
...
We rely on the streamlink library to provide the stream, then proceed
as in the radio livestream example.
2022-12-08 19:20:04 +02:00
Kartik Saranathan
d91c001120
Fix paths echoed after the download
...
The script was using the models path instead of the root path
2022-12-08 09:23:52 +02:00
Al Hoang
04a16bbf11
fix compilation on haiku
2022-12-08 09:20:57 +02:00
Georgi Gerganov
47afb93c3c
yt-wsp.sh : improve usage instructions
2022-12-07 22:12:08 +02:00
Georgi Gerganov
575c53dc41
yt-wsp.sh : fix usage instruction + comment
2022-12-07 21:12:55 +02:00
Georgi Gerganov
3996ecc156
Update README.md
2022-12-07 05:15:46 +02:00
Georgi Gerganov
faa85f9840
livestream.sh : remove obsolete comment
2022-12-07 04:41:43 +02:00
Georgi Gerganov
b6597539f9
ggml : fix typo in previous commit
2022-12-06 22:12:57 +02:00
Georgi Gerganov
9a4b7a916e
ggml : use macros to inline FP16 <-> FP32 conversions
2022-12-06 22:09:26 +02:00
Georgi Gerganov
f8ec718b76
ggml : add F16C CPU flag check
2022-12-06 21:56:56 +02:00
katsu560
35b40a93b9
add fp16/fp32 convert intrinsics
2022-12-06 21:44:24 +02:00
Georgi Gerganov
9fe7306f4b
models : add the new "large" model release by OpenAI
...
The old "large" model is now renamed "large-v1".
If you have been using it, make sure to rename it and download the new
"large" model for best results.
2022-12-06 18:48:57 +02:00
Georgi Gerganov
13e8eb2346
bench : add commit hash to bench-all.sh results
2022-12-06 18:47:48 +02:00
Georgi Gerganov
78d13257be
Try to improve the token sampling strategy ( #193 )
...
* whisper : try to improve the token sampling strategy
- Add the "max_initial_timestamp" token logic from OpenAI
- Disallow sampling timestamps that are in the past
* whisper : fix the max initial timestamp logic + fallback decoding
2022-12-02 21:51:50 +02:00
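The two sampling rules named in the commit above can be sketched as a logit mask. This is an assumption-laden illustration, not the code from the commit: the function name `mask_timestamps`, its parameters, and the convention that every token id at or above `ts_begin` is a timestamp token are hypothetical.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical convention: token ids >= ts_begin are timestamp tokens, in
// increasing time order. `last_ts` is the id of the most recently sampled
// timestamp token (or ts_begin if none has been sampled yet).
void mask_timestamps(std::vector<float>& logits, int ts_begin, int last_ts,
                     int max_initial_ts, bool is_first_token) {
    assert(ts_begin >= 0 && max_initial_ts >= 0);
    const float neg_inf = -std::numeric_limits<float>::infinity();
    const int n = static_cast<int>(logits.size());

    // Rule 1: timestamps earlier than the last sampled one cannot be chosen.
    for (int i = ts_begin; i < last_ts && i < n; ++i) {
        logits[i] = neg_inf;
    }

    // Rule 2: the first timestamp may be at most max_initial_ts steps in.
    if (is_first_token) {
        for (int i = ts_begin + max_initial_ts + 1; i < n; ++i) {
            logits[i] = neg_inf;
        }
    }
}
```

Setting a logit to negative infinity gives the token zero probability after softmax, so the sampler never selects it.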
Georgi Gerganov
9b7df68753
tests : adding transcription tests
2022-12-02 21:40:02 +02:00
Georgi Gerganov
061fc81bd6
ggml : remove inline specifier from fp16 <-> fp32 converters
2022-12-01 22:15:12 +02:00
Georgi Gerganov
57e0e6b700
livestream : handle ffmpeg errors gracefully and stabilize transcript
2022-12-01 20:49:09 +02:00
Georgi Gerganov
4f7363077f
livestream : minor changes
2022-12-01 19:47:58 +02:00
semiformal-net
093c840dee
livestream : fix losing words across audio chunks ( #195 )
...
* improve livestream script
* Update examples/livestream.sh
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Paul Edwards <paul.edwards@semiformal.net>
2022-12-01 19:18:22 +02:00
Tienshiao Ma
e7f09a0a61
Fix Darwin flags - was incorrectly always using the Linux else clause
2022-12-01 19:17:04 +02:00
Georgi Gerganov
4698dcdb52
whisper : add mechanism for aborting the whisper_full() computation
2022-11-27 20:42:45 +02:00
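One common way to make a long computation abortable is to poll a caller-supplied callback between units of work. The sketch below shows that general pattern only; `process_windows` and `should_continue` are hypothetical names, and nothing here is taken from the whisper.cpp API or from the commit above.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a long computation split into windows.
// Before each window the callback is polled; returning false aborts.
// Returns the number of windows that were actually processed.
int process_windows(int n_windows, const std::function<bool()>& should_continue) {
    assert(n_windows >= 0);
    int done = 0;
    for (int i = 0; i < n_windows; ++i) {
        if (should_continue && !should_continue()) {
            break; // caller requested an abort
        }
        ++done; // placeholder for the real per-window work
    }
    return done;
}
```

Passing an empty `std::function` disables the check, so callers that never need to abort pay no extra cost beyond the null test.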
Georgi Gerganov
6fd5358dd0
Update README.md
2022-11-27 11:30:32 +02:00
Georgi Gerganov
164df0d447
whisper.objc : fix context + broken readme links
2022-11-27 10:52:27 +02:00
Georgi Gerganov
e266cb0723
whisper.objc : add real-time processing ( #97 )
...
Similar to the "stream" app
2022-11-26 18:32:46 +02:00
Georgi Gerganov
c207eed431
whisper.objc : fix build warnings
2022-11-26 16:27:04 +02:00
Georgi Gerganov
67e819baf4
minor : remove "examples/" prefix from the README
2022-11-26 13:07:54 +02:00
Georgi Gerganov
a425365b82
yt-wsp.sh : script to easily transcribe VODs
...
Thanks to @DaniruKun
ref: https://gist.github.com/DaniruKun/96f763ec1a037cc92fe1a059b643b818
Usage:
cd whisper.cpp
make
./examples/yt-wsp.sh <video-url>
2022-11-26 12:54:42 +02:00
Georgi Gerganov
e0e864d9ca
Update README.md
2022-11-26 11:56:55 +02:00
Georgi Gerganov
68ecadbbc9
command.wasm : add voice assistant example for the Web ( #171 )
...
Same as the command-line tool "command", but runs in the browser
Also, added helper script "extra/deploy-wasm.sh" and fixed some timing
constants for the WASM examples.
2022-11-26 11:40:06 +02:00
Georgi Gerganov
c536ff4005
minor : add comment for using "generate_karaoke.sh"
2022-11-26 10:22:42 +02:00
Georgi Gerganov
cb70b07db5
livestream.sh : simple tool to transcribe audio livestreams ( #185 )
2022-11-26 10:05:37 +02:00
Georgi Gerganov
3c390ffe38
stream.wasm : add web-based real-time transcription ( #112 )
2022-11-25 23:57:46 +02:00
Georgi Gerganov
be16dfa038
whisper.wasm : do not block page while processing ( close #86 )
2022-11-25 23:07:42 +02:00
Georgi Gerganov
0f619b52ce
main : add stereo-channel-based diarization ( #64 )
...
Not tested - I don't have stereo dialog audio
2022-11-25 22:08:58 +02:00
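A simple way to attribute a segment to a speaker from stereo audio is to compare the signal energy of the two channels over that segment. The sketch below assumes each speaker is recorded mostly on one channel; the function `louder_channel` and its sample-index parameters are hypothetical and are not taken from the commit above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Compares summed absolute amplitude of both channels over [i0, i1).
// Returns 0 if the left channel is louder, 1 if the right channel is
// louder, and -1 if they are equal (speaker unknown).
int louder_channel(const std::vector<float>& left, const std::vector<float>& right,
                   std::size_t i0, std::size_t i1) {
    assert(i0 <= i1);
    double e_left = 0.0;
    double e_right = 0.0;
    for (std::size_t i = i0; i < i1 && i < left.size() && i < right.size(); ++i) {
        e_left += std::fabs(left[i]);
        e_right += std::fabs(right[i]);
    }
    if (e_left > e_right) return 0;
    if (e_right > e_left) return 1;
    return -1;
}
```

A segment's start and end times would be converted to sample indices with the sample rate before calling this.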
Georgi Gerganov
1246dd023e
command : add demonstration video
2022-11-25 20:23:58 +02:00
Georgi Gerganov
0be27bbd92
command : fix build + fix README + add bold printing
2022-11-25 19:53:50 +02:00
Georgi Gerganov
bc88eb13c6
examples : add "command" tool ( #171 )
2022-11-25 19:36:57 +02:00
Georgi Gerganov
b8ce25dec1
refactoring : more readable code
2022-11-25 19:28:04 +02:00
vicalloy
fd113687aa
correct model name display on running samples
2022-11-25 07:17:02 +02:00
Georgi Gerganov
e4805d9601
wasm : refactor wasm example + reuse fetch mechanism
2022-11-24 23:13:26 +02:00
Georgi Gerganov
ff36415a86
talk.wasm : update video link + some minor fixes
2022-11-24 20:15:24 +02:00
Georgi Gerganov
025ff465b6
Update README.md
...
Use a less cringy video to demo talk.wasm lol
2022-11-24 20:09:45 +02:00
Georgi Gerganov
2c0501b38a
Update README.md
2022-11-24 20:06:51 +02:00
Georgi Gerganov
abce28ea99
talk.wasm : move to https://whisper.ggerganov.com/talk
...
This way, we can share the same models across different WASM examples
and not have to download them for each page
2022-11-24 18:24:06 +02:00
Georgi Gerganov
a2ecd54455
models : add instructions for using HF fine-tuned models
2022-11-24 17:54:41 +02:00