mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-06-17 22:38:07 +00:00

Files

Georgi Gerganov fb8d77f760 stream : add "audio_ctx" parameter

Used to overwrite the audio context size of the Encoder.
For example, setting "audio_ctx = 512" will make it run about 3 times
faster, processing about 10s of audio, instead of 30s.

The transcription quality drops, but this can be used for real-time
streaming purposes where performance is important.
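
The "about 10s instead of 30s" figure can be checked with a little arithmetic. The sketch below assumes the full encoder context is 1500 positions covering a 30-second window (the standard Whisper configuration). The names `kFullAudioCtx`, `kFullWindowSec` and `audio_ctx_to_seconds` are hypothetical helpers for illustration and are not part of whisper.cpp.

```cpp
#include <cassert>

// Assumption: the full encoder context is 1500 positions for 30 s of audio.
constexpr int    kFullAudioCtx  = 1500;
constexpr double kFullWindowSec = 30.0;

// Approximate number of seconds of audio covered by a reduced audio context.
inline double audio_ctx_to_seconds(int audio_ctx) {
    assert(audio_ctx > 0 && audio_ctx <= kFullAudioCtx);
    return kFullWindowSec * audio_ctx / kFullAudioCtx;
}
```

With `audio_ctx = 512` this gives roughly 10.24 s, and 1500 / 512 is close to 3, which matches the rough speed-up quoted above. The real speed-up depends on the model and hardware, so treat the ratio as an estimate only.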

2022-11-20 21:22:41 +02:00

CMakeLists.txt

refactoring : move main + stream in examples + other stuff

2022-10-25 20:53:48 +03:00

README.md

Update README.md

2022-10-25 20:53:48 +03:00

stream.cpp

stream : add "audio_ctx" parameter

2022-11-20 21:22:41 +02:00

README.md

stream

This is a naive example of performing real-time inference on audio from your microphone. The stream tool samples the audio every half a second and runs the transcription continuously. More info is available in issue #10.

./stream -m ./models/ggml-base.en.bin -t 8 --step 500 --length 5000
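
To get a feel for the numbers in the command above, the following sketch converts the `--step` and `--length` values into sample counts. It assumes both values are in milliseconds and that audio is captured as 16 kHz mono, the rate Whisper models expect. The helper names `ms_to_samples` and `steps_per_window` are hypothetical and do not come from `stream.cpp`.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 16 kHz mono input.
constexpr int64_t kSampleRate = 16000;

// Convert a duration in milliseconds to a number of audio samples.
constexpr int64_t ms_to_samples(int64_t ms) {
    return ms * kSampleRate / 1000;
}

// Number of whole capture steps that fit into one audio window.
inline int64_t steps_per_window(int64_t step_ms, int64_t length_ms) {
    assert(step_ms > 0 && length_ms >= step_ms);
    return length_ms / step_ms;
}

// Compile-time checks for the values used in the example command.
static_assert(ms_to_samples(500) == 8000, "500 ms step is 8000 samples");
static_assert(ms_to_samples(5000) == 80000, "5000 ms window is 80000 samples");
```

Under these assumptions, `--step 500 --length 5000` means 8000 new samples per step and a window of 80000 samples, that is, ten steps per window.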

https://user-images.githubusercontent.com/1991296/194935793-76afede7-cfa8-48d8-a80f-28ba83be7d09.mp4

The stream tool depends on the SDL2 library to capture audio from the microphone. You can build it like this:

# Install SDL2 on Linux
sudo apt-get install libsdl2-dev

# Install SDL2 on macOS
brew install sdl2

make stream