mirror of https://github.com/ggerganov/whisper.cpp.git synced 2024-12-18 20:27:53 +00:00

Georgi Gerganov e30c679928 whisper : reorganize source code + improve CMake (#2256 ) * scripts : update sync [no ci] * files : reorganize [no ci] * sync : llama.cpp * cmake : link math library * cmake : build normal ggml library * files : move headers to include * objc : fix path to ggml-metal.h * ci : fix WHISPER_CUDA -> GGML_CUDA * scripts : sync LICENSE [no ci]		2024-06-26 19:34:09 +03:00
.gitignore	talk, talk-llama : pass text_to_speak as a file (#1865 )	2024-02-24 09:24:47 +02:00
CMakeLists.txt	whisper : add integer quantization support (#540 )	2023-04-30 18:51:57 +03:00
eleven-labs.py	talk, talk-llama : pass text_to_speak as a file (#1865 )	2024-02-24 09:24:47 +02:00
gpt-2.cpp	whisper : reorganize source code + improve CMake (#2256 )	2024-06-26 19:34:09 +03:00
gpt-2.h	whisper : add integer quantization support (#540 )	2023-04-30 18:51:57 +03:00
README.md	readme : add Fedora dependencies (#1970 )	2024-03-20 18:42:11 +02:00
speak	talk, talk-llama : pass text_to_speak as a file (#1865 )	2024-02-24 09:24:47 +02:00
speak.bat	`speak` scripts for Windows	2023-06-01 22:45:00 +10:00
speak.ps1	talk, talk-llama : pass text_to_speak as a file (#1865 )	2024-02-24 09:24:47 +02:00
talk.cpp	whisper : reorganize source code + improve CMake (#2256 )	2024-06-26 19:34:09 +03:00

README.md

talk

Talk with an Artificial Intelligence in your terminal

Demo Talk

Web version: examples/talk.wasm

Building

The talk tool depends on the SDL2 library to capture audio from the microphone. You can build it like this:

# Install SDL2
# On Debian-based Linux distributions:
sudo apt-get install libsdl2-dev

# On Fedora Linux:
sudo dnf install SDL2 SDL2-devel

# On macOS:
brew install sdl2

# Build the "talk" executable
make talk

# Run it
./talk -p Santa

GPT-2

To run this, you will need a ggml GPT-2 model: instructions

Alternatively, you can simply download the smallest ggml GPT-2 117M model (240 MB) like this:

wget --quiet --show-progress -O models/ggml-gpt-2-117M.bin https://huggingface.co/ggerganov/ggml/resolve/main/ggml-model-gpt-2-117M.bin
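
If you rebuild or rerun often, it can help to skip the download when the model is already on disk. The following is a minimal sketch, not part of the repository: the helper names `needs_download` and `fetch_model` and the `--fetch` argument are hypothetical, and only the path and URL come from the command above.

```shell
#!/bin/sh
# Sketch only: fetch the GPT-2 model when it is not already present.
# The path and URL are copied from the wget command in this README;
# the function names and the --fetch argument are illustrative assumptions.
MODEL_PATH="models/ggml-gpt-2-117M.bin"
MODEL_URL="https://huggingface.co/ggerganov/ggml/resolve/main/ggml-model-gpt-2-117M.bin"

# Succeeds (status 0) when the file is missing or empty, i.e. a download is needed.
# An empty file is treated as missing because an interrupted `wget -O` can leave one behind.
needs_download() {
  [ ! -s "$1" ]
}

# Download the model into MODEL_PATH unless a non-empty file already exists there.
fetch_model() {
  if needs_download "$MODEL_PATH"; then
    mkdir -p "$(dirname "$MODEL_PATH")"
    wget --quiet --show-progress -O "$MODEL_PATH" "$MODEL_URL"
  else
    echo "model already present: $MODEL_PATH"
  fi
}

# Nothing is downloaded unless the script is invoked with --fetch.
if [ "${1:-}" = "--fetch" ]; then
  fetch_model
fi
```

Saved under a name of your choosing (for example `fetch-gpt2.sh`, also hypothetical), it would be run as `sh fetch-gpt2.sh --fetch` from the repository root.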

TTS

For the best experience, this example needs a TTS tool to convert the generated text responses to voice. By default, the speak script is configured to use macOS's say, espeak, or the Windows SpeechSynthesizer, but you can use any TTS engine you like: simply edit the speak script to suit your needs.
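
As a rough illustration of what a customized speak script could look like, here is a minimal sketch. It is not the script shipped in this directory. It assumes the script is called as `speak <voice_id> <text_file>` (the commit history above only confirms that the text is passed as a file; the argument order is an assumption), and the helper names `first_available` and `speak_file` are hypothetical. It relies on the `-f` option of `say` and `espeak` to read the input from a file.

```shell
#!/bin/sh
# Sketch of a custom `speak` script (illustrative, not the repository's version).
# Assumed usage: speak <voice_id> <text_file>

# Print the first candidate command found on PATH; fail if none is installed.
first_available() {
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
      return 0
    fi
  done
  return 1
}

# Speak the contents of a text file with whichever engine is present.
speak_file() {
  text_file="$1"
  engine="$(first_available say espeak)" || {
    echo "no TTS engine found (tried: say, espeak)" >&2
    return 1
  }
  case "$engine" in
    say)    say -f "$text_file" ;;     # macOS built-in
    espeak) espeak -f "$text_file" ;;  # common on Linux
  esac
}

# Only speak when a text file was actually passed as the second argument.
# The voice id in $1 is ignored in this sketch.
if [ -n "${2:-}" ]; then
  speak_file "$2"
fi
```

To try another engine, add its command name to the `first_available` call and a matching branch to the `case` statement.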