# whisper.objc

Minimal Obj-C application for automatic offline speech recognition.
The inference runs locally, on-device.

https://user-images.githubusercontent.com/1991296/197385372-962a6dea-bca1-4d50-bf96-1d8c27b98c81.mp4

Real-time transcription demo:

https://user-images.githubusercontent.com/1991296/204126266-ce4177c6-6eca-4bd9-bca8-0e46d9da2364.mp4

## Usage

```bash
git clone https://github.com/ggerganov/whisper.cpp
open whisper.cpp/examples/whisper.objc/whisper.objc.xcodeproj/

# If you don't want to convert a Core ML model, you can skip that step by creating a dummy model
mkdir models/ggml-base.en-encoder.mlmodelc
```

Make sure to build the project in `Release`:

<img width="947" alt="image" src="https://user-images.githubusercontent.com/1991296/197382607-9e1e6d1b-79fa-496f-9d16-b71dc1535701.png">

Also, don't forget to add the `-DGGML_USE_ACCELERATE` compiler flag for `ggml.c` in Build Phases.
This can significantly improve the performance of the transcription:

<img width="1072" alt="image" src="https://user-images.githubusercontent.com/1991296/208511239-8d7cdbd1-aa48-41b5-becd-ca288d53cc07.png">

## Core ML

If you want to enable Core ML support, you can add the `-DWHISPER_USE_COREML -DWHISPER_COREML_ALLOW_FALLBACK` compiler flags for `whisper.cpp` in Build Phases:

<img width="1072" alt="image" src="https://github.com/ggerganov/whisper.cpp/assets/3001525/103e8f57-6eb6-490d-a60c-f6cf6c319324">

Then follow the [`Core ML support` section of the README](../../README.md#core-ml-support) to convert the model.

This project also adds `-O3 -DNDEBUG` to `Other C Flags`. Adding flags at the app project level is not ideal in real-world use because they apply to all C/C++ files; consider splitting the xcodeproj into a workspace in your own project.

## Metal

You can also enable Metal to make the inference run on the GPU of your device. This might or might not be more efficient
compared to Core ML depending on the model and device that you use.

To enable Metal, just add `-DGGML_USE_METAL` instead of the `-DWHISPER_USE_COREML` flag and you are ready.
This will make both the Encoder and the Decoder run on the GPU.

If you want to run the Encoder with Core ML and the Decoder with Metal then simply add both `-DWHISPER_USE_COREML -DGGML_USE_METAL` flags. That's all!
