From 23277d21ce80cd19f41a4b275ba337962fb63fec Mon Sep 17 00:00:00 2001
From: Georgi Gerganov
Date: Wed, 13 Sep 2023 20:54:03 +0300
Subject: [PATCH] readme : add Metal info

---
 README.md | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 5f180604..3707b933 100644
--- a/README.md
+++ b/README.md
@@ -11,14 +11,14 @@ Beta: [v1.4.2](https://github.com/ggerganov/whisper.cpp/releases/tag/v1.4.2) / S
 High-performance inference of [OpenAI's Whisper](https://github.com/openai/whisper) automatic speech recognition (ASR) model:
 
 - Plain C/C++ implementation without dependencies
-- Apple silicon first-class citizen - optimized via ARM NEON, Accelerate framework and [Core ML](https://github.com/ggerganov/whisper.cpp#core-ml-support)
+- Apple Silicon first-class citizen - optimized via ARM NEON, Accelerate framework, Metal and [Core ML](https://github.com/ggerganov/whisper.cpp#core-ml-support)
 - AVX intrinsics support for x86 architectures
 - VSX intrinsics support for POWER architectures
 - Mixed F16 / F32 precision
 - [4-bit and 5-bit integer quantization support](https://github.com/ggerganov/whisper.cpp#quantization)
 - Low memory usage (Flash Attention)
 - Zero memory allocations at runtime
-- Runs on the CPU
+- Support for CPU-only inference
 - [Partial GPU support for NVIDIA via cuBLAS](https://github.com/ggerganov/whisper.cpp#nvidia-gpu-support-via-cublas)
 - [Partial OpenCL GPU support via CLBlast](https://github.com/ggerganov/whisper.cpp#opencl-gpu-support-via-clblast)
 - [BLAS CPU support via OpenBLAS](https://github.com/ggerganov/whisper.cpp#blas-cpu-support-via-openblas)
@@ -50,6 +50,10 @@ You can also easily make your own offline voice assistant application: [command]
 
 https://user-images.githubusercontent.com/1991296/204038393-2f846eae-c255-4099-a76d-5735c25c49da.mp4
 
+On Apple Silicon, the inference runs fully on the GPU via Metal:
+
+https://github.com/ggerganov/whisper.cpp/assets/1991296/c82e8f86-60dc-49f2-b048-d2fdbd6b5225
+
 Or you can even run it straight in the browser: [talk.wasm](examples/talk.wasm)
 
 ## Implementation details
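
For context on the line this patch adds: Metal offload in whisper.cpp is selected when the library is built, not in user code, so the same C API runs on the GPU on Apple Silicon and on the CPU elsewhere. Below is a minimal sketch of that C API as it stood around this commit, not part of the patch itself; the model path and the silent `pcm` buffer are placeholder assumptions standing in for real inputs.

```c
// Minimal whisper.cpp usage sketch (placeholder inputs noted in comments).
#include <stdio.h>
#include "whisper.h"

int main(void) {
    // placeholder model path - any ggml-format Whisper model file works here
    struct whisper_context * ctx = whisper_init_from_file("models/ggml-base.en.bin");
    if (ctx == NULL) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // placeholder input: 10 s of silence at 16 kHz mono, standing in for
    // real audio samples decoded elsewhere (e.g. from a WAV file)
    enum { n_samples = 16000 * 10 };
    static float pcm[n_samples]; // zero-initialized

    // greedy sampling with default parameters
    struct whisper_full_params wparams = whisper_full_default_params(WHISPER_SAMPLING_GREEDY);

    // run the full encoder/decoder pipeline; with a Metal-enabled build of
    // the library this work is offloaded to the GPU, with no code changes
    if (whisper_full(ctx, wparams, pcm, n_samples) != 0) {
        fprintf(stderr, "inference failed\n");
        whisper_free(ctx);
        return 1;
    }

    // print the transcribed text segment by segment
    const int n_segments = whisper_full_n_segments(ctx);
    for (int i = 0; i < n_segments; ++i) {
        printf("%s\n", whisper_full_get_segment_text(ctx, i));
    }

    whisper_free(ctx);
    return 0;
}
```

Whether Metal is used is a build-time option of the library in this era, so the program above is identical for CPU-only and GPU builds.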