131 Commits

Author SHA1 Message Date
Dmitry Atamanov
7d3da68f79
examples : use miniaudio for direct decoding flac, mp3, ogg and wav (#2759) 2025-02-27 09:06:54 +02:00
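For reference, a minimal sketch of what the miniaudio-based decoding in the examples boils down to: decode a flac/mp3/ogg/wav file into the 16 kHz mono f32 samples whisper expects. This is not the exact example code; the header path and input filename are placeholders.

```
// Minimal sketch: decode an audio file to 16 kHz mono f32 using miniaudio.
#define MINIAUDIO_IMPLEMENTATION
#include "miniaudio.h"

#include <cstdio>
#include <vector>

int main() {
    // ask miniaudio to convert whatever the file contains to mono f32 @ 16 kHz
    ma_decoder_config cfg = ma_decoder_config_init(ma_format_f32, 1, 16000);

    ma_decoder dec;
    if (ma_decoder_init_file("sample.ogg", &cfg, &dec) != MA_SUCCESS) { // placeholder file
        fprintf(stderr, "failed to open audio file\n");
        return 1;
    }

    std::vector<float> pcmf32;
    float buf[4096];
    for (;;) {
        ma_uint64 read = 0;
        ma_result res = ma_decoder_read_pcm_frames(&dec, buf, 4096, &read);
        pcmf32.insert(pcmf32.end(), buf, buf + read);
        if (res != MA_SUCCESS || read == 0) {
            break; // MA_AT_END (or an error) terminates the loop
        }
    }

    ma_decoder_uninit(&dec);
    printf("decoded %zu samples\n", pcmf32.size());
    return 0;
}
```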
Georgi Gerganov
227b5ffa36
make : fix "main" -> "whisper-cli"
2024-12-31 11:46:17 +02:00
Georgi Gerganov
ed733e85a1
scripts : update to new build system 2024-12-09 11:30:16 +02:00
Georgi Gerganov
384e214cc7 make : shim cmake 2024-12-08 20:14:35 +02:00
Georgi Gerganov
7fd8d9c220 whisper : adapt to new ggml (wip) 2024-11-20 21:00:08 +02:00
Georgi Gerganov
5089ab2d6a whisper : fix build (#0) 2024-11-15 15:21:04 +02:00
Georgi Gerganov
1d5752fa42
make : fix GGML_VULKAN=1 build (#2485) 2024-10-16 18:42:47 +03:00
Georgi Gerganov
396089f3cf whisper : revert mel-related changes (#0)
too much extra logic and complexity for small benefit
2024-10-05 15:23:51 +03:00
Georgi Gerganov
941912467d whisper : adapt to latest ggml (skip) (#0) 2024-10-05 15:23:51 +03:00
Georgi Gerganov
2ef717b293
whisper : add large-v3-turbo (#2440) 2024-10-01 15:57:06 +03:00
Georgi Gerganov
8feb375fbd
tests : remove test-backend-ops (#2434) 2024-09-27 11:49:01 +03:00
Georgi Gerganov
451e9ee92c make : remove "talk" target until updated 2024-09-24 19:45:08 +03:00
Georgi Gerganov
fe18c29ab8 talk-llama : sync llama.cpp 2024-09-24 19:45:08 +03:00
hsinhoyeh
c4e1861d2c
go : add beamsize/entropythold/maxcontext to context interface (#2350)
* feat(go binding): add beamsize/entropythold/maxcontext to context interface

fixes: #2349

* fix go binding build

* fix dynamic link .so and header.h

* remove LD_LIBRARY_PATH

* remove ggml obj from whisper dynamic lib

* drop LIB_GGML
2024-08-28 17:09:01 +03:00
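The new Go setters (BeamSize / EntropyThold / MaxContext) map onto fields that already exist in the C API's whisper_full_params; a hedged C++ sketch, with arbitrary example values:

```
// Sketch of the underlying whisper_full_params fields the Go setters wrap.
#include "whisper.h"

struct whisper_full_params tuned_decoding_params() {
    struct whisper_full_params wparams = whisper_full_default_params(WHISPER_SAMPLING_BEAM_SEARCH);
    wparams.beam_search.beam_size = 5;    // "beamsize"
    wparams.entropy_thold         = 2.4f; // "entropythold" - entropy threshold for decoder fallback
    wparams.n_max_text_ctx        = 64;   // "maxcontext" - max text tokens kept as past context
    return wparams;
}
```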
Georgi Gerganov
22058f2dbc talk-llama : sync llama.cpp 2024-08-08 22:48:46 +03:00
Georgi Gerganov
5226c3d45c make : remove llama prints [no ci] (#2265) 2024-07-08 14:53:55 +03:00
Georgi Gerganov
bb3dd45524
make : disable CUDA graphs 2024-06-26 23:20:13 +03:00
Georgi Gerganov
9f7f36d4c9
make : disable CUDA mel build 2024-06-26 22:25:25 +03:00
Georgi Gerganov
0a55a70b9b
make : fix missing -O3
same as https://github.com/ggerganov/llama.cpp/pull/8143
2024-06-26 21:21:12 +03:00
Georgi Gerganov
e30c679928
whisper : reorganize source code + improve CMake (#2256)
* scripts : update sync [no ci]

* files : reorganize [no ci]

* sync : llama.cpp

* cmake : link math library

* cmake : build normal ggml library

* files : move headers to include

* objc : fix path to ggml-metal.h

* ci : fix WHISPER_CUDA -> GGML_CUDA

* scripts : sync LICENSE [no ci]
2024-06-26 19:34:09 +03:00
Georgi Gerganov
5181494e9f build : update make / cmake 2024-06-18 09:39:40 +03:00
Georgi Gerganov
3b1ac03828 ggml : remove OpenCL (#0) 2024-06-16 18:19:48 +03:00
Georgi Gerganov
6975600b4b cuda : enable CUDA graphs (#0) 2024-06-16 18:19:48 +03:00
Georgi Gerganov
4942b1b428 cmake : fix CUDA build (#0) 2024-06-16 18:19:48 +03:00
Georgi Gerganov
420b6abc54
cuda : fix HIPBLAS build (#2234) 2024-06-11 19:14:38 +03:00
Borislav Stanimirov
ffef323c4c
whisper : add CUDA-specific computation of mel spectrograms (#2206)
* whisper : use polymorphic class to calculate mel spectrogram

* whisper : add cuda-specific mel spectrogram calculation

* whisper : conditionally compile cufftGetErrorString to avoid warnings

* build : add new files to makefile

* ruby : add new files to conf script

* build : fix typo in makefile

* whisper : suppress cub warning for deprecated C++ std in whisper-mel-cuda
2024-06-04 09:32:23 +03:00
Przemysław Pawełczyk
b6680fab50
build : improve disabling AVX-512 (#2129)
* cmake : make WHISPER_NO_AVX512=ON disable all subsets of AVX-512

Previously it happened only for MSVC, but it makes sense to have the
same behavior for other compilers too.

* make : reorder x86 ISA extensions in chronological order

And update compiler flags at the end to ease modifying conditions.

* make : support WHISPER_NO_AVX512=1 for disabling all AVX-512 subsets.

That way you do not have to override each AVX-512 subset setting
individually if it has been turned on during autodetection.
2024-05-08 18:32:43 +03:00
Przemysław Pawełczyk
8fac6455ff
make : change GNU make default CXX from g++ to c++ (#2100) 2024-04-28 22:54:21 +01:00
Didzis Gosko
08d3eef97d
build : fix embedded Metal library generation (#2045) 2024-04-15 20:23:05 +03:00
Didzis Gosko
c7f95b7ca2
build : detect AVX512 in Makefile, add AVX512 option in CMake (#2043)
* make : add AVX512 detection to Makefile and CMakeLists.txt

* make : autodetect more AVX512 instruction subsets

* cmake : do not default to AVX512, must be enabled explicitly

* cmake : enable a set of AVX512 subsets, when AVX512 is turned on

* make : consolidate AVX512 subsets, add AVX512 VBMI

* cmake : revert to NO AVX512 setting, add settings for AVX512 VNNI and VBMI

* make : re-introduce AVX512VNNI back

* cmake : remove superfluous comment line
2024-04-15 20:02:09 +03:00
Przemysław Pawełczyk
1e8f28c42a
build : use pkg-config for OpenBLAS (#1778)
* make : use pkg-config for finding CFLAGS & LDFLAGS needed by OpenBLAS

That way building on *nix like environments (including MSYS2 on Windows)
with WHISPER_OPENBLAS=1 works out of the box.

Fix handling of WHISPER_OPENBLAS, so that empty value or 0 won't be
misinterpreted by make as enabled.  Mind that it's not intended to
detect CMake false constants (OFF NO FALSE N).  make is not CMake.

By default OpenBLAS with 64-bit interface is used, but that can be
changed with `WHISPER_OPENBLAS_INTERFACE64=0` if 32-bit one is desired.

If OpenBLAS headers and library are respectively in include/ and lib/
subdirectories of given path, then you can specify it, e.g.
`OPENBLAS_PATH=/usr/local/openblas`, and this will take precedence over
any pkg-config file.

If there is no pkg-config file (.pc) for OpenBLAS and OPENBLAS_PATH is
empty, then headers are assumed to be in /usr/include/openblas and
library is assumed to be called 'openblas64' (or 'openblas' if
`WHISPER_OPENBLAS_INTERFACE64=0`).  If different headers location should
be used, then it can be done, e.g.
`WHISPER_BLAS_CFLAGS=-I/usr/local/include/openblas`.
If different library should be used, it can be specified, e.g.
`WHISPER_BLAS_LIB=openblasp64` (pthreads version as seen on Fedora), or
you can provide LDFLAGS needed to link with OpenBLAS directly:
`WHISPER_BLAS_LDFLAGS="-L/usr/local/lib/openblas -lopenblas64"`.

Current solution is flexible enough to handle most cases out there
without needlessly hardcoding possible OpenBLAS installation details.

* cmake : fix how pkg-config is used for finding include dirs and libraries needed by OpenBLAS

That way building on *nix like environments (including MSYS2 on Windows)
with -DWHISPER_OPENBLAS=ON should work out of the box as long as you
have CMake 3.25 or newer.

Make OPENBLAS_PATH environment variable supported not only on Windows.
It sets OpenBLAS include dir to ${OPENBLAS_PATH}/include and library to
${WHISPER_BLAS_LIB} (name without prefixes and suffixes) in
${OPENBLAS_PATH}/lib and avoids further package finding.

By default OpenBLAS with 64-bit interface is used (equivalent to setting
`-DWHISPER_BLAS_LIB=openblas64`), but that can be changed with
`-DWHISPER_OPENBLAS_INTERFACE64=OFF` (equivalent to setting
`-DWHISPER_BLAS_LIB=openblas`) if 32-bit one is desired.

Turn on BLA_STATIC for FindBLAS only when WHISPER_STATIC is enabled.
BLA_STATIC may not work as expected for pkg-config based operation.

Get rid of supporting BLAS_HOME environment variable.  If OPENBLAS_PATH
is insufficient in your case, there is no pkg-config file to rely on,
then you can manually specify include dir, e.g.
`-DBLAS_INCLUDE_DIRS=/usr/local/include/openblas`, and library, e.g.
`-DBLAS_LIBRARIES=/usr/local/lib/libopenblas.so`.

* make / cmake : use OpenBLAS with 32-bit interface by default.

OpenBLAS w/o INTERFACE64=1 or USE_64BITINT=1 seems to be more common.

* cmake : hardcode "lib" prefix for OpenBLAS lib filename (even on Windows)

* cmake : hardcode OpenBLAS library name when building in MSVC (Windows)

Most *nix like environments (including MSYS2 on Windows) have OpenBLAS
packages that allow coexistence of OpenBLAS builds with 32-bit and
64-bit interface (w/o and w/ OPENBLAS_USE64BITINT defined) and they
differ by not having or having "64" suffix in their library filenames.
That's not the case for OpenBLAS prebuilt libraries for Windows.
2024-03-29 15:53:26 +02:00
Georgi Gerganov
9fb308d90f
make : add grammar parser to common objects 2024-03-28 11:59:48 +02:00
Georgi Gerganov
2948c740a2
sync : ggml (#2001)
* sync : update scripts

* sync : ggml

* talk-llama : sync llama.cpp

* make : WHISPER_CUBLAS -> WHISPER_CUDA

* ci : try to fix sycl build

* talk-llama : fix make build
2024-03-27 18:55:10 +02:00
Georgi Gerganov
de4d067f1e
talk-llama : sync llama.cpp 2024-03-15 14:21:59 +02:00
LBlue
276615d708
make : fix CUBLAS link with WSL (#1878) 2024-02-20 12:05:38 +02:00
Georgi Gerganov
65faae0b6a
build : update CBLAS flags + fix unused var warning (#0) 2024-02-19 14:44:46 +02:00
Didzis Gosko
163e74b6c3
metal : option to embed MSL source into compiled binary (#1842)
* ggml : embed Metal library source (ggml-metal.metal) into binary

enable by setting WHISPER_EMBED_METAL_LIBRARY

* rename the build option

* rename the preprocessor directive

* generate Metal library embedding assembly on-fly during build process
2024-02-11 16:41:41 +02:00
Didzis Gosko
0f80e5a80a
whisper : expose CUDA device setting in public API (#1840)
* Makefile : allow to override CUDA_ARCH_FLAG

* whisper : allow to select GPU (CUDA) device from public API
2024-02-09 17:27:47 +02:00
Didzis Gosko
b6559333ff
make : add macOS deployment target option (#1839) 2024-02-09 17:26:29 +02:00
jwijffels
3e6fad07aa
make : update MSYS_NT (#1813)
I just upgraded the R wrapper at https://github.com/bnosac/audio.whisper to use whisper.cpp 1.5.4
I'm working on Windows and noticed while doing so that it did not pick up the relevant CFLAGS/CXXFLAGS, as my system showed

```
I whisper.cpp build info: 
I UNAME_S:  MSYS_NT-10.0-19045
I UNAME_P:  unknown
I UNAME_M:  x86_64
```

Many thanks for all the tremendous hard work on maintaining whisper.cpp!
2024-01-30 14:13:49 +02:00
Przemysław Pawełczyk
f5f159c320
server : fix building and simplify lib deps on Windows (#1772)
* make : fix server example building on MSYS2 environments (Windows)

It had not been working since commit eff3570f78742dfd56024328ed93d4f442434280, when the server example was introduced.

* cmake : simplify server example lib deps on Windows

server uses httplib::Server, not httplib::SSLServer, so there is no need
to mention cryptographic libraries in target_link_libraries.
Winsock (ws2_32) suffices here.

Also use plain library names like we use in other places.
2024-01-15 15:48:13 +02:00
Georgi Gerganov
e77b27c331
sync : ggml (VMM, sync-ggml-am, dotprod ARM fixes, CUDA fixes) (#1691)
* scripts : add sync-ggml-am.sh

* sync : ggml (VMM, ARM dot prod fix, etc.)

* build : fix CUDA build

* ggml : fix some mul mat cases + add tests for src1 F16

dbd02958fa
2023-12-29 11:30:47 +02:00
Felix
eff3570f78
server : add a REST Whisper server example with OAI-like API (#1380)
* Add first draft of server

* Added json support and base funcs for server.cpp

* Add more user input via api-request

also some clean up

* Add request params and load post function

Also some general clean up

* Remove unused function

* Add readme

* Add exception handlers

* Update examples/server/server.cpp

* make : add server target

* Add magic curl syntax

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-20 21:40:24 +02:00
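A hedged sketch of calling the server added here from C++, using the same cpp-httplib library the server itself is built on. The /inference route and the "file" form field follow the example's documentation; host, port, and the input file are assumptions.

```
// Send an audio file to the whisper.cpp server example and print the response.
#include "httplib.h"

#include <fstream>
#include <iostream>
#include <sstream>

int main() {
    std::ifstream f("samples/jfk.wav", std::ios::binary); // placeholder input
    std::stringstream ss;
    ss << f.rdbuf();

    httplib::Client cli("127.0.0.1", 8080);
    httplib::MultipartFormDataItems items = {
        { "file", ss.str(), "jfk.wav", "audio/wav" },
    };

    auto res = cli.Post("/inference", items);
    if (res && res->status == 200) {
        std::cout << res->body << std::endl; // transcription result
    } else {
        std::cerr << "request failed" << std::endl;
        return 1;
    }
    return 0;
}
```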
Georgi Gerganov
bfbaa4dce5
whisper : make large version explicit + fix data size units (#1493) 2023-11-15 19:42:25 +02:00
Evan Jones
3e5c7feeff
whisper : add grammar-based sampling (#1229)
* whisper : add grammar-based sampling

* build : fix after master merge

* command : fix exception when recognizing the command

* whisper : fine-tuning grammar functionality

* command : grammar-related improvements

- option to read grammar from file
- add sample grammars for colors and chess moves
- fine-tune the performance further

* grammars : add assistant + update comments

* command : enable beam-search, add "no_timestamps", add "context", add p

* whisper : remove comment

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-13 10:51:34 +02:00
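Grammar-based sampling is exposed through a handful of fields on whisper_full_params; a hedged sketch of wiring them up. In the examples the rule array comes from the grammar parser in the common code; here it is passed in already built, and the penalty value is an assumption.

```
// Attach already-parsed grammar rules to the decoding parameters.
#include "whisper.h"

#include <vector>

void apply_grammar(struct whisper_full_params & wparams,
                   std::vector<const whisper_grammar_element *> & rules,
                   size_t root_rule_index) {
    wparams.grammar_rules   = rules.data();    // one pointer per parsed grammar rule
    wparams.n_grammar_rules = rules.size();
    wparams.i_start_rule    = root_rule_index; // rule that decoding must start from
    wparams.grammar_penalty = 100.0f;          // how strongly off-grammar tokens are penalized
}
```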
Georgi Gerganov
b0502836b8
whisper : add full CUDA and Metal offloading (#1472)
* whisper : migrate to ggml-backend

* whisper : fix logit reading

* whisper : fix tensor allocation during load

* whisper : fix beam-search with CUDA

* whisper : free backends + fix compile warning

* whisper : print when CUDA is enabled

* whisper : fix CoreML

* make : clean-up

* talk : fix compile warning

* whisper : support ggml_conv with CUDA and Metal (#1473)

* ggml : add CUDA support for ggml_conv

* whisper : remove ggml_repeat for conv bias + single backend

* cuda : fix im2col kernel

* metal : add im2col support + mul mat-vec f16 x f16

* bench-all : add q4 models

* whisper : clean-up

* quantize-all : fix

* ggml : im2col opts

* whisper : avoid whisper_model_data wrapper

* whisper : add note that ggml_mul_mat_pad does not work with CUDA

* whisper : factor out graph compute in common function

* whisper : fixes

* whisper : fix UB with measure buffers

* whisper : try to fix the parallel whisper_state functionality (#1479)

* whisper : try to fix the parallel whisper_state functionality

* whisper : fix multi-state Metal

* whisper : free backend instances in whisper_state
2023-11-12 15:31:08 +02:00
Georgi Gerganov
2cdfc4e025
whisper : add support for large v3 (#1444)
* whisper : add support for large v3

* bench : fix build + fix go bindings

* bench : fix n_mels

* models : update readme
2023-11-07 15:30:18 +02:00
Georgi Gerganov
f96e1c5b78
sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) (#1422)
* sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.)

* metal : allow env metal variable to override resource path (#1415)

* Allow env variable to override resource path

* Update ggml-metal.m

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* sync : restore common / main from `master`

* sync : restore whisper from `master`

* talk-llama : update to latest llama.cpp

* ruby : fix build

* ggml : fix 32-bit ARM build

* ggml : fix MIN / MAX macro collisions + update ios bindings

* ggml : fix ifdefs and MIN / MAX again

* examples : fix Obj-C and Swift examples

* ggml : fix 32-bit ARM compatibility

* ggml : one more attempt to fix 32-bit ARM compat

* whisper : fix support for larger graphs

---------

Co-authored-by: Chris Raethke <codesoda@users.noreply.github.com>
2023-11-03 21:35:05 +02:00
Georgi Gerganov
b8432f28f4
metal : add F32 support + update bench output 2023-09-15 13:56:08 +03:00
Georgi Gerganov
93935980f8
whisper : Metal and ggml-alloc support (#1270)
* metal : init

* whisper : factor out graph builds

* whisper : allocate encoder and decoder using ggml-alloc

* whisper : ggml-alloc is now supported

* whisper : CoreML support ggml-alloc

* build : fix ggml-alloc

* ios : update submodule

* extra : update sync-ggml.sh script to also sync ggml-alloc

* ci : see if this is causing the crash

* whisper : refactor ggml-alloc init

* whisper.android : try to fix build

* whisper : initial Metal version

* ci : try to debug vmem issue

* metal : decoder works on GPU!

* metal : add multi-decoder support

* ggml : fix ggml_nbytes (probably temp solution)

* metal : run "cross" step on the GPU

* whisper : remove ggml_repeat in the encoder

* whisper : offload the Encoder to Metal

* ggml : use simpler ggml_bytes() implementation

* ggml-alloc : try to make CI happy by reducing vram to 128GB

* whisper : add whisper_allocr to wrap ggml_allocr

* whisper : factor out alloc init in a function

* cmake : update to support Metal build

* whisper : add <functional> header

* objc : fix build (no Metal yet)

* ios : add Metal support

* swiftui : fix build

* metal : speed-up KQ multiplication

* metal : sync latest llama.cpp kernels

* readme : add Metal info

* ios : update submodule

* coreml : add code to toggle Core ML config (CPU, ANE, GPU)

* bench : fix timings by running a pre-heat

* bench : start benching the decoder

* whisper : add ggml_mul_mat_pad

* bench : fix uninitialized vars

* whisper : add comment for disabling mul-mat padding

* whisper : add description of ggml_mul_mat_pad

* whisper : clean-up ggml_mul_mat_pad

* metal : remove the "concurrent" flag

* bench : variable n_past

* ios : update SPM package
2023-09-15 12:18:18 +03:00