whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-04-08 11:54:40 +00:00

Author	SHA1	Message	Date
Dmitry Atamanov	7d3da68f79	examples : use miniaudio for direct decoding flac, mp3, ogg and wav (#2759 )	2025-02-27 09:06:54 +02:00
Georgi Gerganov	227b5ffa36	make : fix "main" -> "whisper-cli" Some checks failed CI / ubuntu-latest-clang (linux/amd64, Debug) (push) Has been cancelled Details CI / ubuntu-latest-clang (linux/amd64, Release) (push) Has been cancelled Details CI / ubuntu-latest-clang (linux/arm64, Debug) (push) Has been cancelled Details CI / ubuntu-latest-clang (linux/arm64, Release) (push) Has been cancelled Details CI / ubuntu-latest-clang (linux/ppc64le, Debug) (push) Has been cancelled Details CI / ubuntu-latest-clang (linux/ppc64le, Release) (push) Has been cancelled Details CI / ubuntu-latest-gcc-sanitized (linux/amd64, ADDRESS) (push) Has been cancelled Details CI / ubuntu-latest-gcc-sanitized (linux/amd64, THREAD) (push) Has been cancelled Details CI / ubuntu-latest-gcc-sanitized (linux/amd64, UNDEFINED) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl (linux/amd64, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl (linux/arm/v7, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl (linux/arm64, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl (linux/ppc64le, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl-fp16 (linux/amd64, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm/v7, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm64, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl-fp16 (linux/ppc64le, icx, icpx, ON) (push) Has been cancelled Details CI / windows-msys2 (Release, clang-x86_64, CLANG64) (push) Has been cancelled Details CI / windows-msys2 (Release, ucrt-x86_64, UCRT64) (push) Has been cancelled Details CI / windows (Win32, Release, win32-x86, x86, 2.28.5, ON) (push) Has been cancelled Details CI / windows (x64, Release, win32-x86-64, x64, 2.28.5, ON) (push) Has been cancelled Details CI / windows-blas (Win32, ON, Release, x86, 2.28.5, ON) (push) Has been cancelled Details CI / windows-blas (x64, ON, Release, x64, 2.28.5, ON) (push) Has been cancelled Details CI / windows-cublas (x64, Release, ON, 11.8.0, ON, 2.28.5) (push) Has been cancelled Details CI / windows-cublas (x64, Release, ON, 12.2.0, ON, 2.28.5) (push) Has been cancelled Details CI / emscripten (Release) (push) Has been cancelled Details CI / ios-xcode-build (Release) (push) Has been cancelled Details CI / android (push) Has been cancelled Details CI / quantize (push) Has been cancelled Details Publish Docker image / Push Docker image to Docker Hub (map[dockerfile:.devops/main.Dockerfile platform:linux/amd64,linux/arm64 tag:main]) (push) Has been cancelled Details	2024-12-31 11:46:17 +02:00
Georgi Gerganov	ed733e85a1	scripts : update to new build system	2024-12-09 11:30:16 +02:00
Georgi Gerganov	384e214cc7	make : shim cmake	2024-12-08 20:14:35 +02:00
Georgi Gerganov	7fd8d9c220	whisper : adapt to new ggml (wip)	2024-11-20 21:00:08 +02:00
Georgi Gerganov	5089ab2d6a	whisper : fix build (#0 )	2024-11-15 15:21:04 +02:00
Georgi Gerganov	1d5752fa42	make : fix GGML_VULKAN=1 build (#2485 )	2024-10-16 18:42:47 +03:00
Georgi Gerganov	396089f3cf	whisper : revert mel-related changes (#0 ) too much extra logic and complexity for small benefit	2024-10-05 15:23:51 +03:00
Georgi Gerganov	941912467d	whisper : adapt to latest ggml (skip) (#0 )	2024-10-05 15:23:51 +03:00
Georgi Gerganov	2ef717b293	whisper : add large-v3-turbo (#2440 )	2024-10-01 15:57:06 +03:00
Georgi Gerganov	8feb375fbd	tests : remove test-backend-ops (#2434 )	2024-09-27 11:49:01 +03:00
Georgi Gerganov	451e9ee92c	make : remove "talk" target until updated	2024-09-24 19:45:08 +03:00
Georgi Gerganov	fe18c29ab8	talk-llama : sync llama.cpp	2024-09-24 19:45:08 +03:00
hsinhoyeh	c4e1861d2c	go : add beamsize/entropythold/maxcontext to context interface (#2350 ) * feat(go binding): add beamsize/entropythold/maxcontext to context interface fixes: #2349 * fix go building build * fix dynamic link .so and header.h * remove LD_LIBRARY_PATH * remove ggml obj from whisper dynamic lib * drop LIB_GGML	2024-08-28 17:09:01 +03:00
Georgi Gerganov	22058f2dbc	talk-llama : sync llama.cpp	2024-08-08 22:48:46 +03:00
Georgi Gerganov	5226c3d45c	make : remove llama prints [no ci] (#2265 )	2024-07-08 14:53:55 +03:00
Georgi Gerganov	bb3dd45524	make : disable CUDA graphs	2024-06-26 23:20:13 +03:00
Georgi Gerganov	9f7f36d4c9	make : disable CUDA mel build	2024-06-26 22:25:25 +03:00
Georgi Gerganov	0a55a70b9b	make : fix missing -O3 same as https://github.com/ggerganov/llama.cpp/pull/8143	2024-06-26 21:21:12 +03:00
Georgi Gerganov	e30c679928	whisper : reorganize source code + improve CMake (#2256 ) * scripts : update sync [no ci] * files : reorganize [no ci] * sync : llama.cpp * cmake : link math library * cmake : build normal ggml library * files : move headers to include * objc : fix path to ggml-metal.h * ci : fix WHISPER_CUDA -> GGML_CUDA * scripts : sync LICENSE [no ci]	2024-06-26 19:34:09 +03:00
Georgi Gerganov	5181494e9f	build : update make / cmake	2024-06-18 09:39:40 +03:00
Georgi Gerganov	3b1ac03828	ggml : remove OpenCL (#0 )	2024-06-16 18:19:48 +03:00
Georgi Gerganov	6975600b4b	cuda : enable CUDA graphs (#0 )	2024-06-16 18:19:48 +03:00
Georgi Gerganov	4942b1b428	cmake : fix CUDA build (#0 )	2024-06-16 18:19:48 +03:00
Georgi Gerganov	420b6abc54	cuda : fix HIPBLAS build (#2234 )	2024-06-11 19:14:38 +03:00
Borislav Stanimirov	ffef323c4c	whisper : add CUDA-specific computation mel spectrograms (#2206 ) * whisper : use polymorphic class to calculate mel spectrogram * whisper : add cuda-specific mel spectrogram calculation * whisper : conditionally compile cufftGetErrorString to avoid warnings * build : add new files to makefile * ruby : add new files to conf script * build : fix typo in makefile * whisper : suppress cub warning for deprecated C++ std in whisper-mel-cuda	2024-06-04 09:32:23 +03:00
Przemysław Pawełczyk	b6680fab50	build : improve disabling AVX-512 (#2129 ) * cmake : make WHISPER_NO_AVX512=ON disable all subsets of AVX-512 Previously it happened only for MSVC, but it makes sense to have the same behavior for other compilers too. * make : reorder x86 ISA extensions in chronological order And update compiler flags at the end to ease modifying conditions. * make : support WHISPER_NO_AVX512=1 for disabling all AVX-512 subsets. That way you do not have to override each AVX-512 subset setting individually if it has been turned on during autodetection.	2024-05-08 18:32:43 +03:00
Przemysław Pawełczyk	8fac6455ff	make : change GNU make default CXX from g++ to c++ (#2100 )	2024-04-28 22:54:21 +01:00
Didzis Gosko	08d3eef97d	build : fix embedded Metal library generation (#2045 )	2024-04-15 20:23:05 +03:00
Didzis Gosko	c7f95b7ca2	build : detect AVX512 in Makefile, add AVX512 option in CMake (#2043 ) * make : add AVX512 detection to Makefile and CMakeLists.txt * make : autodetect more AVX512 instruction subsets * cmake : do not default to AVX512, must be enabled explicitly * cmake : enable a set of AVX512 subsets, when AVX512 is turned on * make : consolidate AVX512 subsets, add AVX512 VBMI * cmake : revert to NO AVX512 setting, add settings for AVX512 VNNI and VBMI * make : re-introduce AVX512VNNI back * cmake : remove superfluous comment line	2024-04-15 20:02:09 +03:00
Przemysław Pawełczyk	1e8f28c42a	build : use pkg-config for OpenBLAS (#1778 ) * make : use pkg-config for finding CFLAGS & LDFLAGS needed by OpenBLAS That way building on nix like environments (including MSYS2 on Windows) with WHISPER_OPENBLAS=1 works out of the box. Fix handling of WHISPER_OPENBLAS, so that empty value or 0 won't be misinterpreted by make as enabled. Mind that it's not intended to detect CMake false constants (OFF NO FALSE N). make is not CMake. By default OpenBLAS with 64-bit interface is used, but that can be changed with `WHISPER_OPENBLAS_INTERFACE64=0` if 32-bit one is desired. If OpenBLAS headers and library are respectively in include/ and lib/ subdirectories of given path, then you can specify it, e.g. `OPENBLAS_PATH=/usr/local/openblas`, and this will take precedence over any pkg-config file. If there is no pkg-config file (.pc) for OpenBLAS and OPENBLAS_PATH is empty, then headers are assumed to be in /usr/include/openblas and library as assumed to be called 'openblas64' (or 'openblas' if `WHISPER_OPENBLAS_INTERFACE64=0`). If different headers location should be used, then it can be done, e.g. `WHISPER_BLAS_CFLAGS=-I/usr/local/include/openblas`. If different library should be used, it can be specified, e.g. `WHISPER_BLAS_LIB=openblasp64` (pthreads version as seen on Fedora), or you can provide LDFLAGS needed to link with OpenBLAS directly: `WHISPER_BLAS_LDFLAGS="-L/usr/local/lib/openblas -lopenblas64"`. Current solution is flexible enough to handle most cases out there without needlessly hardcoding possible OpenBLAS installation details. cmake : fix how pkg-config is used for finding include dirs and libraries needed by OpenBLAS That way building on nix like environments (including MSYS2 on Windows) with -DWHISPER_OPENBLAS=ON should work out of the box as long as you have CMake 3.25 or newer. Make OPENBLAS_PATH environment variable supported not only on Windows. It sets OpenBLAS include dir to ${OPENBLAS_PATH}/include and library to ${WHISPER_BLAS_LIB} (name without prefixes and suffixes) in ${OPENBLAS_PATH}/lib and avoids further package finding. By default OpenBLAS with 64-bit interface is used (equivalent to setting `-DWHISPER_BLAS_LIB=openblas64`), but that can be changed with `-DWHISPER_OPENBLAS_INTERFACE64=OFF` (equivalent to setting `-DWHISPER_BLAS_LIB=openblas`) if 32-bit one is desired. Turn on BLA_STATIC for FindBLAS only when WHISPER_STATIC is enabled. BLA_STATIC may not work as expected for pkg-config based operation. Get rid of supporting BLAS_HOME environment variable. If OPENBLAS_PATH is insufficient in your case, there is no pkg-config file to rely on, then you can manually specify include dir, e.g. `-DBLAS_INCLUDE_DIRS=/usr/local/include/openblas`, and library, e.g. `-DBLAS_LIBRARIES=/usr/local/lib/libopenblas.so`. make / cmake : use OpenBLAS with 32-bit interface by default. OpenBLAS w/o INTERFACE64=1 vel USE_64BITINT=1 seems to be more common. * cmake : hardcode "lib" prefix for OpenBLAS lib filename (even on Windows) * cmake : hardcode OpenBLAS library name when building in MSVC (Windows) Most *nix like environments (including MSYS2 on Windows) have OpenBLAS packages that allow coexistence of OpenBLAS builds with 32-bit and 64-bit interface (w/o and w/ OPENBLAS_USE64BITINT defined) and they differ by not having or having "64" suffix in their library filenames. That's not the case for OpenBLAS prebuilt libraries for Windows.	2024-03-29 15:53:26 +02:00
Georgi Gerganov	9fb308d90f	make : add grammar parser to common objects	2024-03-28 11:59:48 +02:00
Georgi Gerganov	2948c740a2	sync : ggml (#2001 ) * sync : update scripts * sync : ggml * talk-llama : sync llama.cpp * make : WHISPER_CUBLAS -> WHISPER_CUDA * ci : try to fix sycl build * talk-llama : fix make build	2024-03-27 18:55:10 +02:00
Georgi Gerganov	de4d067f1e	talk-llama : sync llama.cpp	2024-03-15 14:21:59 +02:00
LBlue	276615d708	make : fix CUBLAS link with WSL (#1878 )	2024-02-20 12:05:38 +02:00
Georgi Gerganov	65faae0b6a	build : update CBLAS flags + fix unused var warning (#0 )	2024-02-19 14:44:46 +02:00
Didzis Gosko	163e74b6c3	metal : option to embed MSL source into compiled binary (#1842 ) * ggml : embed Metal library source (ggml-metal.metal) into binary enable by setting WHISPER_EMBED_METAL_LIBRARY * rename the build option * rename the preprocessor directive * generate Metal library embedding assembly on-fly during build process	2024-02-11 16:41:41 +02:00
Didzis Gosko	0f80e5a80a	whisper : expose CUDA device setting in public API (#1840 ) * Makefile : allow to override CUDA_ARCH_FLAG * whisper : allow to select GPU (CUDA) device from public API	2024-02-09 17:27:47 +02:00
Didzis Gosko	b6559333ff	make : add macOS deployment target option (#1839 )	2024-02-09 17:26:29 +02:00
jwijffels	3e6fad07aa	make : update MSYS_NT (#1813 ) I just upgraded the R wrapper at https://github.com/bnosac/audio.whisper to use whisper.cpp 1.5.4 I'm working on Windows and noticed while doing that that it did not pick up the relevant CFLAGS/CXXFLAGS as my system showed ``` I whisper.cpp build info: I UNAME_S: MSYS_NT-10.0-19045 I UNAME_P: unknown I UNAME_M: x86_64 ``` Many thanks for all the tremendous hard work on maintaining whisper.cpp!	2024-01-30 14:13:49 +02:00
Przemysław Pawełczyk	f5f159c320	server : fix building and simplify lib deps on Windows (#1772 ) * make : fix server example building on MSYS2 environments (Windows) It was not working since commit eff3570f78742dfd56024328ed93d4f442434280 when server was introduced. * cmake : simplify server example lib deps on Windows server uses httplib::Server, not httplib::SSLServer, so there is no need to mention cryptographic libraries in target_link_libraries. Winsock (ws2_32) suffices here. Also use plain library names like we use in other places.	2024-01-15 15:48:13 +02:00
Georgi Gerganov	e77b27c331	sync : ggml (VMM, sync-ggml-am, dotprod ARM fixes, CUDA fixes) (#1691 ) * scripts : add sync-ggml-am.sh * sync : ggml (VMM, ARM dot prod fix, etc.) * build : fix CUDA build * ggml : fix some mul mat cases + add tests for src1 F16 `dbd02958fa`	2023-12-29 11:30:47 +02:00
Felix	eff3570f78	server : add a REST Whisper server example with OAI-like API (#1380 ) * Add first draft of server * Added json support and base funcs for server.cpp * Add more user input via api-request also some clean up * Add reqest params and load post function Also some general clean up * Remove unused function * Add readme * Add exception handlers * Update examples/server/server.cpp * make : add server target * Add magic curl syntax Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-11-20 21:40:24 +02:00
Georgi Gerganov	bfbaa4dce5	whisper : make large version explicit + fix data size units (#1493 )	2023-11-15 19:42:25 +02:00
Evan Jones	3e5c7feeff	whisper : add grammar-based sampling (#1229 ) * whisper : add grammar-based sampling * build : fix after master merge * command : fix exception when recognizing the command * whisper : fine-tuning grammar functionality * command : grammar-related improvements - option to read grammar from file - add sample grammars for colors and chess moves - fine-tune the performance further * grammars : add assistant + update comments * command : enable beam-search, add "no_timestamps", add "context", add p * whisper : remove comment --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-11-13 10:51:34 +02:00
Georgi Gerganov	b0502836b8	whisper : add full CUDA and Metal offloading (#1472 ) * whisper : migrate to ggml-backend * whisper : fix logit reading * whisper : fix tensor allocation during load * whisper : fix beam-search with CUDA * whisper : free backends + fix compile warning * whisper : print when CUDA is enabled * whisper : fix CoreML * make : clean-up * talk : fix compile warning * whisper : support ggml_conv with CUDA and Metal (#1473) * ggml : add CUDA support for ggml_conv * whisper : remove ggml_repeat for conv bias + single backend * cuda : fix im2col kernel * metal : add im2col support + mul mat-vec f16 x f16 * bench-all : add q4 models * whisper : clean-up * quantize-all : fix * ggml : im2col opts * whisper : avoid whisper_model_data wrapper * whisper : add note that ggml_mul_mat_pad does not work with CUDA * whisper : factor out graph compute in common function * whisper : fixes * whisper : fix UB with measure buffers * whisper : try to fix the parallel whisper_state functionality (#1479) * whisper : try to fix the parallel whisper_state functionality * whisper : fix multi-state Metal * whisper : free backend instances in whisper_state	2023-11-12 15:31:08 +02:00
Georgi Gerganov	2cdfc4e025	whisper : add support for large v3 (#1444 ) * whisper : add support for large v3 * bench : fix build + fix go bindings * bench : fix n_mels * models : update readme	2023-11-07 15:30:18 +02:00
Georgi Gerganov	f96e1c5b78	sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) (#1422 ) * sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) * metal : allow env metal variable to override resource path (#1415) * Allow env variable to override resource path * Update ggml-metal.m --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * sync : restore common / main from `master` * sync : restore whisper from `master` * talk-llama : update to latest llama.cpp * ruby : fix build * ggml : fix 32-bit ARM build * ggml : fix MIN / MAX macro collisions + update ios bindings * ggml : fix ifdefs and MIN / MAX again * exampels : fix Obj-C and Swift examples * ggml : fix 32-bit ARM compatibility * ggml : one more attempt to fix 32-bit ARM compat * whisper : fix support for larger graphs --------- Co-authored-by: Chris Raethke <codesoda@users.noreply.github.com>	2023-11-03 21:35:05 +02:00
Georgi Gerganov	b8432f28f4	metal : add F32 support + update bench output	2023-09-15 13:56:08 +03:00
Georgi Gerganov	93935980f8	whisper : Metal and ggml-alloc support (#1270 ) * metal : init * whisper : factor out graph builds * whisper : allocate encoder and decoder using ggml-alloc * whisper : ggml-alloc is now supported * whisper : CoreML support ggml-alloc * build : fix ggml-alloc * ios : update submodule * extra : update sync-ggml.sh script to also sync ggml-alloc * ci : see if this is causing the crash * whisper : refactor ggml-alloc init * whisper.android : try to fix build * whisper : initial Metal version * ci : try to debug vmem issue * metal : decoder works on GPU! * metal : add multi-decoder support * ggml : fix ggml_nbytes (probably temp solution) * metal : run "cross" step on the GPU * whisper : remove ggml_repeat in the encoder * whisper : offload the Encoder to Metal * ggml : use simpler ggml_bytes() implementation * ggml-alloc : try to make CI happy by reducing vram to 128GB * whisper : add whisper_allocr to wrap ggml_allocr * whisper : factor out alloc init in a function * cmake : update to support Metal build * whisper : add <functional> header * objc : fix build (no Metal yet) * ios : add Metal support * swiftui : fix build * metal : speed-up KQ multiplication * metal : sync latest llama.cpp kernels * readme : add Metal info * ios : update submodule * coreml : add code to toggle Core ML config (CPU, ANE, GPU) * bench : fix timings by running a pre-heat * bench : start benching the decoder * whisper : add ggml_mul_mat_pad * bench : fix uninitialized vars * whisper : add comment for disabling mul-mat padding * whisper : add description of ggml_mul_mat_pad * whisper : clean-up ggml_mul_mat_pad * metal : remove the "concurrent" flag * bench : variable n_past * ios : update SPM package	2023-09-15 12:18:18 +03:00

1 2 3

131 Commits