whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-04-18 08:10:39 +00:00

Author	SHA1	Message	Date
Fujimoto Seiji	e6234cd435	whisper : fix "bench-all outputs an invalid result on larger models" (#3002 ) The benchmark script 'scripts/bench-all.sh' assumes that the 11th field of the output line is a timestamp. This assumption does not hold when the target model takes a bit longer to process. Fix this issue by introducing an explicit whitespace to the output lines of `whisper_print_timings()`. Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net>	2025-04-04 18:36:19 +03:00
Georgi Gerganov	2b6d0d2200	rename : ggerganov -> ggml-org (#3005 )	2025-04-04 16:11:52 +03:00
Daniel Bevenius	0b17d4507e	examples : update server.py to match github pages app [no ci] (#3004 ) This commit updates examples/server.py which is used to serve the wasm examples locally. The changes include: - Added a redirect from the root URL to /whisper.cpp. So now accessing http://localhost:8000/ will redirect to http://localhost:8000/whisper.cpp/ which matches the url for the app deployed to github pages. - Custom handling for coi-serviceworker.js to serve it to avoid and error in the console. This file is not strictly necessary for the local server to work as the headers are provided already but it is nice to not have an error in the console. - Fixed the shutdown of the server to ensure it exits cleanly on Ctrl+C. Previously it would continue to hang onto the port even after the processed had exited.	2025-04-04 10:23:53 +02:00
Daniel Bevenius	77e0c86ab6	whisper.wasm : fix unknown language issue (#3000 ) Some checks are pending CI / ubuntu-22-clang (linux/ppc64le, Debug) (push) Waiting to run Details CI / ubuntu-22-clang (linux/ppc64le, Release) (push) Waiting to run Details CI / ubuntu-22-gcc-sanitized (linux/amd64, ADDRESS) (push) Waiting to run Details CI / ubuntu-22-gcc-sanitized (linux/amd64, THREAD) (push) Waiting to run Details CI / ubuntu-22-gcc-sanitized (linux/amd64, UNDEFINED) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/amd64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/arm/v7, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/arm64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/ppc64le, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/amd64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm/v7, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/ppc64le, icx, icpx, ON) (push) Waiting to run Details CI / windows-msys2 (Release, clang-x86_64, CLANG64) (push) Waiting to run Details CI / windows-msys2 (Release, ucrt-x86_64, UCRT64) (push) Waiting to run Details CI / windows (Win32, Release, win32-x86, x86, 2.28.5, ON) (push) Waiting to run Details CI / windows (x64, Release, win32-x86-64, x64, 2.28.5, ON) (push) Waiting to run Details CI / windows-blas (Win32, ON, Release, x86, 2.28.5, ON) (push) Waiting to run Details CI / windows-blas (x64, ON, Release, x64, 2.28.5, ON) (push) Waiting to run Details CI / windows-cublas (x64, Release, ON, 11.8.0, ON, 2.28.5) (push) Waiting to run Details CI / windows-cublas (x64, Release, ON, 12.2.0, ON, 2.28.5) (push) Waiting to run Details CI / emscripten (Release) (push) Waiting to run Details CI / ios-xcode-build (Release) (push) Blocked by required conditions Details CI / android (push) Waiting to run Details CI / android_java (push) Waiting to run Details CI / quantize (push) Waiting to run Details CI / release (push) Blocked by required conditions Details CI / coreml-base-en (push) Blocked by required conditions Details Publish Docker image / Push Docker image to Docker Hub (map[dockerfile:.devops/main.Dockerfile platform:linux/amd64 tag:main]) (push) Waiting to run Details Examples WASM / deploy-wasm-github-pages (push) Waiting to run Details * whisper.wasm : fix unknown language issue This commit addresses an issue with whisper.wasm where the following error was being displayed when running the application in github pages: ``` whisper_lang_id: unknown language 'д=␙c' ``` This turned out to be a memory corruption issue and further details can be found in the reference issue below. Refs: https://github.com/ggerganov/whisper.cpp/issues/2998	2025-04-03 19:50:47 +02:00
Georgi Gerganov	eac1bc9c47	examples : add new sources Some checks failed CI / ubuntu-22-clang (linux/ppc64le, Release) (push) Waiting to run Details CI / ubuntu-22-gcc-sanitized (linux/amd64, ADDRESS) (push) Waiting to run Details CI / ubuntu-22-gcc-sanitized (linux/amd64, THREAD) (push) Waiting to run Details CI / ubuntu-22-gcc-sanitized (linux/amd64, UNDEFINED) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/amd64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/arm/v7, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/arm64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/ppc64le, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/amd64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm/v7, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/ppc64le, icx, icpx, ON) (push) Waiting to run Details CI / windows-msys2 (Release, clang-x86_64, CLANG64) (push) Waiting to run Details CI / windows-msys2 (Release, ucrt-x86_64, UCRT64) (push) Waiting to run Details CI / windows (Win32, Release, win32-x86, x86, 2.28.5, ON) (push) Waiting to run Details CI / windows (x64, Release, win32-x86-64, x64, 2.28.5, ON) (push) Waiting to run Details CI / windows-blas (Win32, ON, Release, x86, 2.28.5, ON) (push) Waiting to run Details CI / windows-blas (x64, ON, Release, x64, 2.28.5, ON) (push) Waiting to run Details CI / windows-cublas (x64, Release, ON, 11.8.0, ON, 2.28.5) (push) Waiting to run Details CI / windows-cublas (x64, Release, ON, 12.2.0, ON, 2.28.5) (push) Waiting to run Details CI / emscripten (Release) (push) Waiting to run Details CI / ios-xcode-build (Release) (push) Blocked by required conditions Details CI / android (push) Waiting to run Details CI / android_java (push) Waiting to run Details CI / quantize (push) Waiting to run Details CI / release (push) Blocked by required conditions Details CI / coreml-base-en (push) Blocked by required conditions Details Publish Docker image / Push Docker image to Docker Hub (map[dockerfile:.devops/main.Dockerfile platform:linux/amd64 tag:main]) (push) Waiting to run Details Examples WASM / deploy-wasm-github-pages (push) Waiting to run Details Bindings Tests (Ruby) / ubuntu-22 (push) Has been cancelled Details ggml-ci	2025-04-03 10:30:16 +03:00
Georgi Gerganov	cbde66d913	sync : ggml	2025-04-03 10:30:16 +03:00
cmdr2	513ecf8dc0	cpu: move all the operators into a separate c++ file (except mul_mat) (ggml/1167) * cpu: refactor SIMD mappings and vectorized op functions into separate files * Fix warning for ggml_float to float * Fix warnings * cpu: move all the operations (except mul_mat) to a separate c++ file * fix whitespace * Update ggml/src/ggml-cpu/vec.h Co-authored-by: Diego Devesa <slarengh@gmail.com> * Fix PR comments - use GGML_UNUSED, use cassert in ops.cpp * Reverse the order of import for ops.h and vec.h, to match what was present in ggml-cpu.c previously --------- Co-authored-by: Diego Devesa <slarengh@gmail.com>	2025-04-03 10:30:16 +03:00
Daniel Bevenius	cce5daf17b	docs : add xcframework section to README.md [no ci] (#2997 ) This adds a section to the README.md file that describes how to use the XCFramework. The modification for this is that is not obvious how to use the XCFramework and and example will help. One thing to note is that the example is using the latest release including the checksum. We are thinking about how we might automate this in the future but for now this is a good start.	2025-04-03 09:06:53 +02:00
Georgi Gerganov	2c502b3c00	readme : update roadmap link Some checks are pending CI / ubuntu-22-clang (linux/ppc64le, Debug) (push) Waiting to run Details CI / ubuntu-22-clang (linux/ppc64le, Release) (push) Waiting to run Details CI / ubuntu-22-gcc-sanitized (linux/amd64, ADDRESS) (push) Waiting to run Details CI / ubuntu-22-gcc-sanitized (linux/amd64, THREAD) (push) Waiting to run Details CI / ubuntu-22-gcc-sanitized (linux/amd64, UNDEFINED) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/amd64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/arm/v7, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/arm64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/ppc64le, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/amd64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm/v7, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/ppc64le, icx, icpx, ON) (push) Waiting to run Details CI / windows-msys2 (Release, clang-x86_64, CLANG64) (push) Waiting to run Details CI / windows-msys2 (Release, ucrt-x86_64, UCRT64) (push) Waiting to run Details CI / windows (Win32, Release, win32-x86, x86, 2.28.5, ON) (push) Waiting to run Details CI / windows (x64, Release, win32-x86-64, x64, 2.28.5, ON) (push) Waiting to run Details CI / windows-blas (Win32, ON, Release, x86, 2.28.5, ON) (push) Waiting to run Details CI / windows-blas (x64, ON, Release, x64, 2.28.5, ON) (push) Waiting to run Details CI / windows-cublas (x64, Release, ON, 11.8.0, ON, 2.28.5) (push) Waiting to run Details CI / windows-cublas (x64, Release, ON, 12.2.0, ON, 2.28.5) (push) Waiting to run Details CI / emscripten (Release) (push) Waiting to run Details CI / ios-xcode-build (Release) (push) Blocked by required conditions Details CI / android (push) Waiting to run Details CI / android_java (push) Waiting to run Details CI / quantize (push) Waiting to run Details CI / release (push) Blocked by required conditions Details CI / coreml-base-en (push) Blocked by required conditions Details Publish Docker image / Push Docker image to Docker Hub (map[dockerfile:.devops/main.Dockerfile platform:linux/amd64 tag:main]) (push) Waiting to run Details Examples WASM / deploy-wasm-github-pages (push) Waiting to run Details	2025-04-02 17:38:35 +03:00
Georgi Gerganov	51c6961c7b	release : v1.7.5 v1.7.5	2025-04-02 16:39:48 +03:00
Georgi Gerganov	503a786c9a	bench : update numbers [no ci] (#2993 )	2025-04-02 16:27:36 +03:00
Georgi Gerganov	ad4e350933	sync : ggml ggml-ci	2025-04-02 15:51:57 +03:00
Chenguang Li	d7a9346ab1	get_rows and dup optimization (llama/12671) * [CANN]get_rows and dup optimization. Co-authored-by: hipudding <huafengchun@gmail.com> Signed-off-by: noemotiovon <noemotiovon@gmail.com> * [CANN]GET_ROWS and CPY/DUP optimization Co-authored-by: hipudding <huafengchun@gmail.com> Signed-off-by: noemotiovon <noemotiovon@gmail.com> * [CANN]code style adjustment Signed-off-by: noemotiovon <noemotiovon@gmail.com> * [CANN]code style adjustment Signed-off-by: noemotiovon <noemotiovon@gmail.com> * [CANN]code style adjustment Signed-off-by: noemotiovon <noemotiovon@gmail.com> * [CANN]code style adjustment Signed-off-by: noemotiovon <noemotiovon@gmail.com> --------- Signed-off-by: noemotiovon <noemotiovon@gmail.com> Co-authored-by: noemotiovon <noemotiovon@gmail.com> Co-authored-by: hipudding <huafengchun@gmail.com>	2025-04-02 15:51:57 +03:00
Junil Kim	b63d23f728	opencl : fix memory allocation size (llama/12649) issue: https://github.com/CodeLinaro/llama.cpp/pull/17#issuecomment-2760611283 This patch fixes the memory allocation size not exceeding the maximum size of the OpenCL device.	2025-04-02 15:51:57 +03:00
Georgi Gerganov	f6ce10e4a1	metal : use F32 prec in FA kernels (llama/12688) * metal : use F32 prec in FA kernels ggml-ci * cont : fix FA vec kernel ggml-ci	2025-04-02 15:51:57 +03:00
R0CKSTAR	6cb2b86581	Fix clang warning in gguf_check_reserved_keys (llama/12686) * Fix clang warning in gguf_check_reserved_keys Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * Fix typo Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> --------- Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2025-04-02 15:51:57 +03:00
Wagner Bruna	801d6bd809	vulkan: fix build when glslc doesn't support coopmat (llama/12683)	2025-04-02 15:51:57 +03:00
Romain Biessy	ddf7e6a15d	SYCL: Rename oneMKL to oneMath (llama/12192) * Rename oneMKL Interface to oneMath * Use oneMath for Intel vendor * Rename occurences to mkl * clang-format * Silence verbose warnings * Set oneMath HIP_TARGETS * Fix silence warnings * Remove step to build oneMath from build instructions * Use fixed oneMath version * Remove INTEL_CPU * Fold CMake oneDNN conditions * Use Intel oneMKL for Intel devices * Improve CMake message * Link against MKL::MKL_SYCL::BLAS only * Move oneMath documentation to Nvidia and AMD sections	2025-04-02 15:51:57 +03:00
Akarshan Biswas	0d42097fd3	SYCL: switch to SYCL namespace (llama/12674)	2025-04-02 15:51:57 +03:00
a3sh	842b9c984c	ggml : faster ssm scan (llama/10558) * faster ssm_scan * delete unused commnet * clang format * add space * modify unnecessary calculations * faster ssm conv implementatioin * modify file name with dash	2025-04-02 15:51:57 +03:00
0cc4m	0810f02547	Vulkan: Add DP4A MMQ and Q8_1 quantization shader (llama/12135) * Vulkan: Add DP4A MMQ and Q8_1 quantization shader * Add q4_0 x q8_1 matrix matrix multiplication support * Vulkan: Add int8 coopmat MMQ support * Vulkan: Add q4_1, q5_0 and q5_1 quants, improve integer dot code * Add GL_EXT_integer_dot_product check * Remove ggml changes, fix mmq pipeline picker * Remove ggml changes, restore Intel coopmat behaviour * Fix glsl compile attempt when integer vec dot is not supported * Remove redundant code, use non-saturating integer dot, enable all matmul sizes for mmq * Remove redundant comment * Fix integer dot check * Fix compile issue with unsupported int dot glslc * Update Windows build Vulkan SDK version	2025-04-02 15:51:57 +03:00
Georgi Gerganov	8c13c78f9d	cmake : fix whitespace (llama/0)	2025-04-02 15:51:57 +03:00
Daniel Bevenius	f31b404fcb	tests : remove gh label test-whisper-cli-tiny-en (#2988 ) Some checks are pending CI / ubuntu-22-clang (linux/ppc64le, Debug) (push) Waiting to run Details CI / ubuntu-22-clang (linux/ppc64le, Release) (push) Waiting to run Details CI / ubuntu-22-gcc-sanitized (linux/amd64, ADDRESS) (push) Waiting to run Details CI / ubuntu-22-gcc-sanitized (linux/amd64, THREAD) (push) Waiting to run Details CI / ubuntu-22-gcc-sanitized (linux/amd64, UNDEFINED) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/amd64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/arm/v7, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/arm64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/ppc64le, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/amd64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm/v7, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/ppc64le, icx, icpx, ON) (push) Waiting to run Details CI / windows-msys2 (Release, clang-x86_64, CLANG64) (push) Waiting to run Details CI / windows-msys2 (Release, ucrt-x86_64, UCRT64) (push) Waiting to run Details CI / windows (Win32, Release, win32-x86, x86, 2.28.5, ON) (push) Waiting to run Details CI / windows (x64, Release, win32-x86-64, x64, 2.28.5, ON) (push) Waiting to run Details CI / windows-blas (Win32, ON, Release, x86, 2.28.5, ON) (push) Waiting to run Details CI / windows-blas (x64, ON, Release, x64, 2.28.5, ON) (push) Waiting to run Details CI / windows-cublas (x64, Release, ON, 11.8.0, ON, 2.28.5) (push) Waiting to run Details CI / windows-cublas (x64, Release, ON, 12.2.0, ON, 2.28.5) (push) Waiting to run Details CI / emscripten (Release) (push) Waiting to run Details CI / ios-xcode-build (Release) (push) Blocked by required conditions Details CI / android (push) Waiting to run Details CI / android_java (push) Waiting to run Details CI / quantize (push) Waiting to run Details CI / release (push) Blocked by required conditions Details CI / coreml-base-en (push) Blocked by required conditions Details Publish Docker image / Push Docker image to Docker Hub (map[dockerfile:.devops/main.Dockerfile platform:linux/amd64 tag:main]) (push) Waiting to run Details Examples WASM / deploy-wasm-github-pages (push) Waiting to run Details This commit removes test-whisper-cli-tiny-en from the gh label. The motivation for this change is that until recently the tests were disabled. But now that they are enabled some of the tests, specifically the ci jobs that use sanatizers (e.g. thread-sanitizer) take a long time to run as they are instrumented. Some of these jobs also have matricies which means that there are multiple jobs are created that all run these tests. The suggestion here is to limit the number of tests that are run in the ci jobs so cut down the CI build time.	2025-04-02 10:50:31 +02:00
Daniel Bevenius	854c0518bc	examples : clarify Core ML encoder model usage [no ci] (#2987 ) This commit clarifies the usage of the Core ML encoder model in the whisper.obj and whisper.swiftui examples. Refs: https://github.com/ggerganov/whisper.cpp/issues/2783	2025-04-02 08:32:14 +02:00
Daniel Bevenius	c8e3968edd	ci : remove intermediate build on push to master (#2986 ) This commit removes the builds that happen on each push to master. Refs: https://github.com/ggerganov/whisper.cpp/discussions/2983#discussioncomment-12691424	2025-04-02 08:29:28 +02:00
Daniel Bevenius	b358de2458	whisper.objc : fix typo in README.md [no ci] (#2985 ) This commit fixes a typo in the README.md file of the whisper.objc example. Resolves: https://github.com/ggerganov/whisper.cpp/issues/2984	2025-04-02 08:26:57 +02:00
Daniel Bevenius	11688b262f	coreml: fix Whisper to CoreML conversion by disabling SDPA [no ci] (#2979 ) * coreml: fix Whisper to CoreML conversion by disabling SDPA This commit disables the use of PyTorch's `scaled_dot_product_attention` in the Whisper model to avoid compatibility issues during CoreML conversion. The issue occurs because coremltools requires PyTorch 2.5.0, but the Whisper implementation may expect behavior from newer PyTorch versions. By setting `MultiHeadAttention.use_sdpa = False`, we force Whisper to use its fallback manual attention implementation, which works correctly with PyTorch 2.5.0 during the tracing process. Refs: https://github.com/ggerganov/whisper.cpp/issues/2783 * coreml: fix audio shape in whisper decoder conversion This commit fixes the audio shape in the whisper decoder conversion script. The motivation for this is that the audio shape was incorrect and was causing the conversion to fail. * coreml : set -e in generate-coreml-interface.sh The commit sets the -e flag in the generate-coreml-interface.sh script to make sure the script fails if any command fails. * coreml : update generated encoder/decoder interfaces This commit updates the generated encoder/decoder interfaces for the whisper model which is the result of running the generate-coreml-interface.sh script.	2025-04-01 18:01:23 +02:00
Daniel Bevenius	04b9508fb3	ci : add coreml job that converts base.en to coreml [no ci] (#2981 ) * ci : add coreml job that converts base.en to coreml [no ci] This commit adds a new job to the CI pipeline that downloads the base.en model and converts it to CoreML format. The CoreML model is then packed into a zip file and uploaded as an artifact. This will only be done for pushes to master, releases, or pre-releases. Refs: https://github.com/ggerganov/whisper.cpp/issues/2783 * coreml : remove publishing of coreml model * ci : add GGML_OPENMP=OFF to ubuntu-22-gcc-sanitized	2025-04-01 17:04:32 +02:00
Daniel Bevenius	4200430e75	tests : re-enable tests [no ci] (#2977 ) This commit re-enables the tests in the build process which are currently commented out. It is possible to build the tests using `-DWHISPER_BUILD_TESTS=ON` and then run a single test using: ```console $ ctest -R test-whisper-cli-tiny.en --test-dir build Internal ctest changing into directory: /home/danbev/work/ai/whisper-work/build Test project /home/danbev/work/ai/whisper-work/build Start 2: test-whisper-cli-tiny.en 1/1 Test #2: test-whisper-cli-tiny.en ......... Passed 4.44 sec 100% tests passed, 0 tests failed out of 1 Label Time Summary: en = 4.44 secproc (1 test) gh = 4.44 secproc (1 test) tiny = 4.44 sec*proc (1 test) Total Test time (real) = 4.44 sec ``` Some of the tests take a long time to run so it might not be a good idea to enable them in CI, or perhaps we could only run a subset of the tests in CI.	2025-03-31 17:04:37 +02:00
Daniel Bevenius	e153b8eaa2	android.java : re-add ggml source updates (#2975 ) This commit updates the ggml source to include the new unary and binary operations. I merged https://github.com/ggerganov/whisper.cpp/pull/2958 which seems to have overwritten the changes to the ggml source which were added in https://github.com/ggerganov/whisper.cpp/pull/2972. Sorry about this. b2365	2025-03-31 16:14:33 +02:00
Daniel Bevenius	83af237f0b	ci : re-enable freeBDS-latest job (#2973 ) This commit re-enables the freeBSD-latest job which has been commented out. Refs: https://github.com/ggerganov/whisper.cpp/issues/2781	2025-03-31 15:24:08 +02:00
Daniel Bevenius	7a2e39750a	ci : re-enable android_java job (#2958 ) This commit re-enables the android_java job in the CI workflow. The job was disabled because of a failing build. The motivation for this is that Commit 226d344f565ea6140e7c6a583bc300a64454af58 ("whisper.android.java : update build with ggml source changes") addressed build issues and it should now be possible to re-enable this job.	2025-03-31 15:14:24 +02:00
Georgi Gerganov	0a40ae9728	android : add new ggml source files ggml-ci	2025-03-31 14:56:53 +03:00
Georgi Gerganov	32cfdcbf42	ruby : add new ggml sources ggml-ci	2025-03-31 14:56:53 +03:00
Georgi Gerganov	cfa42aca09	sync : ggml ggml-ci	2025-03-31 14:56:53 +03:00
Akarshan Biswas	2e2f0f954b	SYCL: Remove misleading ggml_sycl_op_flatten function (llama/12387) * SYCL: Remove misleading ggml_sycl_op_flatten function * remove trailing whitespace * Fix L2 norm from rebase * remove try catch block from element_wise.cpp * remove comment from common.hp * ggml-sycl.cpp: Add try catch sycl::exception block in compute_forward * norm.cpp: remove try catch exception block	2025-03-31 14:56:53 +03:00
Georgi Gerganov	93631b2be6	metal : use constexpr in FA kernels + fix typedef (llama/12659) * metal : use constexpr in FA kernels ggml-ci * cont ggml-ci * cont : fix typedef ggml-ci	2025-03-31 14:56:53 +03:00
R0CKSTAR	f9015b585b	musa: fix all warnings, re-enable `-DLLAMA_FATAL_WARNINGS=ON` in ci and update doc (llama/12611) * musa: fix all warnings Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * musa: enable -DLLAMA_FATAL_WARNINGS=ON in run.sh Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * musa: update ci doc (install ccache) Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * fix Windows build issue Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * Address review comments Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * Address review comments Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> --------- Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2025-03-31 14:56:53 +03:00
Jay	1880ffd7ff	cmake : fix ccache conflict (llama/12522) If users already set CMAKE_C_COMPILER_LAUNCHER globally, setting it in cmake again will lead to conflict and compile fail. Signed-off-by: Jay <BusyJay@users.noreply.github.com>	2025-03-31 14:56:53 +03:00
Xuan-Son Nguyen	9173932c78	cpu : rm unused variable (ggml/1166)	2025-03-31 14:56:53 +03:00
cmdr2	94c3f3877f	cpu: de-duplicate some of the operators and refactor (ggml/1144) * cpu: de-duplicate some of the operators and refactor * Fix PR comments * Fix PR comments	2025-03-31 14:56:53 +03:00
Sandro Hanea	00086469fb	cmake: improve Vulkan cooperative matrix support checks (#2966 ) Co-authored-by: Sandro Hanea <me@sandro.rocks>	2025-03-31 13:44:36 +03:00
Daniel Bevenius	2d8e40e2a0	examples : update README links to point to pages deployment (#2971 ) This commit updates the README links to point to the pages deployment instead of whisper.ggerganov.com.	2025-03-31 12:32:27 +02:00
Daniel Bevenius	e17af6524f	ci : add github pages workflow for wasm examples (#2969 ) * ci : add github pages workflow for wasm examples This commit adds a github workflow to build and deploy the wasm examples to github pages. The whisper.wasm example is deployed as the main page. This workflow is trigged by a push to master and will deploy the examples to: https://ggerganov.github.io/whisper.cpp/. This requires that the repository has enabled github actions in `Settings` -> `Pages` -> `Build and deployment` -> `Source` be set to `GitHub Actions`. One thing to note is that this commit removes the `talk` example as I'm not sure how this example is built yet. Refs: https://github.com/ggerganov/whisper.cpp/issues/2784	2025-03-31 11:34:40 +02:00
Sacha Arbonel	88d13a17a7	feat: add health check endpoint to server (#2968 ) Some checks failed CI / ubuntu-22-clang (linux/amd64, Release) (push) Has been cancelled Details CI / ubuntu-22-clang (linux/arm64, Debug) (push) Has been cancelled Details CI / ubuntu-22-clang (linux/arm64, Release) (push) Has been cancelled Details CI / ubuntu-22-clang (linux/ppc64le, Debug) (push) Has been cancelled Details CI / ubuntu-22-clang (linux/ppc64le, Release) (push) Has been cancelled Details CI / ubuntu-22-gcc-sanitized (linux/amd64, ADDRESS) (push) Has been cancelled Details CI / ubuntu-22-gcc-sanitized (linux/amd64, THREAD) (push) Has been cancelled Details CI / ubuntu-22-gcc-sanitized (linux/amd64, UNDEFINED) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl (linux/amd64, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl (linux/arm/v7, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl (linux/arm64, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl (linux/ppc64le, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl-fp16 (linux/amd64, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm/v7, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm64, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl-fp16 (linux/ppc64le, icx, icpx, ON) (push) Has been cancelled Details CI / windows-msys2 (Release, clang-x86_64, CLANG64) (push) Has been cancelled Details CI / windows-msys2 (Release, ucrt-x86_64, UCRT64) (push) Has been cancelled Details CI / windows (Win32, Release, win32-x86, x86, 2.28.5, ON) (push) Has been cancelled Details CI / windows (x64, Release, win32-x86-64, x64, 2.28.5, ON) (push) Has been cancelled Details CI / windows-blas (Win32, ON, Release, x86, 2.28.5, ON) (push) Has been cancelled Details CI / windows-blas (x64, ON, Release, x64, 2.28.5, ON) (push) Has been cancelled Details CI / windows-cublas (x64, Release, ON, 11.8.0, ON, 2.28.5) (push) Has been cancelled Details CI / windows-cublas (x64, Release, ON, 12.2.0, ON, 2.28.5) (push) Has been cancelled Details CI / emscripten (Release) (push) Has been cancelled Details CI / android (push) Has been cancelled Details CI / quantize (push) Has been cancelled Details Publish Docker image / Push Docker image to Docker Hub (map[dockerfile:.devops/main.Dockerfile platform:linux/amd64 tag:main]) (push) Has been cancelled Details CI / ios-xcode-build (Release) (push) Has been cancelled Details CI / release (push) Has been cancelled Details	2025-03-31 11:03:41 +03:00
Daniel Bevenius	f92bd59951	whisper : remove unnecessary GGML_UNUSED macro (#2960 ) Some checks failed CI / ubuntu-22-clang (linux/arm64, Debug) (push) Waiting to run Details CI / ubuntu-22-clang (linux/arm64, Release) (push) Waiting to run Details CI / ubuntu-22-clang (linux/ppc64le, Debug) (push) Waiting to run Details CI / ubuntu-22-clang (linux/ppc64le, Release) (push) Waiting to run Details CI / ubuntu-22-gcc-sanitized (linux/amd64, ADDRESS) (push) Waiting to run Details CI / ubuntu-22-gcc-sanitized (linux/amd64, THREAD) (push) Waiting to run Details CI / ubuntu-22-gcc-sanitized (linux/amd64, UNDEFINED) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/amd64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/arm/v7, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/arm64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl (linux/ppc64le, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/amd64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm/v7, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm64, icx, icpx, ON) (push) Waiting to run Details CI / ubuntu-22-cmake-sycl-fp16 (linux/ppc64le, icx, icpx, ON) (push) Waiting to run Details CI / windows-msys2 (Release, clang-x86_64, CLANG64) (push) Waiting to run Details CI / windows-msys2 (Release, ucrt-x86_64, UCRT64) (push) Waiting to run Details CI / windows (Win32, Release, win32-x86, x86, 2.28.5, ON) (push) Waiting to run Details CI / windows (x64, Release, win32-x86-64, x64, 2.28.5, ON) (push) Waiting to run Details CI / windows-blas (Win32, ON, Release, x86, 2.28.5, ON) (push) Waiting to run Details CI / windows-blas (x64, ON, Release, x64, 2.28.5, ON) (push) Waiting to run Details CI / windows-cublas (x64, Release, ON, 11.8.0, ON, 2.28.5) (push) Waiting to run Details CI / windows-cublas (x64, Release, ON, 12.2.0, ON, 2.28.5) (push) Waiting to run Details CI / emscripten (Release) (push) Waiting to run Details CI / ios-xcode-build (Release) (push) Blocked by required conditions Details CI / android (push) Waiting to run Details CI / quantize (push) Waiting to run Details CI / release (push) Blocked by required conditions Details Publish Docker image / Push Docker image to Docker Hub (map[dockerfile:.devops/main.Dockerfile platform:linux/amd64 tag:main]) (push) Waiting to run Details Bindings Tests (Ruby) / ubuntu-22 (push) Has been cancelled Details	2025-03-30 05:56:10 +02:00
Georgi Gerganov	6e7629b146	sync : ggml Some checks failed CI / ubuntu-22-clang (linux/amd64, Release) (push) Has been cancelled Details CI / ubuntu-22-clang (linux/arm64, Debug) (push) Has been cancelled Details CI / ubuntu-22-clang (linux/arm64, Release) (push) Has been cancelled Details CI / ubuntu-22-clang (linux/ppc64le, Debug) (push) Has been cancelled Details CI / ubuntu-22-clang (linux/ppc64le, Release) (push) Has been cancelled Details CI / ubuntu-22-gcc-sanitized (linux/amd64, ADDRESS) (push) Has been cancelled Details CI / ubuntu-22-gcc-sanitized (linux/amd64, THREAD) (push) Has been cancelled Details CI / ubuntu-22-gcc-sanitized (linux/amd64, UNDEFINED) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl (linux/amd64, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl (linux/arm/v7, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl (linux/arm64, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl (linux/ppc64le, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl-fp16 (linux/amd64, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm/v7, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl-fp16 (linux/arm64, icx, icpx, ON) (push) Has been cancelled Details CI / ubuntu-22-cmake-sycl-fp16 (linux/ppc64le, icx, icpx, ON) (push) Has been cancelled Details CI / windows-msys2 (Release, clang-x86_64, CLANG64) (push) Has been cancelled Details CI / windows-msys2 (Release, ucrt-x86_64, UCRT64) (push) Has been cancelled Details CI / windows (Win32, Release, win32-x86, x86, 2.28.5, ON) (push) Has been cancelled Details CI / windows (x64, Release, win32-x86-64, x64, 2.28.5, ON) (push) Has been cancelled Details CI / windows-blas (Win32, ON, Release, x86, 2.28.5, ON) (push) Has been cancelled Details CI / windows-blas (x64, ON, Release, x64, 2.28.5, ON) (push) Has been cancelled Details CI / windows-cublas (x64, Release, ON, 11.8.0, ON, 2.28.5) (push) Has been cancelled Details CI / windows-cublas (x64, Release, ON, 12.2.0, ON, 2.28.5) (push) Has been cancelled Details CI / emscripten (Release) (push) Has been cancelled Details CI / android (push) Has been cancelled Details CI / quantize (push) Has been cancelled Details Publish Docker image / Push Docker image to Docker Hub (map[dockerfile:.devops/main.Dockerfile platform:linux/amd64 tag:main]) (push) Has been cancelled Details CI / ios-xcode-build (Release) (push) Has been cancelled Details CI / release (push) Has been cancelled Details ggml-ci	2025-03-28 21:47:42 +02:00
Georgi Gerganov	27533e7f63	metal : improve FA + improve MoE (llama/12612) * ggml : FA with different K, V head sizes (CPU) ggml-ci * metal : add FA with HS=192 * metal : extend FA to support different K and V head sizes ggml-ci * metal : add FA vector kernels for heads K 192 and V 128 ggml-ci * ggml : restrict op on other backends to equal head sizes ggml-ci * metal : optimize FA-vec kernel ggml-ci * metal : FA remove mq registers * metal : improve MoE mul_mat_id condition ggml-ci * metal : fix comments + remove unnecessary addition ggml-ci * metal : avoid too much shared memory usage with mul_mat_id ggml-ci	2025-03-28 21:47:42 +02:00
Icenowy Zheng	1b81415963	vulkan: fix coopmat shader generation when cross-compiling (llama/12272) * vulkan: fix coopmat shader generation when cross-compiling Previously the status of coopmat{,2} support isn't passed to the vulkan-shaders-gen project building on the host, which leads to build failure because of the cross-compiling code expecting coopmat{,2} shaders that didn't get generated. Fix this by passing the coopmat{,2} support status to vulkan-shaders subproject. Signed-off-by: Icenowy Zheng <uwu@icenowy.me> * Only call coop-mat shaders once * Fix whitespace --------- Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Co-authored-by: bandoti <141645996+bandoti@users.noreply.github.com>	2025-03-28 21:47:42 +02:00
amritahs-ibm	0001ec075f	llamafile : ppc64le GEMV forwarding for FP32. (llama/12594) This patch enables usage of MMA when one of the dimensions of the matrix(ie either M or N) is 1. This is useful in case of token generation where N < 2. The concept of 'GEMV Forwarding' is used where when one of the matrix has a single row/column, the elements are broadcasted, instead of using packing routine to prepack the matrix elements. This change results in 5% - 15% improvement in total speed(ie all tokens/total time), across various batch sizes. This is in comparision with the corresponding dot product implementation. The patch is tested with FP32 models of Meta-Lllama-3-8B, Mistral-7B, Llama-2-7B-chat-hf on a IBM POWER10 machine. Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com>	2025-03-28 21:47:42 +02:00

1 2 3 4 5 ...

2394 Commits