whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-04-04 09:59:11 +00:00

Author	SHA1	Message	Date
Daniel Bevenius	e0be0de1ee	ggml : add check for grad_accs (ggml/1046) * ggml : add check for grad_accs This commit adds a check for grad_accs in ggml_graph_get_grad and ggml_graph_get_grad_acc functions. This is necessary to avoid segfaults when grad_accs is not initialized. The motivation for this change is that I find it nice to be able to print out a computation graph using ggml_graph_print but this function segfaults when grad_accs is not initialized: ```console (gdb) p g1 $2 = (ggml_cgraph *) 0x7ffff66004b0 (gdb) p *g1 $3 = {size = 2048, n_nodes = 1, n_leafs = 2, nodes = 0x7ffff6600500, grads = 0x0, grad_accs = 0x0, leafs = 0x7ffff6604500, visited_hash_set = {size = 4099, used = 0x7ffff6610518, keys = 0x7ffff6608500}, order = GGML_CGRAPH_EVAL_ORDER_LEFT_TO_RIGHT} (gdb) p ggml_graph_print(g1) === GRAPH === n_nodes = 1 Program received signal SIGSEGV, Segmentation fault. 0x0000555555579775 in ggml_graph_get_grad (cgraph=0x7ffff66004b0,node=0x7ffff6600340) at /ggml/ggml/src/ggml.c:5990 5990 return igrad != GGML_HASHSET_FULL && ggml_bitset_get(cgraph->visited_hash_set.used, igrad) ? cgraph->grads[igrad] : NULL; ``` * squash! ggml : add check for grad_accs Fix the check in ggml_graph_get_grad. The check was incorrectly using cgraph->grad_accs instead of cgraph->grads.	2024-12-18 12:52:16 +02:00
Georgi Gerganov	60dc6d003f	common : remove old types ggml-ci	2024-12-18 12:52:16 +02:00
Johannes Gäßler	eb27e0d834	CUDA: fix shared memory access condition for mmv (llama/10740)	2024-12-18 12:52:16 +02:00
Jeff Bolz	a682fdce0c	vulkan: fix compile warnings (llama/10731)	2024-12-18 12:52:16 +02:00
stduhpf	9ffbd3d969	Vulkan: fix NaN in tanh.comp with AMD proprietary driver on Windows (llama/10723) * Vulkan: fix NaN in tanh.comp * Faster NaN-free tanh	2024-12-18 12:52:16 +02:00
Jeff Bolz	6585a890b4	vulkan: compile a test shader in cmake to check for coopmat2 support (llama/10713)	2024-12-18 12:52:16 +02:00
Georgi Gerganov	d0a050b51f	ggml : disable iq4_nl interleave size 8 (llama/10709) ggml-ci	2024-12-18 12:52:16 +02:00
Djip007	e990d1b791	ggml : refactor online repacking (llama/10446) * rename ggml-cpu-aarch64.c to .cpp * reformat extra cpu backend. - clean Q4_0_N_M and IQ4_0_N_M - remove from "file" tensor type - allow only with dynamic repack - extract cpu extra bufts and convert to C++ - hbm - "aarch64" - more generic use of extra buffer - generalise extra_supports_op - new API for "cpu-accel": - amx - aarch64 * clang-format * Clean Q4_0_N_M ref Enable restrict on C++ * add op GGML_OP_MUL_MAT_ID for Q4_0_N_M with runtime repack * added/corrected control on tensor size for Q4 repacking. * Update ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Update ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * add debug logs on repacks. --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-12-18 12:52:16 +02:00
0cc4m	4a6d52efe6	Vulkan: VK_KHR_cooperative_matrix support to speed up prompt processing (llama/10597) * Vulkan: Implement VK_KHR_cooperative_matrix support in the matrix matrix multiplication shader * Improve performance with better q4_k and q5_k dequant and store unrolling * Add Vulkan MUL_MAT and MUL_MAT_ID accumulator precision selection * Rework mulmat shader selection and compilation logic, avoid compiling shaders that won't get used by device * Vulkan: Implement accumulator switch for specific mul mat mat shaders * Vulkan: Unroll more loops for more mul mat mat performance * Vulkan: Add VK_AMD_shader_core_properties2 support to read Compute Unit count for split_k logic * Disable coopmat support on AMD proprietary driver * Remove redundant checks * Add environment variable GGML_VK_DISABLE_COOPMAT to disable VK_KHR_cooperative_matrix support * Fix rebase typo * Fix coopmat2 MUL_MAT_ID pipeline selection	2024-12-18 12:52:16 +02:00
Robert Ormandi	8b841d430a	metal : Extend how Llama.cpp locates metal resources (llama/10676) * metal : Extend how Llama.cpp locates metal resources (llama/10675) * It searches the resource file in the directory where the current binary is located as well. * Resolves symbolic links. Rationale: When we plug this dependency into a Bazel build and run it in the context of Bazel (e.g. testing): * the execution directory is often very different from where the files are located and no direct control over this (Bazel sandboxing), * the Bazel sandbox often use symbolic links to make files available. With this patch, we can have the resource file added to the target, can build and run tests in the context of Bazel. * Update ggml/src/ggml-metal/ggml-metal.m Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Update ggml/src/ggml-metal/ggml-metal.m Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-12-18 12:52:16 +02:00
Jeff Bolz	b74b68212a	vulkan: Add VK_NV_cooperative_matrix2 support for mul_mat and flash attention (llama/10206)	2024-12-18 12:52:16 +02:00
KITAITI Makoto	3a27b2b91b	ruby : Add no_speech_thold (#2641) * Remove Whisper::Model.[] * Fix Whisper::Model::URI#request * Make Whisper::Context#initialize accept pre-converted model name * Use downloading pre-converted model feature for testing * Update README * Remove unnecessary task * Move whisper/model.rb -> whisper/model/uri.rb * Update document comment of Whisper::Context#initialize * Don't show download progress when not tty * Pass String to raise * Use cache model file if download fails * Add test for auto download * Specify required Ruby version * Fix a typo * Remove unnecessary flags * Initialize Whisper::Params#diarize explicitly * Remove redundant code from README for simplicity * Add Whisper::Params#no_speech_thold attribute * Add test for Whisper::Params#no_speech_thold	2024-12-18 11:00:50 +02:00
crummyh	d34445e960	stream : improve consistency in README (#2642)	2024-12-18 08:43:48 +02:00
Karthick	f897eb7670	whisper : support no_speech_thold (#2625) * Implement no_speech_thold no_speech_thold functionality is on par with OpenAI's whisper * Addressed review comments	2024-12-17 19:15:47 +02:00
Karthick	2f2841bfce	whisper : add single-timestamp logic (#2629) * Fix hallucinations during silence When the predicted tokens end with a single timestamp, the entire 30-second segment should be considered done, to avoid hallucinations for the remaining part of the segment. This behaviour is on par with OpenAI's whisper. Refer to the logic related to `single_timestamp_ending` in https://github.com/openai/whisper/blob/main/whisper/transcribe.py * Accept review comments related to formatting. Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-12-17 19:07:08 +02:00
crummyh	09a1b61218	readme : fix typo (#2637)	2024-12-17 19:05:35 +02:00
Georgi Gerganov	94e7da1ff2	cmake : fix "amd64" processor string (#2638)	2024-12-17 18:34:32 +02:00
gn64	c4aed6831e	vulkan : fix soft_max.comp division by zero (#2633) This change prevents a division by zero error when p.KY is 0.	2024-12-16 12:34:38 +02:00
Georgi Gerganov	199579652e	common : add cstdio header	2024-12-16 08:57:04 +02:00
Georgi Gerganov	d17e7139d8	stream : update build instructions	2024-12-15 21:55:36 +02:00
Thamster	6a52eaea74	android : fix build and ci (#2624) * Adding missing CMakeLists.txt include for ggml-cpu needed by whisper.android * attempt to re-enable CI for JNI android --------- Co-authored-by: Your Name <you@example.com>	2024-12-14 17:25:53 +02:00
Michael Rienstra	6aa1d7b892	models : fix typo in download-ggml-model.sh (#2623) Introduced in #2589	2024-12-12 18:02:00 +02:00
KITAITI Makoto	262e865a70	ruby : Sync whisper.cpp and model download feature (#2617) * Use C++17 * Add test for Pathname of model * Make Whisper::Context#initialize accept Pathname * Add shorthand for pre-converted models * Update documents * Add headings to API section in README [skip ci] * Remove unused function * Don't care about no longer included file * Cosmetic fix * Use conditional get when get model files	2024-12-09 13:17:50 +02:00
Georgi Gerganov	ed733e85a1	scripts : update to new build system v1.7.3-pre	2024-12-09 11:30:16 +02:00
Georgi Gerganov	5980b1ae77	devops : add cmake	2024-12-08 23:09:26 +02:00
Georgi Gerganov	0415a66044	devops : update make commands	2024-12-08 23:07:29 +02:00
Georgi Gerganov	7d134e3737	ggml : remove old files (skip) (#0)	2024-12-08 23:04:26 +02:00
Georgi Gerganov	9df53b357e	ggml : sync remnants (skip) (#0)	2024-12-08 22:48:25 +02:00
Georgi Gerganov	b2115b4d9b	scripts : remove amx from sync	2024-12-08 22:48:14 +02:00
Georgi Gerganov	0164427dd5	ci : disable freeBSD builds [no ci]	2024-12-08 20:14:35 +02:00
Georgi Gerganov	627b11c78a	readme : update build instructions	2024-12-08 20:14:35 +02:00
Georgi Gerganov	472464453d	ci : disable CUDA and Android builds	2024-12-08 20:14:35 +02:00
Georgi Gerganov	11dddfbc9e	ci : disable Obj-C build + fixes	2024-12-08 20:14:35 +02:00
Georgi Gerganov	384e214cc7	make : shim cmake	2024-12-08 20:14:35 +02:00
Georgi Gerganov	f2c680f893	talk-llama : sync llama.cpp	2024-12-08 20:14:35 +02:00
Georgi Gerganov	fbe66da0e5	sync : ggml	2024-12-08 20:14:35 +02:00
Diego Devesa	a815940e0e	ggml : add predefined list of CPU backend variants to build (llama/10626) * ggml : add predefined list of CPU backend variants to build * update CPU dockerfiles	2024-12-08 20:14:35 +02:00
Diego Devesa	904e307bce	ggml-cpu : fix HWCAP2_I8MM value (llama/10646)	2024-12-08 20:14:35 +02:00
Jeff Bolz	491ec076b4	vulkan: Implement "fast divide" (mul+shift) for unary ops like copy (llama/10642)	2024-12-08 20:14:35 +02:00
Nicolò Scipione	966433fdf2	SYCL : Move to compile time oneMKL interface backend selection for NVIDIA backend (llama/10584) * [SYCL] Move to Compile Time backend selection on oneMKL Interface for NVIDIA backend Move to compile time selection to backend to avoid latency at run time. Add it to all mkl gemm calls and only for NVIDIA backend. Signed-off-by: nscipione <nicolo.scipione@codeplay.com> * Formatting * Address PR comments to increase readability --------- Signed-off-by: nscipione <nicolo.scipione@codeplay.com>	2024-12-08 20:14:35 +02:00
Frankie Robertson	6f1ba9d82d	Avoid using __fp16 on ARM with old nvcc (llama/10616)	2024-12-08 20:14:35 +02:00
Jeff Bolz	015ecd0001	vulkan: optimize and reenable split_k (llama/10637) Use vector loads when possible in mul_mat_split_k_reduce. Use split_k when there aren't enough workgroups to fill the shaders.	2024-12-08 20:14:35 +02:00
PAB	b7c64a4352	ggml: add `GGML_SET` Metal kernel + i32 CPU kernel (ggml/1037) * implemented cpu kernel * add i32 test cases in test-backend-ops * typedef `ggml_metal_kargs_set` * implemented `kernel_set` * memcpy	2024-12-08 20:14:35 +02:00
PAB	7895d39508	ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034) * ggml_pad_reflect_1d defined in header * implemented on CPU * called the forward pass * impl Metal kernel * added Metal kernel * added OP_PAD_REFLECT_1D in test-backend-ops.cpp * add test-pad-reflect-1d test case * test case support multiple backend	2024-12-08 20:14:35 +02:00
Georgi Gerganov	22616f00f9	files : remove make artifacts	2024-12-08 20:14:35 +02:00
Georgi Gerganov	02c6fcbc2c	common : fix compile warning ggml-ci	2024-12-08 20:14:35 +02:00
Diego Devesa	3daeacad24	ggml : move AMX to the CPU backend (llama/10570) ggml : automatic selection of best CPU backend (llama/10606)	2024-12-08 20:14:35 +02:00
Georgi Gerganov	4d73962da4	metal : small-batch mat-mul kernels (llama/10581) * metal : small-batch mat-mul kernels ggml-ci * metal : add rest of types ggml-ci * metal : final adjustments ggml-ci * metal : add comments ggml-ci	2024-12-08 20:14:35 +02:00
Akarshan Biswas	068812650e	SYCL: Fix and switch to GGML_LOG system instead of fprintf (llama/10579) * Switched to GGML_LOG * Fix missing semicolon	2024-12-08 20:14:35 +02:00
Adrien Gallouët	4b7e059e15	ggml-cpu: replace AArch64 NEON assembly with intrinsics in ggml_gemv_q4_0_4x4_q8_0() (llama/10567) Signed-off-by: Adrien Gallouët <angt@huggingface.co>	2024-12-08 20:14:35 +02:00
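
The "fast divide" (mul+shift) entry in the log above (commit 491ec076b4) refers to replacing integer division by a divisor known ahead of time with a multiply and a shift. Below is a minimal sketch of that general technique in Python. It is not the shader code from the commit: the function names `init_fastdiv` and `fastdiv` are hypothetical, operands are assumed to be 32-bit unsigned values, and the sketch relies on Python's unbounded integers so the intermediate sum cannot overflow.

```python
def init_fastdiv(d: int) -> tuple:
    """Precompute (mp, shift) for dividing 32-bit unsigned values by d.

    Hypothetical helper for illustration only.
    """
    if not 1 <= d < 2**32:
        raise ValueError("divisor must be in [1, 2**32)")
    # shift = ceil(log2(d)); this is 0 when d == 1.
    shift = (d - 1).bit_length()
    # Magic multiplier: floor(2^32 * (2^shift - d) / d) + 1.
    mp = ((1 << 32) * ((1 << shift) - d)) // d + 1
    return mp, shift


def fastdiv(n: int, mp: int, shift: int) -> int:
    """Return n // d using one multiply, one add, and one shift."""
    # High 32 bits of the 64-bit product n * mp.
    high = (n * mp) >> 32
    # The sum may need 33 bits; Python integers handle that transparently.
    return (high + n) >> shift
```

For example, `mp, shift = init_fastdiv(3)` followed by `fastdiv(7, mp, shift)` yields 2. The precomputation is done once per divisor, so the per-element cost is only the multiply, add, and shift.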

1961 Commits