whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2024-12-20 05:07:52 +00:00

Author	SHA1	Message	Date
Paul Tsochantaris	80753d4da8	metal : single allocation of encode_async block (llama/9747) * Single allocation of encode_async block with non-ARC capture in ggml-metal.m * Moving Block_release to the deallocation code * Release encode block when re-setting encoding buffer count if needed * Update ggml/src/ggml-metal.m --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-11-01 10:19:05 +02:00
Daniel Bevenius	8f9bdca4c4	ggml-alloc : remove buffer_id from leaf_alloc (ggml/987) This commit removes the buffer_id field from the leaf_alloc struct. The motivation for this is that this field is only written to and never read/used as far as I can tell. Each tensor_alloc has a buffer_id field and this is what caused me to look into this more closely, to understand what the buffer_id in leaf_alloc was used for.	2024-11-01 10:19:05 +02:00
Georgi Gerganov	4e10afb5a9	scripts : sync amx	2024-10-31 22:13:24 +02:00
Georgi Gerganov	aa037a60f3	ggml : alloc ggml_contexts on the heap (#2525) * whisper : reduce ggml_context usage * ggml : allocate contexts on the heap (v2) * ggml : aligned malloc -> malloc	2024-10-31 22:00:09 +02:00
Georgi Gerganov	19dca2bb14	ci : fix openblas build (#2511) * ci : fix openblas build * cont : would this work? * ci : I'm sorry, windows * cont : disabled wrong build * ci : fix openblas build with pkgconfiglite (#2517) - choco install pkgconfiglite (vcpkg-pkgconf doesn't contain pkg-config executable?) - vcpkg install openblas (otherwise it is not detected now) --------- Co-authored-by: Tamotsu Takahashi <ttakah+github@gmail.com>	2024-10-30 12:58:26 +02:00
Georgi Gerganov	55e422109b	scripts : add turbo-q8_0 to the benchmark	2024-10-29 19:37:24 +02:00
Georgi Gerganov	3f020fac9d	whisper : minor compile warning	2024-10-29 19:30:26 +02:00
jettoblack	1626b73b03	whisper : move new-segment callback after DTW step (#2515)	2024-10-29 08:47:21 +02:00
KITAITI Makoto	850f7b19d3	ruby : fix installation test (#2519)	2024-10-29 08:45:37 +02:00
KITAITI Makoto	d4bc413505	ruby : add more APIs (#2518) * Add test for built package existence * Add more tests for Whisper::Params * Add more Whisper::Params attributes * Add tests for callbacks * Add progress and abort callback features * [skip ci] Add prompt usage in README * Change prompt text in example	2024-10-28 19:23:23 +02:00
KITAITI Makoto	fc49ee4479	ruby : support new-segment callback (#2506) * Add Params#new_segment_callback= method * Add tests for Params#new_segment_callback= * Group tests for #transcribe * Don't use static for thread-safety * Set new_segment_callback only when necessary * Remove redundant check * [skip ci] Add Ruby version README * Revert "Group tests for #transcribe" This reverts commit `71b65b00cc`. * Revert "Add tests for Params#new_segment_callback=" This reverts commit `81e6df3bab`. * Add test for Context#full_n_segments * Add Context#full_n_segments * Add tests for lang API * Add lang API * Add tests for Context#full_lang_id API * Add Context#full_lang_id * Add abnormal test cases for lang * Raise appropriate errors from lang APIs * Add tests for Context#full_get_segment_t{0,1} API * Add Context#full_get_segment_t{0,1} * Add tests for Context#full_get_segment_speaker_turn_next API * Add Context#full_get_segment_speaker_turn_next * Add tests for Context#full_get_segment_text * Add Context#full_get_setgment_text * Add tests for Params#new_segment_callback= * Run new segment callback * Split tests to multiple files * Use container struct for new segment callback * Add tests for Params#new_segment_callback_user_data= * Add Whisper::Params#new_user_callback_user_data= * Add GC-related test for new segment callback * Protect new segment callback related structs from GC * Add meaningful test for build * Rename: new_segment_callback_user_data -> new_segment_callback_container * Add tests for Whisper::Segment * Add Whisper::Segment and Whisper::Context#each_segment * Extract c_ruby_whisper_callback_container_allocate() * Add test for Whisper::Params#on_new_segment * Add Whisper::Params#on_new_egment * Assign symbol IDs to variables * Make extsources.yaml simpler * Update README * Add document comments * Add test for calling Whisper::Params#on_new_segment multiple times * Add file dependencies to GitHub actions config and .gitignore * Add more files to ext/.gitignore	2024-10-28 15:43:27 +02:00
KITAITI Makoto	c0ea41f6b2	ruby : add Metal support (#2516)	2024-10-28 13:08:09 +02:00
Josscii	0fbaac9c89	whisper : fix index overflow in token-level timestamp logic (#2505)	2024-10-23 15:14:03 +03:00
toboil-features	a5abfe6a90	readme : update links and make commands (#2489) * Update links to headers in README.md * Add link to Vulkan section in README.md * Add "-j" for parallelism for "make" in README.md * Update README.md	2024-10-17 13:25:18 +03:00
KITAITI Makoto	d3f7137cc9	ruby : fix bindings (#2484) * Improve Rakefile * Remove intermediate files * Remove unnecessary manipulations from extconf.rb * Add README and LINCENSE to source files * Manage ext source files using YAML file * Use extsources.yaml to include files into gem package file * Add git-managed source files to build dependency * Add test task * Download model for test if not exists * Add test for build * Ignore gem package directory * Enable GitHub action for Ruby binding * Fix model name * Build lib file for test * Use extension for each platform * Use extension for each platform on testing * Move built lib file rather than copy * Add intermediate files to clean targets	2024-10-16 18:44:04 +03:00
toboil-features	f7c99e49b3	readme : add Vulkan notice (#2488) * Add Vulkan notice in README.md * Fix formatting for Vulkan section in README.md * Fix formatting in README.md	2024-10-16 18:43:26 +03:00
Georgi Gerganov	1d5752fa42	make : fix GGML_VULKAN=1 build (#2485)	2024-10-16 18:42:47 +03:00
Rotem Dan	b6049060dd	whisper : add dtw preset for large-v3-turbo (#2481)	2024-10-15 21:00:21 +03:00
CrispStrobe	06a1da9daf	convert : handle max_target_positions (#2477) as needed eg for https://huggingface.co/primeline/whisper-large-v3-turbo-german/blob/main/config.json	2024-10-14 10:46:33 +03:00
Salman Faroz	746d173592	readme : update the Quick Start section (#2475) navigating into the directory	2024-10-14 10:44:57 +03:00
Sandro Hanea	fdbfb460ed	whisper : add OpenVINO init with state (#2464) * Fixed OpenVino init on state * Removed an empty line * Fixed typo * Replaced tabs with spaces --------- Co-authored-by: Sandro Hanea <sandrohanea@users.noreply.github.com>	2024-10-08 20:08:00 +03:00
Georgi Gerganov	ebca09a3d1	release : v1.7.1	2024-10-07 13:06:48 +03:00
SRHMorris	9f346d0084	vulkan : retry allocation with fallback flags (#2451) Co-authored-by: Samuel Morris <samuel.morris@artlist.io>	2024-10-06 10:34:20 +03:00
Georgi Gerganov	6a94163b91	release : v1.7.0	2024-10-05 16:43:26 +03:00
Georgi Gerganov	8a35b58c4f	scripts : bench v3-turbo	2024-10-05 16:22:53 +03:00
Georgi Gerganov	1789abca84	whisper : remove mel leftover constants (`396089f`)	2024-10-05 16:13:03 +03:00
Georgi Gerganov	847f94fdeb	whisper : zero-out the KV cache upon clear (#2445)	2024-10-05 15:23:51 +03:00
Georgi Gerganov	6e40108a59	objc : fix build	2024-10-05 15:23:51 +03:00
Georgi Gerganov	1ba185f4af	metal : zero-init buffer contexts (#0)	2024-10-05 15:23:51 +03:00
Georgi Gerganov	396089f3cf	whisper : revert mel-related changes (#0) too much extra logic and complexity for small benefit	2024-10-05 15:23:51 +03:00
Georgi Gerganov	941912467d	whisper : adapt to latest ggml (skip) (#0)	2024-10-05 15:23:51 +03:00
Daniel Bevenius	0b1b094a67	ggml : fix typo in example usage ggml_gallocr_new (ggml/984)	2024-10-05 15:23:51 +03:00
Diego Devesa	40e52a76b9	ggml : fixes after sync (ggml/983) ggml : remove test-backend-buffer ggml : fix CUDA build warnings	2024-10-05 15:23:51 +03:00
Diego Devesa	cf977670e6	ggml-backend : add device and backend reg interfaces (llama/9707) Also: - metal : fix compute pass descriptor autorelease crash - ggml-backend : add device description to CPU backend - ggml: unify backend logging mechanism	2024-10-05 15:23:51 +03:00
Ouadie EL FAROUKI	df2c364de7	Fixed dequant precision issues in Q4_1 and Q5_1 (llama/9711)	2024-10-05 15:23:51 +03:00
Diego Devesa	1acfadb721	ggml-backend : add device and backend reg interfaces (llama/9707) Co-authored-by: Johannes Gäßler <johannesg@5d6.de>	2024-10-05 15:23:51 +03:00
Alberto Cabrera Pérez	ea642144d2	Initial cmake support of SYCL for AMD GPUs (llama/9658) sycl: initial cmake support of SYCL for AMD GPUs	2024-10-05 15:23:51 +03:00
Radoslav Gerganov	282a8654c4	vulkan : do not use tensor->extra (llama/9407) * vulkan : do not use tensor->extra This patch allows using the Vulkan backend with the RPC backend as tensor->extra is no longer used. Ref: #8536 * Adapt GGML_VULKAN_CHECK_RESULTS to extra removal (llama/2) --------- Co-authored-by: 0cc4m <picard12@live.de>	2024-10-05 15:23:51 +03:00
Johannes Gäßler	936cf3beb7	ggml/ex: calculate accuracy in graph, adapt MNIST (ggml/980)	2024-10-05 15:23:51 +03:00
Johannes Gäßler	bc92c2f8f0	ggml: refactor cross entropy loss CPU impl. (ggml/976)	2024-10-05 15:23:51 +03:00
Georgi Gerganov	f7d55e0614	scripts : sync ggml-backend.cpp	2024-10-05 15:23:51 +03:00
Georgi Gerganov	f62a546e03	whisper : fix excessive memory usage (#2443) * whisper : fix KV cache allocation * whisper : reduce memory overhead from unused input tensors	2024-10-05 12:36:40 +03:00
Rahul Vadhyar	2944cb72d9	examples : update dr_wav.h to newer version (#2449)	2024-10-04 11:04:51 +03:00
Georgi Gerganov	ccc2547210	talk-llama : sync llama.cpp	2024-10-03 12:22:17 +03:00
Georgi Gerganov	162a455402	metal : reduce command encoding overhead (llama/9698)	2024-10-03 12:22:17 +03:00
Georgi Gerganov	ff2cb0811f	sync : ggml	2024-10-03 12:22:17 +03:00
Johannes Gäßler	5e9d6baa48	test: fix OPT_STEP_ADAMW for test-backend-ops (ggml/974)	2024-10-03 12:22:17 +03:00
Salvatore Mesoraca	845f8d663e	vulkan : mul_mat: fix UB with small warps (ggml/952) When the device's warp size is less than 16, it is possible for loadstride_a (mul_mm.comp:114) and loadstride_b (mul_mm.comp:115) to be set to 0. Because they are calculated as: the workgroup size, multiplied by LOAD_VEC_* (which can be 1) and divided by 16. And the workgroup size is set to be the same as the warp/subgroup size. The loadstride_* variables are used as increments in the loops that populate the buffers used for the multiplication. When they are 0 they cause an infinite loop. But infinite loops without side-effects are UB and the values of loadstride_* are known at compile time. So, the compiler quietly optimizes all the loops away. As a consequence, the buffers are not populated and the multiplication result is just a matrix with all elements set to 0. We prevent the UB by making sure that the workgroup size will never be less than 16, even if our device has a smaller warp size (e.g. 8). Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>	2024-10-03 12:22:17 +03:00
Borislav Stanimirov	31fdf05fda	ggml : fix ggml_cast (ggml/973)	2024-10-03 12:22:17 +03:00
Johannes Gäßler	0ac6666cd2	ggml: fix gradient allocation logic (ggml/966) * ggml: fix gradient allocation logic * gradient allocation in ggml_build_backward_expand * fixup * fix test-backend-ops grad * suggestions by slaren * fix test1.c * fix legacy opt API * fix test-grad0 * remove keep arg	2024-10-03 12:22:17 +03:00
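
Note on commit 845f8d663e (vulkan : mul_mat: fix UB with small warps): the message describes the load strides as the workgroup size multiplied by LOAD_VEC_* and divided by 16, which is 0 whenever the workgroup size is below 16 and LOAD_VEC_* is 1. The arithmetic can be sketched as follows. This is a minimal model in Python rather than the shader source; the function and constant names are hypothetical and do not come from mul_mm.comp.

```python
# Hypothetical model of the stride arithmetic described in commit 845f8d663e.
# Names are illustrative only; they are not taken from mul_mm.comp.

MIN_WORKGROUP_SIZE = 16  # lower bound the commit message says the fix enforces


def load_stride(workgroup_size: int, load_vec: int) -> int:
    """Workgroup size times LOAD_VEC_*, divided by 16 (integer division)."""
    return workgroup_size * load_vec // 16


def clamped_workgroup_size(subgroup_size: int) -> int:
    """Keep the workgroup size at 16 or more, even on small-warp devices."""
    return max(subgroup_size, MIN_WORKGROUP_SIZE)


if __name__ == "__main__":
    for subgroup_size in (8, 16, 32):
        before = load_stride(subgroup_size, load_vec=1)
        after = load_stride(clamped_workgroup_size(subgroup_size), load_vec=1)
        print(f"subgroup={subgroup_size}: stride before={before}, after={after}")
```

With a subgroup size of 8 and a vector width of 1, the unclamped stride is 0, which is the loop increment the commit message identifies as the cause of the infinite loop. Clamping the workgroup size to 16 yields a stride of at least 1.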

1975 Commits