whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2024-12-20 13:13:07 +00:00

Author	SHA1	Message	Date
Georgi Gerganov	8fb0a1cd1c	bench : fix build + fix go bindings	2023-11-07 13:20:02 +02:00
Georgi Gerganov	185d3fd6d9	whisper : add support for large v3	2023-11-07 11:58:39 +02:00
Georgi Gerganov	b629d2d4fe	cmake : fix talk-llama build	2023-11-07 11:03:21 +02:00
Georgi Gerganov	3bd7d48f51	metal : fix asserts for setThreadgroupMemoryLength (close #1435 )	2023-11-07 11:02:16 +02:00
iamthad	435a6b74e3	ci : fix variable names in GitHub actions config (#1440 ) * Remove _SUPPORT from variables * Change blasdir to OPENBLAS_PATH * Update OpenBLAS URLs	2023-11-07 10:53:24 +02:00
Jhen-Jie Hong	75dc800d21	talk-llama : fix n_gpu_layers usage again (#1442 )	2023-11-07 10:51:27 +02:00
Georgi Gerganov	0c91aef2d8	whisper : add missing about callback initializers	2023-11-07 10:49:51 +02:00
Jhen-Jie Hong	3989b29a9b	examples : fix n_gpu_layers usage in talk-llama (#1441 )	2023-11-07 01:36:23 +00:00
Jhen-Jie Hong	0463028bc2	whisper : add context param to disable gpu (#1293 ) * whisper : check state->ctx_metal not null * whisper : add whisper_context_params { use_gpu } * whisper : new API with params & deprecate old API * examples : use no-gpu param && whisper_init_from_file_with_params * whisper.objc : enable metal & disable on simulator * whisper.swiftui, metal : enable metal & support load default.metallib * whisper.android : use new API * bindings : use new API * addon.node : fix build & test * bindings : updata java binding * bindings : add missing whisper_context_default_params_by_ref WHISPER_API for java * metal : use SWIFTPM_MODULE_BUNDLE for GGML_SWIFT and reuse library load * metal : move bundle var into block * metal : use SWIFT_PACKAGE instead of GGML_SWIFT * style : minor updates --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-11-06 11:04:24 +02:00
Georgi Gerganov	39cfad0dee	whisper : add support for new distilled Whisper models (#1424 ) * whisper : add support for new distilled Whisper models * whisper : print log when using distilled models	2023-11-05 19:43:45 +02:00
Georgi Gerganov	6d4d0b5b4b	cuda : fix HIPBLAS build	2023-11-05 19:41:15 +02:00
Georgi Gerganov	f96e1c5b78	sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) (#1422 ) * sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) * metal : allow env metal variable to override resource path (#1415) * Allow env variable to override resource path * Update ggml-metal.m --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * sync : restore common / main from `master` * sync : restore whisper from `master` * talk-llama : update to latest llama.cpp * ruby : fix build * ggml : fix 32-bit ARM build * ggml : fix MIN / MAX macro collisions + update ios bindings * ggml : fix ifdefs and MIN / MAX again * exampels : fix Obj-C and Swift examples * ggml : fix 32-bit ARM compatibility * ggml : one more attempt to fix 32-bit ARM compat * whisper : fix support for larger graphs --------- Co-authored-by: Chris Raethke <codesoda@users.noreply.github.com>	2023-11-03 21:35:05 +02:00
bobqianic	8a2bee6717	models : use absolute paths for the converted model (#1356 )	2023-11-03 10:44:27 +02:00
Asad Memon	d445098c8f	talk-llama : move up-to-date demo to top (#1417 )	2023-11-02 18:50:13 +02:00
Georgi Gerganov	74de25158e	talk-llama : add an up-to-date demo video	2023-11-02 15:28:48 +02:00
Aarni Koskela	bce49a260e	examples : Implement JSON output for Token-Level data in main (#1358 )	2023-10-31 19:54:52 +00:00
WhiteOlivierus	45c87b5481	models : Faster download for models on windows using BitTransfer (#1404 )	2023-10-30 19:18:12 +00:00
ai-at-home	dfe4bc6e59	README : Update README in stream to clarify where to compile from (Issue #1400 ) * Clarify doc about where to compile from * Update examples/stream/README.md * Update examples/stream/README.md * Update README.md --------- Co-authored-by: AI @ Home <> Co-authored-by: bobqianic <129547291+bobqianic@users.noreply.github.com>	2023-10-29 17:11:13 +00:00
Johan	54c978c3a3	binding : Expose the audio_ctx param through the Go binding (#1368 ) * expose the audio_ctx param through the go binding * expose the audio_ctx param to the go binding context	2023-10-15 13:35:06 +01:00
jorismertz	9a7074d4aa	README : fix typo (#1362 )	2023-10-13 16:53:23 +01:00
joecryptotoo	a0040f5d12	docker : Add dockerfile for cublas (#1286 ) * Create Dockerfile * Rename Dockerfile to cublas.Dockerfile * Rename cublas.Dockerfile to .devops/cublas.Dockerfile --------- Co-authored-by: bobqianic <129547291+bobqianic@users.noreply.github.com>	2023-10-11 11:00:17 +01:00
mkiol	940cdb1396	whisper : abort callback improvements (#1345 ) * whisper : initialize abort_callback to null * whisper : add example how to use abort_callback	2023-10-08 17:22:24 +03:00
Marcin Mielniczuk	1b775cdd68	cmake : Abort the build if a requested feature could not be configured (#1350 )	2023-10-07 20:01:18 +01:00
Marcin Mielniczuk	80bf931668	cmake : Prefer pkg-config while looking for BLAS (#1349 )	2023-10-07 15:02:07 +01:00
Xiang (Kevin) Li	91c0b23384	models : add conversion scripts from HuggingFace models to CoreML (#1304 )	2023-10-04 12:00:25 +03:00
mkiol	2f668c330e	whisper : add abort callback (#1335 )	2023-10-04 11:57:55 +03:00
bobqianic	08fa34882f	examples : move wav_writer from stream.cpp to common.h (#1317 ) * Allocate class on the stack instead of on the heap * Add class wav_writer * fix some minor issues * fix some minor issues * remove potential misleading API	2023-10-03 22:56:11 +03:00
Didzis Gosko	4037705531	whisper : add missing speaker turn API function for whisper_state (#1330 )	2023-10-03 22:55:48 +03:00
brunofaustino	c76c11e59c	examples: Update the README for Talk - fixing the gpt2 URL (#1334 )	2023-10-01 04:21:32 +08:00
Neil Chudleigh	9edbd0a204	extra: Add benchmark script implemented in Python (#1298 ) * Create bench.py * Various benchmark results * Update benchmark script with hardware name, and file checks * Remove old benchmark results * Add git shorthash * Round to 2 digits on calculated floats * Fix the header reference when sorting results * FIx order of models * Parse file name * Simplify filecheck * Improve print run print statement * Use simplified model name * Update benchmark_results.csv * Process single or lists of processors and threads * Ignore benchmark results, dont check in * Move bench.py to extra folder * Readme section on how to use * Move command to correct location * Use separate list for models that exist * Handle subprocess error in git short hash check * Fix filtered models list initialization	2023-09-25 23:45:15 +08:00
litong	707507ff6d	Examples: Add save audio to file option in stream.cpp (#1310 ) * save the recorded audio to a file * Alignment -help * Save the correct audio * chage to a consistent coding style * Correct typo * Update examples/stream/stream.cpp * Update examples/stream/stream.cpp * Correct variable misuse * Update examples/stream/stream.cpp * Update examples/stream/stream.cpp * Update examples/stream/stream.cpp * Update examples/stream/stream.cpp --------- Co-authored-by: bobqianic <129547291+bobqianic@users.noreply.github.com>	2023-09-22 23:43:21 +08:00
JJ	7e1592d2cd	readme: Fix spelling error (#1290 ) Fixed branding error: Javascript to JavaScript	2023-09-21 15:55:33 +08:00
Artyom Mezin	903c9579b8	examples: Update README.md of main.cpp (#1306 )	2023-09-18 22:14:36 +08:00
Jhen-Jie Hong	b440ef8c96	binding : fix ruby build by adding missing ggml-alloc (#1305 )	2023-09-18 21:15:45 +08:00
Evgeny Kuznetsov	700f63a806	bench: fix missing include <cstring> (#1303 )	2023-09-18 15:51:10 +08:00
Georgi Gerganov	951a119926	whisper : increase tokenizer buffer (close #1259 )	2023-09-15 21:11:43 +03:00
Georgi Gerganov	1ca4041b86	talk-llama : update to latest llama.cpp	2023-09-15 20:06:31 +03:00
Georgi Gerganov	80c1512fd5	sync : ggml (const correctness)	2023-09-15 14:49:56 +03:00
Georgi Gerganov	0ac9cefd03	metal : restore matrix x vector f16_f32 kernels for now	2023-09-15 14:40:41 +03:00
Georgi Gerganov	b8432f28f4	metal : add F32 support + update bench output	2023-09-15 13:56:08 +03:00
Georgi Gerganov	93935980f8	whisper : Metal and ggml-alloc support (#1270 ) * metal : init * whisper : factor out graph builds * whisper : allocate encoder and decoder using ggml-alloc * whisper : ggml-alloc is now supported * whisper : CoreML support ggml-alloc * build : fix ggml-alloc * ios : update submodule * extra : update sync-ggml.sh script to also sync ggml-alloc * ci : see if this is causing the crash * whisper : refactor ggml-alloc init * whisper.android : try to fix build * whisper : initial Metal version * ci : try to debug vmem issue * metal : decoder works on GPU! * metal : add multi-decoder support * ggml : fix ggml_nbytes (probably temp solution) * metal : run "cross" step on the GPU * whisper : remove ggml_repeat in the encoder * whisper : offload the Encoder to Metal * ggml : use simpler ggml_bytes() implementation * ggml-alloc : try to make CI happy by reducing vram to 128GB * whisper : add whisper_allocr to wrap ggml_allocr * whisper : factor out alloc init in a function * cmake : update to support Metal build * whisper : add <functional> header * objc : fix build (no Metal yet) * ios : add Metal support * swiftui : fix build * metal : speed-up KQ multiplication * metal : sync latest llama.cpp kernels * readme : add Metal info * ios : update submodule * coreml : add code to toggle Core ML config (CPU, ANE, GPU) * bench : fix timings by running a pre-heat * bench : start benching the decoder * whisper : add ggml_mul_mat_pad * bench : fix uninitialized vars * whisper : add comment for disabling mul-mat padding * whisper : add description of ggml_mul_mat_pad * whisper : clean-up ggml_mul_mat_pad * metal : remove the "concurrent" flag * bench : variable n_past * ios : update SPM package	2023-09-15 12:18:18 +03:00
Georgi Gerganov	3fec2119e6	whisper : fix bench regression + fix performance when using CPU BLAS (#1275 ) * whisper : fix bench regression * ggml : use sched_yield when using BLAS + add comment	2023-09-12 13:54:04 +03:00
bobqianic	9b14418863	whisper : faster beam_search sampling via reduced KV cache copies (#1243 ) * Faster `beam_search` sampling Refine the KV cache update logic for more intelligent and efficient updating. * Faster `whisper_sample_token_topk` * Update whisper.cpp * Update whisper.cpp * Update whisper.cpp * Reduce `memory allocation` * Add `pointer swapping` * Fixed some bugs * Update whisper.cpp * Apply suggestions from code review * Updated the logic for determining `two-copy` * Updated the logic for determining `two-copy` v2 * whisper : add debug logs + coding style --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-09-10 16:04:27 +03:00
Nicholas Albion	6ddc727fac	java : fixed signing of java artifact using gradle (#1267 ) * --stacktrace signMavenJavaPublication * added temporary step "Debug gradle signing" * cd bindings/java * use GPG_PRIVATE_KEY and GPG_PASSPHRASE * use secrets.GPG_PRIVATE_KEY and GPG_PASSPHRASE	2023-09-09 18:55:51 +03:00
Georgi Gerganov	acb5278cc8	ci : try to fix gradle action (#1265 )	2023-09-08 20:50:15 +03:00
Georgi Gerganov	0839209cab	gitignore : update	2023-09-08 19:45:28 +03:00
Georgi Gerganov	b39809668a	sync : ggml (HBM + Metal + style) (#1264 )	2023-09-08 17:58:31 +03:00
Georgi Gerganov	3e9edc6845	ci : upgrade gradle to 2.4.2 (#1263 ) * ci : upgrade gradle to 2.4.2 * cmake : add comment (#1129)	2023-09-08 17:58:14 +03:00
Georgi Gerganov	bfc73f1fa2	sync : ggml (CUDA faster rope)	2023-09-08 15:01:26 +03:00
Georgi Gerganov	f00c9bba33	cmake : normalize case (#1129 )	2023-09-08 14:50:03 +03:00

754 Commits