whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-06-11 19:51:33 +00:00

Author	SHA1	Message	Date
st-gr	eb23f4ef16	openvino : fix convert-whisper-to-openvino.py (#1890) Fix issue: Conversion from Whisper to OpenVino failed #1870 convert-whisper-to-openvino.py stopped working with OpenVINO version 2023.0.0-10926-b4452d56304-releases/2023/0 . Error was: TypeError: load(): incompatible function arguments. The following argument types are supported: 1. (self: openvino._pyopenvino.FrontEnd, path: object) -> ov::frontend::InputModel Tested successfully with a large-v3 conversion. Co-authored-by: Stefan Grundmann <grundmanns@sandiego.gov>	2024-02-22 15:11:35 +02:00
Georgi Gerganov	3d42463845	models : add update py requirements	2024-02-13 11:51:32 +02:00
Michael Rienstra	4bbb60efce	docs : make model options / model install methods clearer (#1806) * Make models more "discoverable" * Clean up code block language identifiers * make 3 options clearer * undo Prettier formatter change * docs: `$` shell prompt, consistently * docs: minor changes	2024-01-26 17:39:54 +02:00
Sơn Phan Trung	d05b7ee90e	models : make all scripts to be POSIX Compliant (#1725) * download-coreml-model: make it POSIX-compliant * download-ggml-model: posix compliant (2nd) * minor edit * forgot to add newline * generate-coreml-interface: far more straightforward * generate-coreml-model: done with the posix thingy * typo * Update download-ggml-model.sh * fix * fix typo * another fix * Update download-coreml-model.sh * Update download-ggml-model.sh * Update download-coreml-model.sh	2024-01-12 14:11:04 +02:00
Yajing Tang	ba5bcde874	coreml : fix ANE optimized encoder (#1716)	2024-01-04 16:28:30 +02:00
Dimo	a5cc3dc8a2	download : fix large q5 model name (#1695) fixed typo in large-v3-q5-0 model name to match HF link	2023-12-29 11:14:32 +02:00
Chaoqun	d2ee117a0a	docker : Dockerize whisper.cpp (#1674) * build: add dockerfile for ci * ci: add action to build/push docker image * fix: lowercase repository to fix ci * ci: update cuBLAS flag * build: install curl and ffmped in image * docs: add docker section * fix: improve args check when download model	2023-12-22 11:16:02 +00:00
Georgi Gerganov	c7606b47df	models : add info about distilled models	2023-11-15 21:10:13 +02:00
Georgi Gerganov	bfbaa4dce5	whisper : make large version explicit + fix data size units (#1493)	2023-11-15 19:42:25 +02:00
bobqianic	953419c69a	openvino : update convert-whisper-to-openvino.py to support v3 (#1459)	2023-11-09 12:42:39 +02:00
Xiao-Yong Jin	0de8582f65	coreml : use the correct `n_mel` value (#1458)	2023-11-08 20:01:41 +00:00
Georgi Gerganov	2cdfc4e025	whisper : add support for large v3 (#1444) * whisper : add support for large v3 * bench : fix build + fix go bindings * bench : fix n_mels * models : update readme	2023-11-07 15:30:18 +02:00
bobqianic	8a2bee6717	models : use absolute paths for the converted model (#1356)	2023-11-03 10:44:27 +02:00
WhiteOlivierus	45c87b5481	models : Faster download for models on windows using BitTransfer (#1404)	2023-10-30 19:18:12 +00:00
Xiang (Kevin) Li	91c0b23384	models : add conversion scripts from HuggingFace models to CoreML (#1304)	2023-10-04 12:00:25 +03:00
Neil Chudleigh	aed5d40607	models : add quantum models to download-ggml-model.sh (#1235) * Add quantized models to download-ggml-model.sh * Update names in download-ggml-model script to normalized	2023-09-07 12:16:58 +03:00
Ryan Metcalfe	62b81276e0	whisper : add OpenVINO support (#1037) * openvino: use OpenVINO encoder inference * openvino: add python script for OpenVINO model generation * whisper: Fix 'unused' warnings when OpenVINO isn't enabled in build * Apply suggestions from code review Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * whisper: Fix compilation error * whisper: revert whisper_get_openvino_path_encoder & whisper_get_openvino_path_cache to non-const func signatures * cmake: Add openvino-encoder as separate object target * whisper : minor style fixes * minor : indentation fixes --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-07-04 15:56:11 +03:00
Akash Mahajan	c8d0f5fe98	whisper : support speaker segmentation (local diarization) of mono audio via tinydiarize (#1058) * add HuggingFace mirror to download ggml model * support tdrz via simple hack overriding solm tokens * fix incorrect translate/transcribe token_ids that are not static const * add apollo 13 sample for tdrz demo * render [SPEAKER TURN] consistently in all terminal output using vocab.id_to_token * extend whisper_segment with speaker_turn_next field and save in json output * fix failing go build * slipped in some python syntax whoops * whisper : finalize tinydiarize support (add flag + fixes) * whisper : tdrz support for word-level timestamps (respect max_len) * java : try to fix tests after adding tdrz_enable flag * main : remove TODO leftover * java : fix params order list after adding "tdrz_enable" * whisper : fix solm and add nosp token * main : print tinydiarize help --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-07-04 09:45:00 +03:00
Simon Moisselin	6c68218e3c	models : add ggml_to_pt script (#1042) * adding ggml_to_pt * typo sys too many args * fixing swap errors dimensions --------- Co-authored-by: simonMoisselin <simon.moisselin@gmail.com>	2023-06-25 15:29:54 +03:00
Roddur Dasgupta	f11f33f1c0	models : cd statements are quoted to allow spaces in path (#1041)	2023-06-25 15:27:28 +03:00
Georgi Gerganov	8ac23c9f77	models : handle paths with spaces in download script (close #1038 )	2023-06-25 15:23:23 +03:00
Akash Mahajan	3ec7bfffe0	py : make convert-pt-to-ggml.py backwards compatible with older vocab.json tokenizer files (#1001) * patch checkpoint convert script to keep compatibility with older hf_transformers whisper tokenizer * typo fix	2023-06-25 13:50:14 +03:00
genevera (she/her)	9b926844e3	models : fix README.md (#964) Fixes typo on line 76 of models/README.md	2023-05-27 10:40:28 +03:00
Ahmad Bilal	95b02d76b0	coreml : add support of large-v1 model (#926)	2023-05-15 18:36:06 +03:00
Clifford Heath	9931d66400	readme : add instructions on converting to GGML + "--no-config" to wget (#874)	2023-05-08 20:58:36 +03:00
AsukaMinato	94aa56f19e	minor : improve C++ and Python style (#768) * use some STL functions * use self.field than setattr, use pathlib.Path * recover some format * const some iter * Keep the original * 2 space	2023-04-29 10:06:25 +03:00
Georgi Gerganov	5e47e223bd	whisper : add Core ML support (#566) * coreml : use Core ML encoder inference * coreml : simlpify whisper_encode + log messages * whisper : resolve rebase conflicts * coreml : add scripts for CoreML model generation * bench-all : recognize COREML flag	2023-04-15 13:21:27 +03:00
Ivan Gorin	62b51c3070	models : change convert-pt-to-ggml to use .tiktoken tokenizer files (#725)	2023-04-14 19:50:39 +03:00
be-next	18e6fb0287	models : handle spaces and special characters in shell script paths (#677) This commit modifies the `get_script_path` function to correctly handle spaces and special characters in directory paths. The fix involves adding double quotes around variables and commands where needed to ensure proper parsing of paths with spaces and special characters.	2023-03-29 23:38:33 +03:00
Kamilake	992aa2cd1b	models : change default encoding to utf8 (#605)	2023-03-22 21:17:24 +02:00
Georgi Gerganov	1beff6f66d	models : change HF hosting from dataset to model	2023-03-22 20:44:56 +02:00
Georgi Gerganov	d629c034a4	models : fix HF model URL (close #356 )	2023-01-02 09:54:43 +02:00
Ikko Ashimine	3467230a77	models : fix typo in convert-h5-to-ggml.py signficant -> significant	2022-12-31 09:49:01 +02:00
Georgi Gerganov	77226aa89d	models : fix support for spaces in path (close #315 )	2022-12-23 11:11:38 +02:00
Georgi Gerganov	a613f16aec	talk : improve prompting	2022-12-12 23:44:36 +02:00
Kartik Saranathan	d91c001120	Fix paths echoed after the download Was using models path instead of root path	2022-12-08 09:23:52 +02:00
Georgi Gerganov	9fe7306f4b	models : add the new "large" model release by OpenAI The old "large" model is now renamed "large-v1". If you have been using it, make sure to rename it and download the new "large" model for best results.	2022-12-06 18:48:57 +02:00
Georgi Gerganov	abce28ea99	talk.wasm : move to https://whisper.ggerganov.com/talk This way, we can share the same models across different WASM examples and not have to download them for each page	2022-11-24 18:24:06 +02:00
Georgi Gerganov	a2ecd54455	models : add instructions for using HF fine-tuned models	2022-11-24 17:54:41 +02:00
Georgi Gerganov	00f46dbc1d	models : add usage comments to the HF convert script (#157)	2022-11-23 23:22:40 +02:00
Georgi Gerganov	5698bddbc9	models : fix HF fine-tuned model conversion script (#157) It works now	2022-11-23 23:14:11 +02:00
Georgi Gerganov	d64d6ca3fd	models : minor changes to the HF convert script (#157)	2022-11-23 22:07:20 +02:00
Georgi Gerganov	93482d0373	models : add "convert-h5-to-ggml.py" script (#157) Converts transformers models to ggml. Although the conversion is successful, it does not work for some reason. Not sure why	2022-11-23 17:19:22 +02:00
Georgi Gerganov	e70e5c8b53	models : simplify the conversion script "transformers" dependency is not actually needed	2022-11-16 19:22:32 +02:00
Dody Suria Wijaya	55a0e1a64e	Update download-ggml-model.sh follow curl redirect to new hosting site	2022-11-16 18:59:44 +02:00
Georgi Gerganov	864a78a8d0	models : change default hosting to Hugging Face My Linode is running out of monthly bandwidth due to the big interest in the project	2022-11-15 19:47:06 +02:00
Georgi Gerganov	46a68fb9b5	minor : remove one more redundant line	2022-11-11 18:02:58 +02:00
Georgi Gerganov	ccd56a9c5b	minor : fix double float32 conversion in python script	2022-11-11 17:58:51 +02:00
Georgi Gerganov	b5dde365e9	extra : compute SHA of all models files	2022-11-02 18:31:55 +02:00
Mikhail Grigorev	b26345cc7b	Added for Windows implemenated script download-ggml-model.cmd	2022-10-31 19:38:20 +02:00

58 Commits