whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2024-12-19 12:47:52 +00:00

Author	SHA1	Message	Date
Georgi Gerganov	794b162a46	whisper : add integer quantization support (#540 ) * whisper : add integer quantization support * examples : add common-ggml + prepare to add "quantize" tool * whisper : quantization tool ready * whisper : fix F32 support * whisper : try to fix shared lib linkage * wasm : update quantized models to Q5 * bench.wasm : remove "medium" button * bench.wasm : fix custom model button * ggml : add Q5_0 and Q5_1 WASM SIMD * wasm : add quantized models to all WASM examples * wasm : bump DB version number to 2 * talk-llama : update example to latest llama.cpp * node : increase test timeout to 10s * readme : add information for model quantization * wasm : add links to other examples	2023-04-30 18:51:57 +03:00
Georgi Gerganov	8de452c18b	Improve decoding (#291 ) * whisper : prepare infra for new decoding strategies * whisper : apply logit filters and compute logprobs * whisper : add whisper_get_logits() * whisper : separate self and cross attention memory Initial step needed for supporting parallel decoders * whisper : move probs_id buffer to whisper_context * whisper : refactor kv cache into separate struct * whisper : move self-attention kv cache to whisper_decoder * whisper : wip decoding parameters + strategies * whisper : wip decoding parameters + strategies (part 2) * whisper : wip decoding parameters + strategies (part 3) * whisper : wip decoding parameters + strategies (part 4) * whisper : fix prompt_past update to not include prompt_init * whisper : temperature + best_of support * whisper : support for compression_ration_threshold We actually use entropy, but it is similar * command : fix example to use logits instead of obsolete probs * whisper : handle empty sequence ranking * whisper : add WHISPER_DEBUG + diagnostic prints + new main args * whisper : minor fixes * whisper : add beam-search support * whisper : bug fix when there no previous context * whisper : add comments * stream : disable temperature fallback For real-time processing, we always want a single decoder running at T=0 * whisper.swiftui : update example - fix paths + add empty folders	2023-01-15 11:29:57 +02:00
Syahmi Azhar	1512545149	whisper : add loader class to allow loading from buffer and others (#353 ) * whisper : add loader to allow loading from other than file * whisper : rename whisper_init to whisper_init_from_file * whisper : add whisper_init_from_buffer * android : Delete local.properties * android : load models directly from assets * whisper : adding <stddef.h> needed for size_t + code style Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-01-08 13:03:33 +02:00
Andy Maloney	84c6b42e65	cmake : update to 3.19 (#351 ) - update from 3.0 (from 2014) to 3.19 (from 2020) - move some global setting onto the targets (through a cmake include)	2023-01-05 21:22:48 +02:00
Georgi Gerganov	68ecadbbc9	command.wasm : add voice assistant example for the Web (#171 ) Same as the command-line tool "command", but runs in the browser Also, added helper script "extra/deploy-wasm.sh" and fixed some timing constants for the WASM examples.	2022-11-26 11:40:06 +02:00
Georgi Gerganov	cb70b07db5	livestream.sh : simple tool to transcribe audio livestreams (#185 )	2022-11-26 10:05:37 +02:00
Georgi Gerganov	3c390ffe38	stream.wasm : add web-based real-time transcription (#112 )	2022-11-25 23:57:46 +02:00

7 Commits