whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-06-01 06:50:41 +00:00

Author	SHA1	Message	Date
katsu560	800ae5b808	fix AVX,AVX2,FMA,F16C detection on Linux and add flags for OpenBLAS	2022-11-23 22:16:33 +02:00
katsu560	83456076f0	add AVX support	2022-11-23 22:16:33 +02:00
Tamotsu Takahashi	3df6c14fca	Build with OpenBLAS and SDL2 on windows	2022-11-23 22:09:54 +02:00
Georgi Gerganov	d64d6ca3fd	models : minor changes to the HF convert script (#157)	2022-11-23 22:07:20 +02:00
Georgi Gerganov	93482d0373	models : add "convert-h5-to-ggml.py" script (#157). Converts transformers models to ggml. Although the conversion is successful, it does not work for some reason. Not sure why	2022-11-23 17:19:22 +02:00
Georgi Gerganov	49706a658a	minor : updates few prints + fix buttons in whisper.wasm	2022-11-23 17:19:21 +02:00
Georgi Gerganov	363a2dadec	Update README.md	2022-11-23 09:53:55 +02:00
Georgi Gerganov	623a486056	Update README.md	2022-11-23 09:52:36 +02:00
Tamotsu Takahashi	2f596f5b33	Find libopenblas.dll.a on windows. "lib" is needed for windows. With this change, you can build whisper.cpp with OpenBLAS's prebuilt DLL: (1) extract a zip from https://github.com/xianyi/OpenBLAS/releases; (2) copy the headers in (openblas)/include to the root directory of whisper.cpp; (3) invoke cmake with -DCMAKE_LIBRARY_PATH=(openblas)\lib -DWHISPER_SUPPORT_OPENBLAS=ON; (4) copy (openblas)/bin/libopenblas.dll to the same directory of whisper.dll after msbuild. See https://github.com/ggerganov/whisper.cpp/issues/89#issuecomment-1324391258	2022-11-23 08:26:45 +02:00
Georgi Gerganov	e5dcdabbb8	unicode : fix character replacement (thanks to @tamo)	2022-11-23 08:24:29 +02:00
Georgi Gerganov	dad109c3f1	close #109 : add fetching of the model over HTTP (whisper.wasm)	2022-11-22 22:48:56 +02:00
Georgi Gerganov	326573de9a	talk.wasm : final touches	2022-11-22 22:22:17 +02:00
Georgi Gerganov	9aea96f774	talk.wasm : polishing + adding many AI personalities	2022-11-22 20:10:20 +02:00
Georgi Gerganov	385236d1d3	stream : "-kc" now enables context keeping from previous segment (#90). By default, the context keeping is disabled	2022-11-22 18:21:15 +02:00
M. Eren Akbiyik	63ae03b8e0	Prompt previous tokens for streaming (#163) * feat: prompt previous tokens for streaming. I used a vector pointer instead of vector itself because it gave weird errors, and why not * convert vector to use with C api * feat: remove old refs, check for prompt size * feat: use better way of getting the pointer	2022-11-22 18:10:35 +02:00
Georgi Gerganov	78116f8eda	talk.wasm : update README.md	2022-11-21 22:42:29 +02:00
Georgi Gerganov	a4dfbeecf9	talk.wasm : GPT-2 meets Whisper in WebAssembly (#155) * talk : initial real-time transcription in the browser * talk : polishing the UI * talk : ready for beta testing * talk.wasm : rename example	2022-11-21 22:20:42 +02:00
Georgi Gerganov	2e311a2917	Update README.md	2022-11-21 18:52:20 +02:00
Georgi Gerganov	2065572a11	ggml : fix Windows build	2022-11-20 22:47:03 +02:00
Georgi Gerganov	5c2176e314	ci : add Windows build	2022-11-20 22:47:03 +02:00
Georgi Gerganov	f2df9bd768	stream : add "max_tokens" cli arg. Controls the max tokens per segment for the stream example	2022-11-20 21:22:41 +02:00
Georgi Gerganov	fb8d77f760	stream : add "audio_ctx" parameter. Used to overwrite the audio context size of the Encoder. For example, setting "audio_ctx = 512" will make it run about 3 times faster, processing about 10s of audio, instead of 30s. The transcription quality drops, but this can be used for real-time streaming purposes where performance is important.	2022-11-20 21:22:41 +02:00
Georgi Gerganov	62b5ff875c	stream : add "max_tokens" parameter. Used to limit the number of tokens in a segment. Useful to battle with word repetition when using partial encoder context	2022-11-20 21:22:41 +02:00
Georgi Gerganov	d351771a4b	stream : add "single_segment" option. Force the entire audio chunk to be transcribed into a single segment	2022-11-20 21:22:41 +02:00
Georgi Gerganov	c058aaf22e	stream : partial encoder experiments	2022-11-20 21:22:41 +02:00
greeshmay	2ba66360c9	fix: free ggml_context (close #149) (#150) * fix: free ggml_context * ggml : free the model's contexts in whisper_free() Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2022-11-17 22:12:51 +02:00
Georgi Gerganov	e70e5c8b53	models : simplify the conversion script. "transformers" dependency is not actually needed	2022-11-16 19:22:32 +02:00
Dody Suria Wijaya	55a0e1a64e	Update download-ggml-model.sh: follow curl redirect to new hosting site	2022-11-16 18:59:44 +02:00
Georgi Gerganov	864a78a8d0	models : change default hosting to Hugging Face. My Linode is running out of monthly bandwidth due to the big interest in the project	2022-11-15 19:47:06 +02:00
Georgi Gerganov	83c742f1a7	whisper : add option to speed up the audio tempo by x2. Using a Phase Vocoder for speeding up the audio tempo by scaling down the frequencies in the frequency domain. This reduces the computation in the Encoder by a factor of 2. The transcription accuracy is degraded, but for slow to normal speech - it seems to be still very good. I think this can find application for real-time transcription - i.e. the "stream" example.	2022-11-13 16:25:43 +02:00
Georgi Gerganov	41b48ab7f1	make : add libwhisper.so target (#144)	2022-11-13 09:09:48 +02:00
Chidi Williams	a728be9cdb	Add WHISPER_NO_AVX and WHISPER_NO_AVX2 to CMakeLists (#136) * Check for AVX and AVX2 on Darwin * Add AVX options to CMakeLists	2022-11-11 18:10:01 +02:00
Georgi Gerganov	46a68fb9b5	minor : remove one more redundant line	2022-11-11 18:02:58 +02:00
Georgi Gerganov	ccd56a9c5b	minor : fix double float32 conversion in python script	2022-11-11 17:58:51 +02:00
Georgi Gerganov	3500ce8727	ref #40 : start working on the documentation	2022-11-09 21:41:40 +02:00
Alan	7519eabf65	Adds support for stdin wav input	2022-11-09 20:37:23 +02:00
Georgi Gerganov	b21213c23e	js : update whisper.js to latest	2022-11-09 19:33:10 +02:00
Chidi Williams	9e700e1821	Check for AVX and AVX2 on Darwin	2022-11-09 18:49:55 +02:00
boolemancer	0bfe728b84	Fix the Windows pthread_create shim. The current implementation doesn't actually set the out parameter, and it returns 0 on failure instead of on success.	2022-11-08 15:02:32 +02:00
Georgi Gerganov	4e5674a5d5	sync : submodule whisper.spm	2022-11-07 21:48:13 +02:00
Georgi Gerganov	4c66b6a828	cmake : add submodule whisper.spm	2022-11-07 20:50:24 +02:00
Georgi Gerganov	c30bffc8a5	ref #22 : add "duration" option. Can be used to partially process a recording	2022-11-07 20:14:52 +02:00
Georgi Gerganov	8fdfb0ba92	Update README.md	2022-11-06 21:04:21 +02:00
Georgi Gerganov	c71363f14c	examples : add simple script for generating Karaoke video	2022-11-06 09:22:50 +02:00
Georgi Gerganov	a09e9123ca	Update README.md	2022-11-05 08:44:41 +02:00
Georgi Gerganov	d42cf6d0df	Update README.md	2022-11-04 22:26:08 +02:00
Georgi Gerganov	ef47d77492	main : fix generated bash script	2022-11-04 18:30:38 +02:00
Georgi Gerganov	75171c2b79	ggml : multi-thread the ggml_add operator	2022-11-03 20:53:44 +02:00
Georgi Gerganov	a2eeb941f6	cmake : fix passing GGML_PERF compile option	2022-11-03 20:19:06 +02:00
Georgi Gerganov	0e689f83d8	Update README.md	2022-11-02 22:03:27 +02:00

230 Commits