whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-06-06 01:01:32 +00:00

Author	SHA1	Message	Date
Georgi Gerganov	551529290d	talk-llama : sync llama.cpp	2024-02-12 10:39:58 +02:00
dscripka	a6fb6ab597	examples : added audio_ctx argument to main and server (#1857 ) * added audio_ctx argument to main and server examples * Better default value Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * better default value (again) Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-02-12 09:19:07 +02:00
Georgi Gerganov	f273e66dc6	examples : initialize context params properly (#1852 )	2024-02-11 16:39:12 +02:00
Georgi Gerganov	02b4c52c12	talk-llama : sync llama.cpp	2024-02-10 10:10:59 +02:00
Valentin Gosu	80e8a2ea39	server : allow CORS request with authorization headers (#1850 ) Whisper plugin in Obsidian requires an API key which is then sent as an authorization header. However, the presence of an authorization header requires a CORS Preflight, so both the OPTIONS method and the Access-Control-Allow-Headers: authorization must be handled.	2024-02-09 17:42:41 +02:00
Neuman Vong	19f8048139	whisper.android : how to build with CLBlast (#1809 ) * FetchContent * OpenCL * Documentation and make optional * Specify GGML build options in build.gradle * Use gradle properties * @ggerganov Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * @gpokat --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-02-09 17:39:05 +02:00
Georgi Gerganov	434b8f3b96	talk-llama : stream response (#1121 )	2024-02-06 19:56:12 +02:00
Georgi Gerganov	7a74e929c8	sync : ggml (#0 )	2024-01-30 21:30:26 +02:00
JacobLinCool	ae5c4f7340	common : fix wav buffer detection (#1819 )	2024-01-30 19:35:08 +02:00
JacobLinCool	baa30bacdb	server : add fields to `verbose_json` response (#1802 ) * server: include additional fields in the verbose_json response as OpenAI does * server: show request examples on home page * server: todo note for compression_ratio and no_speech_prob * server: add simple demo form to the homepage	2024-01-30 14:15:55 +02:00
Georgi Gerganov	e72e4158de	talk-llama : sync llama.cpp	2024-01-28 19:44:10 +02:00
Georgi Gerganov	52cce82493	common : fix input buffer check (#1812 )	2024-01-27 17:33:09 +02:00
Georgi Gerganov	ef3c9ed9eb	talk-llama : sync llama.cpp	2024-01-27 17:24:53 +02:00
Michael Rienstra	4bbb60efce	docs : make model options / model install methods clearer (#1806 ) * Make models more "discoverable" * Clean up code block language identifiers * make 3 options clearer * undo Prettier formatter change * docs: `$` shell prompt, consistently * docs: minor changes	2024-01-26 17:39:54 +02:00
Neuman Vong	d6b9be21d7	whisper.android : return output from benchmarks (#1785 ) Benchmarks are failing because JNI expects a jstring and the benchmarks are missing a return statement (i.e., returning null). The functions actually build a jstring but don't return it, so this seems to have been an oversight. This patch returns the jstring and now the benchmarks run successfully. Fixes #1783.	2024-01-19 16:17:38 +02:00
Ryan Hitchman	c0329acde8	server : implement "verbose_json" format with token details (#1781 ) * examples/server: implement "verbose_json" format with token details. This is intended to mirror the format of openai's Python whisper.transcribe() return values. * server: don't write WAV to a temporary file if not converting * server: use std::lock_guard instead of manual lock/unlock	2024-01-18 22:58:42 +02:00
Georgi Gerganov	1f50a7d29f	sync : llama.cpp	2024-01-17 21:23:33 +02:00
Benjamin Heiniger	f6614155e4	talk-llama : optional wake-up command and audio confirmation (#1765 ) * talk-llama: add optional wake-word detection from command * talk-llama: add optional audio confirmation before generating answer * talk-llama: fix small formatting issue in output * talk-llama.cpp: fix Windows build	2024-01-16 15:52:01 +02:00
Przemysław Pawełczyk	f5f159c320	server : fix building and simplify lib deps on Windows (#1772 ) * make : fix server example building on MSYS2 environments (Windows) It was not working since commit eff3570f78742dfd56024328ed93d4f442434280 when server was introduced. * cmake : simplify server example lib deps on Windows server uses httplib::Server, not httplib::SSLServer, so there is no need to mention cryptographic libraries in target_link_libraries. Winsock (ws2_32) suffices here. Also use plain library names like we use in other places.	2024-01-15 15:48:13 +02:00
Georgi Gerganov	6ebba525f1	talk-llama : sync llama.cpp	2024-01-14 18:08:20 +02:00
Georgi Gerganov	2a5874441d	talk-llama : llama.cpp	2024-01-14 11:06:28 +02:00
Georgi Gerganov	d08445c9ad	sync : ggml	2024-01-14 10:55:18 +02:00
Georgi Gerganov	f001a3b7b6	talk-llama : sync llama.cpp	2024-01-14 00:13:17 +02:00
RhinoDevel	db078a9ba8	talk-llama : add optional CLI arg to set the bot name (#1764 )	2024-01-13 20:51:35 +02:00
james wolf	a13a7da5ad	examples : add python example for transcription (#1744 ) * rebase and add simple python interface * moved python files to examples/python	2024-01-13 19:37:18 +02:00
Georgi Gerganov	40ae0962f4	talk-llama : sync llama.cpp	2024-01-12 22:04:51 +02:00
George Hindle	fbcb52d3cd	server : add more parameters to server api (#1754 ) * feat(server): add more parameters to server api * fix(server): reset params to original parsed values for each request	2024-01-12 13:42:52 +02:00
George Hindle	f7908f9bb8	params : don't compute timestamps when not printing them (#1755 )	2024-01-12 13:24:38 +02:00
Georgi Gerganov	00b7a4be02	talk-llama : sync llama.cpp	2024-01-11 22:10:10 +02:00
Georgi Gerganov	32e71a1861	sync : ggml	2024-01-11 21:54:17 +02:00
Georgi Gerganov	9c857cf280	sync : llama.cpp	2024-01-11 21:50:01 +02:00
RhinoDevel	bcc1658cd0	talk-llama : add optional Piper TTS support (#1749 ) Add commented-out command to optionally use Piper (https://github.com/rhasspy/piper) as text-to-speech solution for the talk-llama example. Piper voices sound almost like real people which is a big improvement (e.g.) from something like espeak.	2024-01-10 16:15:28 +02:00
Emmanuel Schmidbauer	c46886f599	server : add request path option(#1741 )	2024-01-08 22:39:51 +00:00
Georgi Gerganov	29f78392c1	main : add cli option to disable system prints (#1740 )	2024-01-08 16:41:28 +02:00
Georgi Gerganov	022756a872	server : fix server temperature + add temperature_inc (#1729 ) * server : fix server temperature + add temperature_inc * server : change dashes to underscores in parameter names	2024-01-07 13:35:14 +02:00
Georgi Gerganov	3b8c2dff57	talk-llama : sync latest llama.cpp	2024-01-06 17:22:57 +02:00
Georgi Gerganov	ab0a8593c5	whisper.swiftui : add .gitignore	2024-01-04 15:00:27 +02:00
Tamotsu Takahashi	d87de61ae6	ci : build with CLBlast + ggml-opencl use GGML_API (#1576 ) * Build with CLBlast * Declare GGML_API After rebasing, examples/talk-llama failed: "D:\a\whisper.cpp\whisper.cpp\build\ALL_BUILD.vcxproj" (build target) (1) -> "D:\a\whisper.cpp\whisper.cpp\build\examples\talk-llama\talk-llama.vcxproj" (default target) (14) -> (Link target) -> llama.obj : error LNK2019: unresolved external symbol ggml_cl_free_data referenced in function "public: __cdecl llama_model::~llama_model(void)" (??1llama_model@@QEAA@XZ) [D:\a\whisper.cpp\whisper.cpp\build\examples\talk-llama\talk-llama.vcxproj] llama.obj : error LNK2019: unresolved external symbol ggml_cl_transform_tensor referenced in function "public: void __cdecl llama_model_loader::load_all_data(struct ggml_context ,void (__cdecl)(float,void ),void ,struct llama_mlock *)" (?load_all_data@llama_model_loader@@QEAAXPEAUggml_context@@P6AXMPEAX@Z1PEAUllama_mlock@@@Z) [D:\a\whisper.cpp\whisper.cpp\build\examples\talk-llama\talk-llama.vcxproj] D:\a\whisper.cpp\whisper.cpp\build\bin\Release\talk-llama.exe : fatal error LNK1120: 2 unresolved externals [D:\a\whisper.cpp\whisper.cpp\build\examples\talk-llama\talk-llama.vcxproj]	2023-12-29 12:23:27 +02:00
Georgi Gerganov	3a5302108d	sync : ggml (ggml_scale, ggml_row_size, etc.) (#1677 ) * sync : ggml * sync : llama.cpp * talk-llama : fix obsolete param * ggml-alloc : fix ggml_tallocr_is_own * talk.wasm : update to new ggml * ggml : fix type punning in ggml_scale * ggml : cuda jetson + arm quants warnings	2023-12-22 17:53:39 +02:00
bobqianic	d2419030b0	examples : Revert CMakeLists.txt for talk-llama (#1669 )	2023-12-21 22:48:52 +02:00
Georgi Gerganov	940de9dbe9	wchess : update README.md	2023-12-14 22:00:47 +02:00
Georgi Gerganov	375585c07c	wchess : update readme	2023-12-14 17:51:14 +02:00
fraxy-v	fd99ece8e3	wchess : whisper assisted chess (#1595 ) * wchess: whisper assisted chess * wchess: fix allowed moves in check * wchess: touchstart, touchend events * wchess: css, disabled button * wchess : html touches * wchess : minor fixes and code style * wchess : bump encoder context to 1280 * wchess : index.html * wchess : fix CI warnings * wchess : add array header * wchess : build static library * wchess : display grammar * wchess : update UX * wchess : add comment * wchess : add README --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-12-14 15:58:26 +02:00
Kreijstal	ec03661b20	cmake : target windows 8 or above for prefetchVirtualMemory in llama-talk (#1617 ) Since we use prefetchVirtualMemory we specify we target win 8 or above, otherwise other compilers will refuse to use the prefetchVirtualMemory api, (I understand you are loading it dynamically but the header definition has this limitation)	2023-12-12 11:35:00 +00:00
Kreijstal	6335933a5b	cmake : Fix bug in httplib.h for mingw (#1615 ) Fix bug in httlib.h for mingw, please see https://github.com/yhirose/cpp-httplib/issues/1669	2023-12-10 17:47:52 +00:00
Georgi Gerganov	9521ba6801	whisper.objc : disable timestamps for real-time transcription	2023-12-08 13:43:37 +02:00
Oleg Sidorov	3163090d89	server : pass max-len argument to the server (#1574 ) This commit fixes the missing parameter binding for max-len between the input arguments and wparams.	2023-12-05 23:01:45 +02:00
Aleksander Andrzejewski	a0ec3fac54	Server : Add support for .vtt format to Whisper server (#1578 ) - The code comes from examples/main - The output mimetype is set to text/vtt Example usage: ```shell curl 127.0.0.1:8080/inference \ -H "Content-Type: multipart/form-data" \ -F file="@samples/jfk.wav" \ -F temperature="0.2" \ -F response-format="vtt" ```	2023-11-30 23:44:26 +00:00
Oleg Sidorov	6559b538e5	server : backport .srt output format (#1565 ) This commit adds a support of .srt format to Whisper server. The code is effectively backported from examples/main. The output mimetype is set to application/x-subrip as per https://en.wikipedia.org/wiki/SubRip. Example usage: curl 127.0.0.1:8080/inference \ -H "Content-Type: multipart/form-data" \ -F file="@<file-path>" \ -F temperature="0.2" \ -F response-format="srt"	2023-11-28 15:42:58 +02:00
Kasumi	6b094b6dfe	server : set default CORS headers to allow all (#1567 )	2023-11-28 11:55:20 +02:00

1 2 3 4 5 ...

322 Commits