whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2024-12-25 07:01:04 +00:00

Author	SHA1	Message	Date
Georgi Gerganov	0a55a70b9b	make : fix missing -O3 same as https://github.com/ggerganov/llama.cpp/pull/8143	2024-06-26 21:21:12 +03:00
Georgi Gerganov	e30c679928	whisper : reorganize source code + improve CMake (#2256 ) * scripts : update sync [no ci] * files : reorganize [no ci] * sync : llama.cpp * cmake : link math library * cmake : build normal ggml library * files : move headers to include * objc : fix path to ggml-metal.h * ci : fix WHISPER_CUDA -> GGML_CUDA * scripts : sync LICENSE [no ci]	2024-06-26 19:34:09 +03:00
Georgi Gerganov	5181494e9f	build : update make / cmake	2024-06-18 09:39:40 +03:00
Georgi Gerganov	3b1ac03828	ggml : remove OpenCL (#0 )	2024-06-16 18:19:48 +03:00
Georgi Gerganov	6975600b4b	cuda : enable CUDA graphs (#0 )	2024-06-16 18:19:48 +03:00
Georgi Gerganov	4942b1b428	cmake : fix CUDA build (#0 )	2024-06-16 18:19:48 +03:00
Georgi Gerganov	420b6abc54	cuda : fix HIPBLAS build (#2234 )	2024-06-11 19:14:38 +03:00
Borislav Stanimirov	ffef323c4c	whisper : add CUDA-specific computation mel spectrograms (#2206 ) * whisper : use polymorphic class to calculate mel spectrogram * whisper : add cuda-specific mel spectrogram calculation * whisper : conditionally compile cufftGetErrorString to avoid warnings * build : add new files to makefile * ruby : add new files to conf script * build : fix typo in makefile * whisper : suppress cub warning for deprecated C++ std in whisper-mel-cuda	2024-06-04 09:32:23 +03:00
Przemysław Pawełczyk	b6680fab50	build : improve disabling AVX-512 (#2129 ) * cmake : make WHISPER_NO_AVX512=ON disable all subsets of AVX-512 Previously it happened only for MSVC, but it makes sense to have the same behavior for other compilers too. * make : reorder x86 ISA extensions in chronological order And update compiler flags at the end to ease modifying conditions. * make : support WHISPER_NO_AVX512=1 for disabling all AVX-512 subsets. That way you do not have to override each AVX-512 subset setting individually if it has been turned on during autodetection.	2024-05-08 18:32:43 +03:00
Przemysław Pawełczyk	8fac6455ff	make : change GNU make default CXX from g++ to c++ (#2100 )	2024-04-28 22:54:21 +01:00
Didzis Gosko	08d3eef97d	build : fix embedded Metal library generation (#2045 )	2024-04-15 20:23:05 +03:00
Didzis Gosko	c7f95b7ca2	build : detect AVX512 in Makefile, add AVX512 option in CMake (#2043 ) * make : add AVX512 detection to Makefile and CMakeLists.txt * make : autodetect more AVX512 instruction subsets * cmake : do not default to AVX512, must be enabled explicitly * cmake : enable a set of AVX512 subsets, when AVX512 is turned on * make : consolidate AVX512 subsets, add AVX512 VBMI * cmake : revert to NO AVX512 setting, add settings for AVX512 VNNI and VBMI * make : re-introduce AVX512VNNI back * cmake : remove superfluous comment line	2024-04-15 20:02:09 +03:00
Przemysław Pawełczyk	1e8f28c42a	build : use pkg-config for OpenBLAS (#1778 ) * make : use pkg-config for finding CFLAGS & LDFLAGS needed by OpenBLAS That way building on *nix like environments (including MSYS2 on Windows) with WHISPER_OPENBLAS=1 works out of the box. Fix handling of WHISPER_OPENBLAS, so that empty value or 0 won't be misinterpreted by make as enabled. Mind that it's not intended to detect CMake false constants (OFF NO FALSE N). make is not CMake. By default OpenBLAS with 64-bit interface is used, but that can be changed with `WHISPER_OPENBLAS_INTERFACE64=0` if 32-bit one is desired. If OpenBLAS headers and library are respectively in include/ and lib/ subdirectories of given path, then you can specify it, e.g. `OPENBLAS_PATH=/usr/local/openblas`, and this will take precedence over any pkg-config file. If there is no pkg-config file (.pc) for OpenBLAS and OPENBLAS_PATH is empty, then headers are assumed to be in /usr/include/openblas and library is assumed to be called 'openblas64' (or 'openblas' if `WHISPER_OPENBLAS_INTERFACE64=0`). If different headers location should be used, then it can be done, e.g. `WHISPER_BLAS_CFLAGS=-I/usr/local/include/openblas`. If different library should be used, it can be specified, e.g. `WHISPER_BLAS_LIB=openblasp64` (pthreads version as seen on Fedora), or you can provide LDFLAGS needed to link with OpenBLAS directly: `WHISPER_BLAS_LDFLAGS="-L/usr/local/lib/openblas -lopenblas64"`. Current solution is flexible enough to handle most cases out there without needlessly hardcoding possible OpenBLAS installation details. cmake : fix how pkg-config is used for finding include dirs and libraries needed by OpenBLAS That way building on *nix like environments (including MSYS2 on Windows) with -DWHISPER_OPENBLAS=ON should work out of the box as long as you have CMake 3.25 or newer. Make OPENBLAS_PATH environment variable supported not only on Windows.
It sets OpenBLAS include dir to ${OPENBLAS_PATH}/include and library to ${WHISPER_BLAS_LIB} (name without prefixes and suffixes) in ${OPENBLAS_PATH}/lib and avoids further package finding. By default OpenBLAS with 64-bit interface is used (equivalent to setting `-DWHISPER_BLAS_LIB=openblas64`), but that can be changed with `-DWHISPER_OPENBLAS_INTERFACE64=OFF` (equivalent to setting `-DWHISPER_BLAS_LIB=openblas`) if 32-bit one is desired. Turn on BLA_STATIC for FindBLAS only when WHISPER_STATIC is enabled. BLA_STATIC may not work as expected for pkg-config based operation. Get rid of supporting BLAS_HOME environment variable. If OPENBLAS_PATH is insufficient in your case, there is no pkg-config file to rely on, then you can manually specify include dir, e.g. `-DBLAS_INCLUDE_DIRS=/usr/local/include/openblas`, and library, e.g. `-DBLAS_LIBRARIES=/usr/local/lib/libopenblas.so`. make / cmake : use OpenBLAS with 32-bit interface by default. OpenBLAS w/o INTERFACE64=1 vel USE_64BITINT=1 seems to be more common. * cmake : hardcode "lib" prefix for OpenBLAS lib filename (even on Windows) * cmake : hardcode OpenBLAS library name when building in MSVC (Windows) Most *nix like environments (including MSYS2 on Windows) have OpenBLAS packages that allow coexistence of OpenBLAS builds with 32-bit and 64-bit interface (w/o and w/ OPENBLAS_USE64BITINT defined) and they differ by not having or having "64" suffix in their library filenames. That's not the case for OpenBLAS prebuilt libraries for Windows.	2024-03-29 15:53:26 +02:00
Georgi Gerganov	9fb308d90f	make : add grammar parser to common objects	2024-03-28 11:59:48 +02:00
Georgi Gerganov	2948c740a2	sync : ggml (#2001 ) * sync : update scripts * sync : ggml * talk-llama : sync llama.cpp * make : WHISPER_CUBLAS -> WHISPER_CUDA * ci : try to fix sycl build * talk-llama : fix make build	2024-03-27 18:55:10 +02:00
Georgi Gerganov	de4d067f1e	talk-llama : sync llama.cpp	2024-03-15 14:21:59 +02:00
LBlue	276615d708	make : fix CUBLAS link with WSL (#1878 )	2024-02-20 12:05:38 +02:00
Georgi Gerganov	65faae0b6a	build : update CBLAS flags + fix unused var warning (#0 )	2024-02-19 14:44:46 +02:00
Didzis Gosko	163e74b6c3	metal : option to embed MSL source into compiled binary (#1842 ) * ggml : embed Metal library source (ggml-metal.metal) into binary enable by setting WHISPER_EMBED_METAL_LIBRARY * rename the build option * rename the preprocessor directive * generate Metal library embedding assembly on-fly during build process	2024-02-11 16:41:41 +02:00
Didzis Gosko	0f80e5a80a	whisper : expose CUDA device setting in public API (#1840 ) * Makefile : allow to override CUDA_ARCH_FLAG * whisper : allow to select GPU (CUDA) device from public API	2024-02-09 17:27:47 +02:00
Didzis Gosko	b6559333ff	make : add macOS deployment target option (#1839 )	2024-02-09 17:26:29 +02:00
jwijffels	3e6fad07aa	make : update MSYS_NT (#1813 ) I just upgraded the R wrapper at https://github.com/bnosac/audio.whisper to use whisper.cpp 1.5.4 I'm working on Windows and noticed while doing that that it did not pick up the relevant CFLAGS/CXXFLAGS as my system showed ``` I whisper.cpp build info: I UNAME_S: MSYS_NT-10.0-19045 I UNAME_P: unknown I UNAME_M: x86_64 ``` Many thanks for all the tremendous hard work on maintaining whisper.cpp!	2024-01-30 14:13:49 +02:00
Przemysław Pawełczyk	f5f159c320	server : fix building and simplify lib deps on Windows (#1772 ) * make : fix server example building on MSYS2 environments (Windows) It was not working since commit `eff3570f78` when server was introduced. * cmake : simplify server example lib deps on Windows server uses httplib::Server, not httplib::SSLServer, so there is no need to mention cryptographic libraries in target_link_libraries. Winsock (ws2_32) suffices here. Also use plain library names like we use in other places.	2024-01-15 15:48:13 +02:00
Georgi Gerganov	e77b27c331	sync : ggml (VMM, sync-ggml-am, dotprod ARM fixes, CUDA fixes) (#1691 ) * scripts : add sync-ggml-am.sh * sync : ggml (VMM, ARM dot prod fix, etc.) * build : fix CUDA build * ggml : fix some mul mat cases + add tests for src1 F16 `dbd02958fa`	2023-12-29 11:30:47 +02:00
Felix	eff3570f78	server : add a REST Whisper server example with OAI-like API (#1380 ) * Add first draft of server * Added json support and base funcs for server.cpp * Add more user input via api-request also some clean up * Add request params and load post function Also some general clean up * Remove unused function * Add readme * Add exception handlers * Update examples/server/server.cpp * make : add server target * Add magic curl syntax Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-11-20 21:40:24 +02:00
Georgi Gerganov	bfbaa4dce5	whisper : make large version explicit + fix data size units (#1493 )	2023-11-15 19:42:25 +02:00
Evan Jones	3e5c7feeff	whisper : add grammar-based sampling (#1229 ) * whisper : add grammar-based sampling * build : fix after master merge * command : fix exception when recognizing the command * whisper : fine-tuning grammar functionality * command : grammar-related improvements - option to read grammar from file - add sample grammars for colors and chess moves - fine-tune the performance further * grammars : add assistant + update comments * command : enable beam-search, add "no_timestamps", add "context", add p * whisper : remove comment --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-11-13 10:51:34 +02:00
Georgi Gerganov	b0502836b8	whisper : add full CUDA and Metal offloading (#1472 ) * whisper : migrate to ggml-backend * whisper : fix logit reading * whisper : fix tensor allocation during load * whisper : fix beam-search with CUDA * whisper : free backends + fix compile warning * whisper : print when CUDA is enabled * whisper : fix CoreML * make : clean-up * talk : fix compile warning * whisper : support ggml_conv with CUDA and Metal (#1473) * ggml : add CUDA support for ggml_conv * whisper : remove ggml_repeat for conv bias + single backend * cuda : fix im2col kernel * metal : add im2col support + mul mat-vec f16 x f16 * bench-all : add q4 models * whisper : clean-up * quantize-all : fix * ggml : im2col opts * whisper : avoid whisper_model_data wrapper * whisper : add note that ggml_mul_mat_pad does not work with CUDA * whisper : factor out graph compute in common function * whisper : fixes * whisper : fix UB with measure buffers * whisper : try to fix the parallel whisper_state functionality (#1479) * whisper : try to fix the parallel whisper_state functionality * whisper : fix multi-state Metal * whisper : free backend instances in whisper_state	2023-11-12 15:31:08 +02:00
Georgi Gerganov	2cdfc4e025	whisper : add support for large v3 (#1444 ) * whisper : add support for large v3 * bench : fix build + fix go bindings * bench : fix n_mels * models : update readme	2023-11-07 15:30:18 +02:00
Georgi Gerganov	f96e1c5b78	sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) (#1422 ) * sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) * metal : allow env metal variable to override resource path (#1415) * Allow env variable to override resource path * Update ggml-metal.m --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * sync : restore common / main from `master` * sync : restore whisper from `master` * talk-llama : update to latest llama.cpp * ruby : fix build * ggml : fix 32-bit ARM build * ggml : fix MIN / MAX macro collisions + update ios bindings * ggml : fix ifdefs and MIN / MAX again * examples : fix Obj-C and Swift examples * ggml : fix 32-bit ARM compatibility * ggml : one more attempt to fix 32-bit ARM compat * whisper : fix support for larger graphs --------- Co-authored-by: Chris Raethke <codesoda@users.noreply.github.com>	2023-11-03 21:35:05 +02:00
Georgi Gerganov	b8432f28f4	metal : add F32 support + update bench output	2023-09-15 13:56:08 +03:00
Georgi Gerganov	93935980f8	whisper : Metal and ggml-alloc support (#1270 ) * metal : init * whisper : factor out graph builds * whisper : allocate encoder and decoder using ggml-alloc * whisper : ggml-alloc is now supported * whisper : CoreML support ggml-alloc * build : fix ggml-alloc * ios : update submodule * extra : update sync-ggml.sh script to also sync ggml-alloc * ci : see if this is causing the crash * whisper : refactor ggml-alloc init * whisper.android : try to fix build * whisper : initial Metal version * ci : try to debug vmem issue * metal : decoder works on GPU! * metal : add multi-decoder support * ggml : fix ggml_nbytes (probably temp solution) * metal : run "cross" step on the GPU * whisper : remove ggml_repeat in the encoder * whisper : offload the Encoder to Metal * ggml : use simpler ggml_bytes() implementation * ggml-alloc : try to make CI happy by reducing vram to 128GB * whisper : add whisper_allocr to wrap ggml_allocr * whisper : factor out alloc init in a function * cmake : update to support Metal build * whisper : add <functional> header * objc : fix build (no Metal yet) * ios : add Metal support * swiftui : fix build * metal : speed-up KQ multiplication * metal : sync latest llama.cpp kernels * readme : add Metal info * ios : update submodule * coreml : add code to toggle Core ML config (CPU, ANE, GPU) * bench : fix timings by running a pre-heat * bench : start benching the decoder * whisper : add ggml_mul_mat_pad * bench : fix uninitialized vars * whisper : add comment for disabling mul-mat padding * whisper : add description of ggml_mul_mat_pad * whisper : clean-up ggml_mul_mat_pad * metal : remove the "concurrent" flag * bench : variable n_past * ios : update SPM package	2023-09-15 12:18:18 +03:00
Przemysław Pawełczyk	b55b505690	build : do not use _GNU_SOURCE gratuitously (#1129 ) * Do not use _GNU_SOURCE gratuitously. What is needed to build whisper.cpp and examples is availability of stuff defined in The Open Group Base Specifications Issue 6 (https://pubs.opengroup.org/onlinepubs/009695399/) known also as Single Unix Specification v3 (SUSv3) or POSIX.1-2001 + XSI extensions, plus some stuff from BSD that is not specified in POSIX.1. Well, that was true until NUMA support was added recently in ggml, so enable GNU libc extensions for Linux builds to cover that. There is no need to penalize musl libc which simply follows standards. Not having feature test macros in source code gives greater flexibility to those wanting to reuse it in 3rd party app, as they can build it with minimal FTM (_XOPEN_SOURCE=600) or other FTM depending on their needs. It builds without issues in Alpine (musl libc), Ubuntu (glibc), MSYS2. * examples : include SDL headers before other headers Avoid macOS build error when _DARWIN_C_SOURCE is not defined, brought by SDL2 relying on Darwin extension memset_pattern4/8/16 (from string.h). * make : enable BSD extensions for DragonFlyBSD to expose RLIMIT_MEMLOCK * make : use BSD-specific FTMs to enable alloca on BSDs * make : fix OpenBSD build by exposing newer POSIX definitions * cmake : follow recent FTM improvements from Makefile	2023-09-07 12:36:14 +03:00
Didzis Gosko	64cb45fd79	make : fix detection of AVX2 on macOS (#1250 )	2023-09-06 18:22:21 +03:00
Przemysław Pawełczyk	ba3c333611	make : improve cpuinfo handling on x86 hosts (#1238 ) * make : simplify and correct x86 ISA extensions detection on the host It got broken in commit `c5f9acf4b7` for Haiku and Mac OS (Intel), which report CPU features in upper case. Now we're finding the names in case-insensitive manner and as words. SSE3 detection has been corrected for Linux, which uses PNI for that (Prescott New Instructions). * make : use dmesg.boot in FreeBSD/DragonFlyBSD to detect x86 ISA extensions on the host * make : enable x86 ISA extensions on the host both in CFLAGS and CXXFLAGS * make : correct AVX x86 ISA extension detection on macOS (Intel) host It got broken in commit `c5f9acf4b7`. macOS calls it AVX1.0.	2023-09-05 14:58:47 +03:00
Przemysław Pawełczyk	8e46ba80d3	make : use cpuinfo in MSYS2 to enable x86 ISA extensions on the host (#1216 )	2023-08-28 13:28:26 +03:00
Przemysław Pawełczyk	b0d35995c4	make : add support for building on DragonFlyBSD/NetBSD/OpenBSD (#1212 )	2023-08-27 21:38:46 +03:00
Przemysław Pawełczyk	601c2d2181	ggml : detect SSSE3 (#1211 ) * ggml : add ggml_cpu_has_ssse3 * whisper : show SSSE3 in system info * make : detect SSSE3 via cpuinfo	2023-08-27 21:36:41 +03:00
AustinMroz	175ffa64ee	examples : vim plugin and LSP server (#1144 ) * Initial proof of concept Vim plugin At present, this is likely only slightly better than feature parity with the existing whisper.nvim Known issues: Trailing whitespace Up to an existing length(5 seconds) of speech may be processed when listening is enabled CPU cycles are spent processing speech even when not listening. Fixing these issues is likely dependent upon future efforts to create a dedicated library instead of wrapping examples/stream * Support $WHISPER_CPP_HOME environment variable A minor misunderstanding of the whisper.nvim implementation resulted in a plugin that was functional, but not a drop in replacement as it should be now. * Initial progress on LSP implementation Libcall is nonviable because the library is immediately freed after a call is made. Further investigation has shown Language Server Protocol as a promising alternative that both simplifies the required logic on the vimscript side and increases the ease with which plugins for other editors could be made in the future. This is a very large undertaking and my progress has slowed substantially. Work is far from being in a usable state, but I wish to keep track of major refactors for organizational purposes. * Rewrite audio windowing of guided transcription One of the defining goals of this venture is allowing consecutive commands to be rattled off without the existing deadzones of the current implementation. * Add unguided_transcription. Cleanup. The unguided transcription implementation heavily borrows from existing example implementations and the guided_transcription logic. A high level pass was done to check that method arguments are accurate to what inputs are actually required. A first attempt at cancellation support was added for record keeping, but will be deleted in a future commit. * Fix compilation. Resolves a large number of compilation errors. No testing has been done yet for execution errors.
Update Makefile and .gitignore * Functional unguided_transcription * Functional guided_transcription Fix commandset_list being passed by value Properly register the first token of a multitoken command * Minor changes before time fix I've apparently made an awfully major mistake in thinking that unix time was in milliseconds and will be changing all timekeeping code to use standardized methods. In preparation for this is a number of minor bugfixes. Output is manually flushed. An echo method has been added. registerCommandset now wraps the returned index * Swap timekeeping to use std::chrono * Add work in progress lsp backed whisper.vim plugin Current progress blockers are Adding modality awareness to the command processing (specifically, motion prompting) Improving the VAD to be a little more responsive (testing start of activity) * Reworked vim plugin command loop * Fix change inside Multiple bug fixes that, crucially, bring the plugin to the point where a demonstration video is possible Add better echo messaging so whisper_log isn't required Add loading complete message as indicator when listening has started Insert/append are actually included in command sets Some more heavy handed corrections to prevent a double exit when leaving insert mode As a somewhat hacky fix, the very first space is removed when inserting. This cleans up most use cases, but leaves me unsatisfied with the few cases it would be desired. * Forcibly set commandset_index to 0 after subinsert Also remove unnecessary ! to use builtin vim command * Fix upper A minor scope mistake was causing upper'd inputs to be eaten. This was fixed and echoing was slightly improved for clarity. * Fix formatting Corrects indentation to 4 spaces as project standard Slightly better error support for malformed json input * Remove obsolete vim plugin * Add json.hpp library The same library that is used for the llama.cpp server * Minor cleanups add lsp to the make clean directive. 
remove a redundant params definition. reorder whisper.vim logging for subtranscriptions Corrections to unlets (variables of argument scope appear immutable) * Fix indentation. Fallback for subTranscription Indentation has been changed to 4 spaces. Unit testing has been set up, I'm opting not to include it in the repository for now. It has, however, revealed a bug in the state logic where a subtranscription can be initiated without having a saved command. When this occurs, append is added as a fallback * Move audio polling logic to a subfunction While work on the improved vad will continue, it's grown to be a little out of scope. Instead, a future commit will perform multiple detection passes at substretches of audio when a backlog of audio exists. To facilitate this, and prevent code duplication, the vad code has been moved into a subfunction shared by both the unguided and guided transcription functions. * Test for voice over subchunks if backlog > 1s As the existing VAD implementation only checks for a falling edge at the end of an audio chunk, it fails to detect voice in cases where the recorded voice is only at the beginning of the audio. To ameliorate this, when the timestamp would cause analysis of audio over a second in length, it is split into 1 second length subchunks which are individually tested. Results are promising, but there seems to be a remaining bug with unguided transcription likely related to saving context * Limit the maximum length of audio input. This existing VAD implementation only detects falling edges, which means any gap in the user's speaking is processed for transcription. This simply establishes a constant maximum length depending on the type of transcription. Unguided gets a generous 10 seconds and guided, 2. While quick testing showed that commands are generally around a half a second to a second, limiting commands to an even second resulted in extreme degradation of quality.
(Seemingly always the same output for a given commandset) * Unguided timestamp tracking, cleanup Unguided transcriptions were not set up to allow for passing of timestamp data forward, but have been corrected. No_context is now always set to false. While conceptually desirable for the quality of guided transcription, it was seemingly responsible for prior command inputs ghosting in unguided transcription. Save and Run are now tracked by command number instead of command text. While command_text was provided for convenience, I wish to keep command index authoritative. This gives greater consistency and potentially allows for end users to rename or even translate the spoken versions of these commands * By default, maintain mode. Previously, mode was reset to 0 unless otherwise set. In addition to causing some edge cases, this didn't mesh well with the existing approach to visual mode. With this change, initial tests indicate visual mode is functional. * Add undo breaks before subtranscriptions Subtranscriptions use undo as a hack to allow for partial responses to be displayed. However, scripts don't cause an undo break mid execution unless specifically instructed to. This meant that multiple unguided transcriptions from a single session would cause a latter to undo a former. This is now fixed and undo should be reasonably usable as a command. * Append instead of insert for new undo sequence When entering and leaving insert mode with `i`, the cursor shifts one column to the left. This is remedied by using append instead of insert for setting these breaks in the undo sequence. `-` was also added to the pronunciation dictionary to be pronounced as minus as it was causing a particularly high failure rate. * Move undo sequence breaks to command execution Previously, undo sequence breaks were triggered when there was a command that caused a move to insert mode.
This caused commands that changed state (like delete or paste) to be bundled together into the last command that caused text to be entered. * Fix repeat. Add space, carrot, dollar commands Repeat (.) wasn't being tracked properly just like undo and is being manually tracked now. While efforts have been made to properly handle spaces, it was particularly finicky to add a single space when one is needed. A special 'space' command has been added to insert a single space and move the cursor after it. Carrot and Dollar commands have been added for start of line and end of line respectively. These are both simple to implement, and just a matter of defining a pronunciation. * Return error on duplicate in commandset Not every command in the commandset tokenizes to a single token. Because of this, it's possible that two commands could resolve to the same single token after subsequent tokens are discarded. This commit adds a simple check for duplicates when a commandset is registered and returns an error if so. Additional code will be required later on the vim side to actually process this error. * Add support for user-defined commands This adds a user definable dictionary from spoken keys to strings or funcrefs. All keys are added to the commandlist and when spoken, trigger the corresponding function. Like "save" and "run", these user commands are only available when the command buffer is empty. * Add readme, update cmake * Add area commandset. Refactor spoken_dict Area commands (inside word, around sentence...) have been given a commandset as considered earlier. Verbose definitions for spoken_dict entries now use dicts instead of lists. This shortens the definition for most keys that require it and scales better with the addition of further commandsets * Add mark, jump. Fix change under visual. Mark (m) and jump (') have been added.
When a visual selection was executed upon a command that initiated a subtranscription (change) the area of the visual selection is not properly tracked which causes the attempt to stream in partial response to fail. This is solved by disabling partial transcriptions from being streamed when a subtranscription is started while in visual mode. * Accommodate ignorecase. Fix change. From testing on different older versions of vim, the test for distinguishing an 'R' replace all from an 'r' replace could fail if ignorecase was set. The comparison has been changed to explicitly require case matching. Change detection has been moved to the execution section as it was missing the change+motion case. * Support registers. Fix README typo There's no logic to prevent doubled register entry, but the functional result is equivalent to if the same key order was typed into vim. A minor typo in the readme. I've mismemorized the mnemonic for 't' as 'to' instead of till, but 'to' can't be used as it's a homophone with '2'. While there was no mistake in the actual logic, it was misleading to use 'to' in the readme.	2023-08-27 21:35:06 +03:00
ardfork	cb5fb0a12d	whisper : initial hipBLAS support (#1209 )	2023-08-27 20:03:58 +03:00
Marcin Mielniczuk	66f2078878	build : fix OpenBLAS detection under Arch Linux (#1173 )	2023-08-25 19:26:34 +03:00
Eric Swanson	8ce20f0f3d	make : fix Linux machines supporting AVX1 not AVX2 (#1162 ) e.g. ancient CPU E5-2670 (v1) See issue #1126 Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-08-25 15:52:22 +03:00
alonfaraj	c5f9acf4b7	make : simplify Makefile (#1147 ) * Simplify Architecture specific in Makefile * unified OS specific check	2023-08-25 15:20:44 +03:00
Christian	ded17dc1cf	make : fix CLBlast build on MacOS (#1120 )	2023-07-25 19:12:03 +03:00
alonfaraj	a0bb409f51	make : check nvcc version and set flag (#1115 )	2023-07-25 19:10:54 +03:00
Jose	1450346214	make : tests can be called as "make tests base.en" (#1113 )	2023-07-25 19:09:38 +03:00
Vadim Peretokin	9ad35bd740	samples : add a larger (30min) sample (#1092 ) Co-authored-by: Vadim Peretokin <vadim.peretokin@carasent.com>	2023-07-25 19:00:45 +03:00
Akash Mahajan	c8d0f5fe98	whisper : support speaker segmentation (local diarization) of mono audio via tinydiarize (#1058 ) * add HuggingFace mirror to download ggml model * support tdrz via simple hack overriding solm tokens * fix incorrect translate/transcribe token_ids that are not static const * add apollo 13 sample for tdrz demo * render [SPEAKER TURN] consistently in all terminal output using vocab.id_to_token * extend whisper_segment with speaker_turn_next field and save in json output * fix failing go build * slipped in some python syntax whoops * whisper : finalize tinydiarize support (add flag + fixes) * whisper : tdrz support for word-level timestamps (respect max_len) * java : try to fix tests after adding tdrz_enable flag * main : remove TODO leftover * java : fix params order list after adding "tdrz_enable" * whisper : fix solm and add nosp token * main : print tinydiarize help --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-07-04 09:45:00 +03:00
Georgi Gerganov	8ba42095c5	Revert "ggml : do not use _GNU_SOURCE gratuitously (#1027 )" This reverts commit `3f7a03ebe3`.	2023-07-02 21:53:52 +03:00
Przemysław Pawełczyk	85ed71aaec	talk-llama : fix build on macOS (#1062 ) * talk-llama : use posix_madvise() instead of madvise() derived from BSD sed -i 's,\<madvise\>,posix_&,g;s,\<MADV_,POSIX_&,g' examples/talk-llama/llama-util.h * make : enable Darwin extensions for macOS builds This is an attempt at fixing macOS build error coming from the fact that RLIMIT_MEMLOCK define is not available there without Darwin extensions.	2023-06-28 22:34:50 +03:00
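
Note on commit ba3c333611 above: its message describes finding x86 ISA extension names case-insensitively and as whole words, with Linux reporting SSE3 as PNI and macOS (Intel) reporting AVX as AVX1.0. The following is only a rough Python sketch of that matching idea, not the Makefile's actual logic. The function name `detect_isa` and the contents of the alias table are hypothetical; only the `pni` and `AVX1.0` spellings come from the commit message.

```python
import re

# Hypothetical alias table: canonical name -> spellings a host may report.
# "pni" (Linux) and "avx1.0" (macOS) are taken from the commit message;
# everything else here is illustrative.
ALIASES = {
    "SSE3": ["sse3", "pni"],
    "SSSE3": ["ssse3"],
    "AVX": ["avx", "avx1.0"],
    "AVX2": ["avx2"],
}


def detect_isa(cpu_features):
    """Return canonical ISA names found as whole words, ignoring case."""
    found = set()
    for name, spellings in ALIASES.items():
        for spelling in spellings:
            # The lookarounds keep "sse3" from matching inside "ssse3"
            # and "avx" from matching inside "avx2" or "avx1.0".
            pattern = r"(?<![\w.])" + re.escape(spelling) + r"(?![\w.])"
            if re.search(pattern, cpu_features, re.IGNORECASE):
                found.add(name)
                break
    return found
```

Whole-word matching is the point of the sketch: a plain substring search would report SSE3 on a host that only lists SSSE3, and AVX on any host that lists AVX2.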


113 Commits
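
Note on commit 175ffa64ee above ("Return error on duplicate in commandset"): the message explains that commands are matched on a single token, so two multi-token commands can collide once later tokens are discarded, and that registration now checks for this. A minimal Python sketch of such a check follows, under stated assumptions: `register_commandset` here is a hypothetical stand-in rather than the LSP server's function, whitespace splitting stands in for the Whisper tokenizer, commands are assumed non-empty, and raising `ValueError` stands in for whatever error the server actually returns.

```python
def register_commandset(commands, tokenize=str.split):
    """Map each command's first token to its index, rejecting collisions.

    `tokenize` is a placeholder for the real tokenizer (assumption);
    only the first token of each command is kept.
    """
    first_tokens = {}
    for index, command in enumerate(commands):
        token = tokenize(command)[0]
        if token in first_tokens:
            other = commands[first_tokens[token]]
            raise ValueError(
                f"commands {other!r} and {command!r} share first token {token!r}"
            )
        first_tokens[token] = index
    return first_tokens
```

For example, `register_commandset(["up", "down", "delete word"])` succeeds, while a set containing both "delete word" and "delete line" is rejected because both reduce to the token "delete".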