Commit Graph

754 Commits

Author SHA1 Message Date
Evan Martin
fabf79fc67
whisper : expose API to let user control log output (#1060)
* expose api to let user control log output

Add
  whisper_set_log_callback()
that lets user set a callback for log messages.

Change all the
  fprintf(stderr, ...)
to call via the above.

* whisper : add <cstdarg>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-25 18:58:25 +03:00
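The pattern this commit describes, routing every `fprintf(stderr, ...)` through a user-settable callback, can be sketched roughly as below. This is a self-contained illustration, not the library's code: `demo_log_callback`, `demo_set_log_callback`, `demo_log`, and `demo_capture` are hypothetical names, and the fixed 1024-byte buffer and the fallback to stderr on a null callback are assumptions made for the sketch.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical callback type: receives one already-formatted log line.
typedef void (*demo_log_callback)(const char * line);

// Default sink: behaves like the old fprintf(stderr, ...) calls.
void demo_default_log(const char * line) {
    fputs(line, stderr);
}

// Currently installed sink (assumption: one global sink, not per-context).
demo_log_callback g_demo_log = demo_default_log;

// Install a user callback; a null pointer restores the default sink.
void demo_set_log_callback(demo_log_callback cb) {
    g_demo_log = cb ? cb : demo_default_log;
}

// printf-style helper used in place of fprintf(stderr, ...).
// Needs <cstdarg> for va_list, which is why the commit adds that header.
void demo_log(const char * fmt, ...) {
    assert(fmt != nullptr);
    char buf[1024]; // assumption: longer lines are truncated by vsnprintf
    va_list args;
    va_start(args, fmt);
    vsnprintf(buf, sizeof(buf), fmt, args);
    va_end(args);
    g_demo_log(buf);
}

// Example user sink: collect log output in memory instead of printing it.
std::string g_demo_captured;
void demo_capture(const char * line) {
    g_demo_captured += line;
}
```

A caller would install `demo_capture` once with `demo_set_log_callback(demo_capture)`; every later `demo_log(...)` call then appends to `g_demo_captured` instead of writing to stderr.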
Hrishikesh Barman
925915ae37
whisper : move progress calculation out of whisper.cpp (#1081)
Previously, `progress_step` was hardcoded into whisper.cpp. As a result,
bindings could only observe progress at that step, even though the progress
callback could have been called at every iteration.

With this change, whisper.cpp reports progress at a finer granularity, and
bindings/implementations can define their own progress step.
2023-07-25 18:53:34 +03:00
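A rough sketch of the split this commit describes: the library reports progress on every iteration, and the caller decides how often to act on it. All names here (`demo_progress_callback`, `demo_process`, `demo_progress_state`, `demo_on_progress`) are hypothetical, and the callback signature is simplified for illustration rather than copied from whisper.h.

```cpp
#include <cassert>
#include <vector>

// Hypothetical callback type: progress is a percentage in [0, 100].
typedef void (*demo_progress_callback)(int progress, void * user_data);

// "Library" side: no step logic, just report after every chunk of work.
void demo_process(int n_chunks, demo_progress_callback cb, void * user_data) {
    assert(n_chunks > 0);
    for (int i = 0; i < n_chunks; ++i) {
        // ... one chunk of work would happen here ...
        if (cb) {
            cb((100 * (i + 1)) / n_chunks, user_data);
        }
    }
}

// "Caller" side: the step now lives with the caller, not the library.
struct demo_progress_state {
    int step;                  // caller-chosen reporting step, in percent
    int prev;                  // last threshold that was crossed
    std::vector<int> reported; // progress values the caller acted on
};

// Act on a progress value only when it crosses the next step threshold.
void demo_on_progress(int progress, void * user_data) {
    demo_progress_state * st = static_cast<demo_progress_state *>(user_data);
    if (progress >= st->prev + st->step) {
        st->prev += st->step;
        st->reported.push_back(progress);
    }
}
```

With ten chunks and a step of 25, the callback fires ten times but the caller records only the values that cross a 25% threshold; a caller wanting every update would simply use a smaller step.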
AustinMroz
97f4a7fee0
examples : add Vim plugin (#1131)
* Initial proof of concept Vim plugin

At present, this is likely only slightly better than feature parity with
the existing whisper.nvim

Known issues:
- Trailing whitespace
- Up to an existing length (5 seconds) of speech may be processed when
  listening is enabled
- CPU cycles are spent processing speech even when not listening

Fixing these issues is likely dependent upon future efforts to create a
dedicated library instead of wrapping examples/stream

* Support $WHISPER_CPP_HOME environment variable

A minor misunderstanding of the whisper.nvim implementation resulted in
a plugin that was functional but not a drop-in replacement. It should be
one now.
2023-07-25 18:34:23 +03:00
alonfaraj
3998465721
ci : more platforms coverage (#1101)
* add multi platform

* add image name

* fix

* fix /bin/sh path

* add missing \

* add all platforms for check

* remove platforms

* remove s390x

* - add arm v6
- format run cmd

* remove arm v6

* - bump checkout to v3
- use setup emsdk action
- add arch to all ubuntu jobs

* mymindstorm/setup-emsdk to v12

* add missing QEMU step

* add fail-fast: false for debug

* add freebsd

* remark all jobs except freebsd for test

* add sudo

* enable all tests again

* format

* check __AVX__ support before include immintrin.h

* try auto detect flag by cmake

* fix check for immintrin.h

* fix include check for immintrin.h

* Remove all platforms for sanitizer build except amd64

We have no clue why they failed.

---------

Co-authored-by: Alon Faraj <alon.faraj@mapcore.com>
2023-07-16 23:00:34 +03:00
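The `__AVX__` check mentioned in the bullets above can be sketched as follows. This is only an illustration of guarding the include so that non-x86 or non-AVX builds never see `immintrin.h`; `DEMO_HAVE_AVX` and `demo_sum8` are hypothetical names, and the project's real guard may test different macros.

```cpp
#include <cassert>

// Only pull in the x86 intrinsics header when the compiler says AVX is on.
// On other platforms (or without -mavx) the header is never included.
#if defined(__AVX__)
#include <immintrin.h>
#define DEMO_HAVE_AVX 1
#else
#define DEMO_HAVE_AVX 0
#endif

// Sum 8 floats: AVX path when available, plain scalar loop otherwise.
float demo_sum8(const float * x) {
    assert(x != nullptr);
#if DEMO_HAVE_AVX
    __m256 v  = _mm256_loadu_ps(x);            // load 8 unaligned floats
    __m128 lo = _mm256_castps256_ps128(v);     // lower 4 lanes
    __m128 hi = _mm256_extractf128_ps(v, 1);   // upper 4 lanes
    __m128 s  = _mm_add_ps(lo, hi);            // 4 partial sums
    s = _mm_hadd_ps(s, s);                     // 2 partial sums
    s = _mm_hadd_ps(s, s);                     // total in lane 0
    return _mm_cvtss_f32(s);
#else
    float sum = 0.0f;
    for (int i = 0; i < 8; ++i) {
        sum += x[i];
    }
    return sum;
#endif
}
```

Both paths give the same result for small integer-valued inputs, so the function behaves identically whether or not the build enables AVX.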
Georgi Gerganov
4774d2feb0
whisper : minor OpenVINO refactoring (#1037)
Hopefully I didn't break something - haven't tested
2023-07-04 20:28:27 +03:00
Travis Cline
6f0114f4a6
go : call SetDuration appropriately (#1077) 2023-07-04 16:13:25 +03:00
Murilo Santana
66616dbd4d
go : fix context.Process call in examples (#1067) 2023-07-04 16:05:35 +03:00
Ryan Metcalfe
62b81276e0
whisper : add OpenVINO support (#1037)
* openvino: use OpenVINO encoder inference

* openvino: add python script for OpenVINO model generation

* whisper: Fix 'unused' warnings when OpenVINO isn't enabled in build

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* whisper: Fix compilation error

* whisper: revert whisper_get_openvino_path_encoder & whisper_get_openvino_path_cache to non-const func signatures

* cmake: Add openvino-encoder as separate object target

* whisper : minor style fixes

* minor : indentation fixes

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-04 15:56:11 +03:00
Martin Warnaar
176d7e4e7b
readme : better wording (#1064) 2023-07-04 15:30:31 +03:00
Georgi Gerganov
70e6fcd78b
readme : add tinydiarize instructions (#1058) 2023-07-04 09:51:22 +03:00
Akash Mahajan
c8d0f5fe98
whisper : support speaker segmentation (local diarization) of mono audio via tinydiarize (#1058)
* add HuggingFace mirror to download ggml model

* support tdrz via simple hack overriding solm tokens

* fix incorrect translate/transcribe token_ids that are not static const

* add apollo 13 sample for tdrz demo

* render [SPEAKER TURN] consistently in all terminal output using vocab.id_to_token

* extend whisper_segment with speaker_turn_next field and save in json output

* fix failing go build

* slipped in some python syntax whoops

* whisper : finalize tinydiarize support (add flag + fixes)

* whisper : tdrz support for word-level timestamps (respect max_len)

* java : try to fix tests after adding tdrz_enable flag

* main : remove TODO leftover

* java : fix params order list after adding "tdrz_enable"

* whisper : fix solm and add nosp token

* main : print tinydiarize help

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-04 09:45:00 +03:00
Georgi Gerganov
fdf58a6668
talk-llama : fix new rope interface 2023-07-03 19:24:01 +03:00
Georgi Gerganov
8ba42095c5
Revert "ggml : do not use _GNU_SOURCE gratuitously (#1027)"
This reverts commit 3f7a03ebe3.
2023-07-02 21:53:52 +03:00
Georgi Gerganov
d6509bf78d
ggml : sync latest repo (mostly refactoring changes) 2023-07-02 21:46:09 +03:00
Przemysław Pawełczyk
85ed71aaec
talk-llama : fix build on macOS (#1062)
* talk-llama : use posix_madvise() instead of madvise() derived from BSD

sed -i 's,\<madvise\>,posix_&,g;s,\<MADV_,POSIX_&,g' examples/talk-llama/llama-util.h

* make : enable Darwin extensions for macOS builds

This is an attempt at fixing a macOS build error caused by the
RLIMIT_MEMLOCK define not being available there without Darwin extensions.
2023-06-28 22:34:50 +03:00
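For readers unfamiliar with the POSIX variant that the `sed` substitution above switches to, here is a minimal sketch of calling `posix_madvise()` with a `POSIX_MADV_*` constant. The helper name `demo_advise_sequential`, the use of `posix_memalign` for a page-aligned buffer, and the choice of `POSIX_MADV_SEQUENTIAL` are assumptions made for the example; the commit log does not say which advice values the real code passes.

```cpp
#include <cassert>
#include <cstddef>
#include <stdlib.h>   // posix_memalign, free
#include <sys/mman.h> // posix_madvise, POSIX_MADV_*
#include <unistd.h>   // sysconf

// Allocate n_pages page-aligned pages and advise the kernel that they will
// be read sequentially. Returns 0 on success, -1 on invalid input, or the
// error number reported by posix_memalign / posix_madvise.
int demo_advise_sequential(std::size_t n_pages) {
    const long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || n_pages == 0) {
        return -1;
    }

    const std::size_t len = n_pages * static_cast<std::size_t>(page);
    void * buf = nullptr;
    int rc = posix_memalign(&buf, static_cast<std::size_t>(page), len);
    if (rc != 0) {
        return rc;
    }

    // Unlike BSD madvise(), posix_madvise() returns the error number
    // directly instead of returning -1 and setting errno.
    rc = posix_madvise(buf, len, POSIX_MADV_SEQUENTIAL);

    free(buf);
    return rc;
}
```

The advice is only a hint, so the program should behave the same whether or not the call succeeds; checking the return value mainly helps when debugging.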
thefinaldegree
49c9472fa0
extra : update 'quantize-all.sh' to quantize all downloaded models (#1054)
Script will now do what it says: quantize everything except testing models in the 'models' directory.
2023-06-28 22:07:02 +03:00
Georgi Gerganov
72deb41eb2
whisper : split_on_word no longer trims (#1046) 2023-06-25 23:51:01 +03:00
Przemysław Pawełczyk
3f7a03ebe3
ggml : do not use _GNU_SOURCE gratuitously (#1027)
* Do not use _GNU_SOURCE gratuitously.

What is needed to build whisper.cpp and examples is availability of
stuff defined in The Open Group Base Specifications Issue 6
(https://pubs.opengroup.org/onlinepubs/009695399/) known also as
Single Unix Specification v3 (SUSv3) or POSIX.1-2001 + XSI extensions.

There is no need to penalize musl libc which simply follows standards.

Not having feature test macros in source code gives greater flexibility
to those wanting to reuse it in 3rd party app, as they can build it with
minimal FTM (_XOPEN_SOURCE=600) or other FTM depending on their needs.

It builds without issues in Alpine (musl libc), Ubuntu (glibc), MSYS2.

* examples : include SDL headers before other headers

This is an attempt at fixing a macOS build error caused by SDL2 relying
on the Darwin extensions memset_pattern4/8/16 from Apple's string.h.
2023-06-25 16:34:30 +03:00
Przemysław Pawełczyk
62642bb61c
talk-llama : fix build after ggml sync (#1049)
sed -i 's,GGML_BACKEND_CUDA,GGML_BACKEND_GPU,g' examples/talk-llama/llama.cpp
2023-06-25 16:13:50 +03:00
Georgi Gerganov
f1c9df5806
metal : sync ggml-metal (ref #1047) 2023-06-25 15:40:39 +03:00
Georgi Gerganov
6c25fae1c4
opencl : sync latest ggml-opencl 2023-06-25 15:38:30 +03:00
Philippe Normand
44cb044e66
whisper : fix build with -Werror=undef (#1045) 2023-06-25 15:30:39 +03:00
Simon Moisselin
6c68218e3c
models : add ggml_to_pt script (#1042)
* adding ggml_to_pt

* typo sys too many args

* fixing swap errors dimensions

---------

Co-authored-by: simonMoisselin <simon.moisselin@gmail.com>
2023-06-25 15:29:54 +03:00
Roddur Dasgupta
f11f33f1c0
models : cd statements are quoted to allow spaces in path (#1041) 2023-06-25 15:27:28 +03:00
Georgi Gerganov
8ac23c9f77
models : handle paths with spaces in download script (close #1038) 2023-06-25 15:23:23 +03:00
Colin
14baf2e7f3
main : add diarization support for all current output types (#1031)
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-06-25 15:07:57 +03:00
GiviMAD
bc2dcf85fe
readme : add java alternative binding (#1029)
Signed-off-by: Miguel Álvarez <miguelwork92@gmail.com>
2023-06-25 14:46:07 +03:00
Jay Binks
1e45911f1a
go : add support for whisper_full_lang_id() (#1010)
* Add support for whisper_full_lang_id() to go bindings

* Expose token.id so we can test beg, eot etc

---------

Co-authored-by: Jay Binks <jay.binks@overthewire.com.au>
2023-06-25 14:45:33 +03:00
Georgi Gerganov
67564201ec
go : fix "cb" -> "callNewSegment" 2023-06-25 14:34:10 +03:00
Georgi Gerganov
5feb0dffba
ggml : sync latest ggml lib 2023-06-25 14:30:44 +03:00
Bo-Yi Wu
7dfc11843c
go : improve progress reporting and callback handling (#1024)
- Rename `cb` to `callNewSegment` in the `Process` function
- Add `callProgress` as a new parameter to the `Process` function
- Introduce `ProgressCallback` type for reporting progress during processing
- Update `Whisper_full` function to include `progressCallback` parameter
- Add `registerProgressCallback` function and `cbProgress` map for handling progress callbacks

Signed-off-by: appleboy <appleboy.tw@gmail.com>
2023-06-25 14:07:55 +03:00
byte-6174
6a7f3b8db2
make : update cuBLAS build both x86 and aarch64 (#1015)
make cuBLAS compilation compatible with x86 as well as aarch64.
2023-06-25 13:59:48 +03:00
KP Kaiser
207a12f5bc
make : fix for CUDA native not working as an option on Ubuntu (#1012) 2023-06-25 13:57:18 +03:00
faker
26b70395ff
main : exit gracefully when invalid params are passed
* Refactor whisper_params_parse to return false on failure

* Updated help flag behavior
2023-06-25 13:52:29 +03:00
faker
598f607e28
main : gracefully exit when invalid params are passed (#1002)
* Refactor whisper_params_parse to return false on failure

* Updated help flag behavior
2023-06-25 13:51:59 +03:00
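The refactor named in this commit, having the parser return `false` on failure so the caller can exit cleanly, might look roughly like the sketch below. It is not the example program's actual parser: `demo_params`, `demo_params_parse`, and `demo_main` are hypothetical names, and only two illustrative flags are handled.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical parameter set with illustrative defaults.
struct demo_params {
    int         n_threads = 4;
    std::string model     = "models/ggml-base.en.bin";
};

// Parse argv into params. Returns false on any invalid input instead of
// calling exit() from inside the parser.
bool demo_params_parse(int argc, const char ** argv, demo_params & params) {
    for (int i = 1; i < argc; ++i) {
        const std::string arg = argv[i];
        if (arg == "-t" || arg == "--threads") {
            if (i + 1 >= argc) {
                fprintf(stderr, "error: missing value for %s\n", arg.c_str());
                return false;
            }
            params.n_threads = std::atoi(argv[++i]);
        } else if (arg == "-m" || arg == "--model") {
            if (i + 1 >= argc) {
                fprintf(stderr, "error: missing value for %s\n", arg.c_str());
                return false;
            }
            params.model = argv[++i];
        } else {
            fprintf(stderr, "error: unknown argument: %s\n", arg.c_str());
            return false;
        }
    }
    return true;
}

// Stand-in for main(): the caller decides how to exit, so cleanup code
// and destructors still run on bad input.
int demo_main(int argc, const char ** argv) {
    demo_params params;
    if (!demo_params_parse(argc, argv, params)) {
        return 1; // graceful, non-zero exit
    }
    // ... normal processing would go here ...
    return 0;
}
```

Keeping the exit decision in the caller also makes the parser easy to exercise from tests, since a bad argument list just yields `false`.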
Akash Mahajan
3ec7bfffe0
py : make convert-pt-to-ggml.py backwards compatible with older vocab.json tokenizer files (#1001)
* patch checkpoint convert script to keep compatibility with older hf_transformers whisper tokenizer

* typo fix
2023-06-25 13:50:14 +03:00
Larry Battle
a7f822ef59
readme : corrected syntax for markdown link (#995) 2023-06-25 13:46:44 +03:00
Nicholas Albion
57543c169e
updated java README 2023-06-06 10:27:26 +10:00
Nicholas Albion
5b9e59bc07
speak scripts for Windows 2023-06-01 22:45:00 +10:00
Nicholas Albion
3f7436e8a0
updated README for java 2023-06-01 16:55:48 +10:00
geniusnut
ce6f747064
whisper.android : support decode wav file has 2 channels (#972) 2023-05-31 10:13:14 +03:00
Nicholas Albion
d7c936b44a
Feature/java bindings2 (#944)
* Java needs to call `whisper_full_default_params_by_ref()`; returning the struct by value does not seem to work.
* added convenience methods to WhisperFullParams
* Remove unused WhisperJavaParams
2023-05-29 09:38:58 +10:00
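The by-value versus by-reference distinction in the first bullet can be illustrated with a small sketch. The names below (`demo_full_params`, `demo_full_default_params`, `demo_full_default_params_by_ref`, `demo_free_params`) are hypothetical stand-ins, and the explanation that foreign-function layers tend to handle a returned pointer more easily than a struct returned by value is an assumption, not something the commit log states.

```cpp
#include <cassert>

extern "C" {

// Hypothetical plain-C parameter struct.
struct demo_full_params {
    int strategy;
    int n_threads;
    int translate;
};

// Original style: return the struct by value.
demo_full_params demo_full_default_params(int strategy) {
    demo_full_params params;
    params.strategy  = strategy;
    params.n_threads = 4;
    params.translate = 0;
    return params;
}

// Binding-friendly wrapper: copy the defaults to the heap and return a
// pointer. The caller owns the result and must release it.
demo_full_params * demo_full_default_params_by_ref(int strategy) {
    return new demo_full_params(demo_full_default_params(strategy));
}

// Matching release function, so the memory is freed by the same runtime
// that allocated it.
void demo_free_params(demo_full_params * params) {
    delete params;
}

} // extern "C"
```

A binding would call the `_by_ref` variant, read or modify fields through the pointer, and call the release function when done.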
genevera (she/her)
9b926844e3
models : fix README.md (#964)
Fixes typo on line 76 of models/README.md
2023-05-27 10:40:28 +03:00
DGdev91
5e2b3407ef
examples : update elevenlabs scripts to use official python API (#837)
* Update elevenlabs example to use official python API
2023-05-24 21:11:01 +03:00
0xsourcecode
4e16a8fb63
readme : highlight OpenBLAS support (#956)
* highlight openblas support

* Update README.md
2023-05-24 11:23:51 +03:00
Georgi Gerganov
77eab3fbfe
talk-llama : sync latest llama.cpp (close #922, close #954) 2023-05-23 14:04:39 +03:00
Alexey Kharlamov
041be06d58
cmake : build with any BLAS compatible library (#927)
* Build with any BLAS library

* ci: Removed explicit CUDA nvcc path
2023-05-20 21:23:45 +03:00
Georgi Gerganov
429b9785c0
ggml : update WASM SIMD 2023-05-20 20:00:06 +03:00
Georgi Gerganov
e410cfc3ce
ggml : sync latest ggml repo
- new Q4 and Q8 quantization
- updated CUDA
2023-05-20 18:56:30 +03:00
Nicholas Albion
bc89f285d8
bindings : add java bindings (#931)
* WIP - java bindings

* updated README

* failed attempt at JNI

* fullTranscribe() test passes

* tested on Ubuntu 20

* link to Java bindings
2023-05-20 18:25:02 +03:00