Commit Graph

734 Commits

Author SHA1 Message Date
Georgi Gerganov
cf67bfffa0 Fix EOT token handling
If it is the end of the audio, pick all sampled tokens.
Otherwise, print error message.
2022-10-18 00:53:06 +03:00
Georgi Gerganov
91632eb6ea Revert GELU change
Seems it does not work on x86 for some reason
2022-10-18 00:45:08 +03:00
Georgi Gerganov
b81a81d543 Link Accelerate framework to "stream" example 2022-10-18 00:12:51 +03:00
Georgi Gerganov
d14823582d Try to improve the sampling strategy a bit
It still fails sometimes when it does not sample a timestamp token for
the entire segment. We now print a message in such cases.
2022-10-18 00:12:51 +03:00
Georgi Gerganov
20d8e7a309 Fix memory sizes 2022-10-18 00:12:51 +03:00
Georgi Gerganov
72d967bce4 Use Accelerate framework on Apple silicon
Huge performance improvement in the Encode (almost x2 on MacBook M1 Pro)

Also various extra optimizations:

- Multi-threaded NORM operator
- Faster GELU via F16 cast
2022-10-18 00:12:51 +03:00
Georgi Gerganov
130b5c02d6 Adding helper script for converting the PT models 2022-10-18 00:12:51 +03:00
Georgi Gerganov
0e858f080d
close #56 : build on FreeBSD
Thanks to @abelbabel for the contribution
2022-10-17 18:10:16 +03:00
Georgi Gerganov
f24d940ca9
Merge pull request #58 from r0y6a3n0/master
fix decode missing token issue
2022-10-17 18:06:02 +03:00
RyanChang
949f97a8b4 fix missing token issue 2022-10-17 21:19:45 +08:00
Georgi Gerganov
0ad085f5e8
ref #48 : clear results at the start of whisper_full
This way, even if the input audio is empty, the previous results will be
removed.
2022-10-15 09:55:28 +03:00
Georgi Gerganov
36945162fa
Update README.md (ref #50) 2022-10-15 09:40:08 +03:00
Georgi Gerganov
b2f1600aa3
Update README.md 2022-10-12 21:25:42 +03:00
0/0
b799226973 check if spectrogram length is <100 before doing anything else
fixes #39
2022-10-12 07:32:42 +03:00
Topping1
1348796a93
Update README.md (#43)
* Update README.md

Updated README.md to list new features, such as subtitle file support (VTT and SRT)

* Update README.md

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2022-10-12 07:32:14 +03:00
Georgi Gerganov
40609cb49b
Merge pull request #42 from iboB/msvc-build
ref #5 : MSVC build
2022-10-12 07:31:41 +03:00
Borislav Stanimirov
0b45d25151 Building with MSVC 2022-10-11 21:40:46 +03:00
Borislav Stanimirov
28252352d7 Visual Studio ignored dirs 2022-10-11 20:57:33 +03:00
Georgi Gerganov
8d94358251
Update README.md 2022-10-11 00:36:32 +03:00
Georgi Gerganov
ad6693fb64
Update README.md 2022-10-10 22:16:25 +03:00
Georgi Gerganov
01c9e96f64
stream : improve real-time transcription 2022-10-10 22:06:27 +03:00
Georgi Gerganov
63b6786767
Minor 2022-10-10 22:06:27 +03:00
Georgi Gerganov
f7ab81fe51
Update README.md 2022-10-10 22:05:37 +03:00
Georgi Gerganov
eac4f12777
Merge pull request #36 from Topping1/master
Fix SRT timestamp format from mm:ss.sss to hh:mm:ss.sss
2022-10-10 09:13:31 +03:00
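
To illustrate the format change named in the merge above, here is a minimal Python sketch of hh:mm:ss.sss formatting. It assumes timestamps are held as integer milliseconds; the function name `to_timestamp` is hypothetical and is not taken from the repository, whose actual implementation is not shown in this log.

```python
def to_timestamp(ms: int) -> str:
    """Format a non-negative millisecond count as hh:mm:ss.sss.

    Hypothetical helper for illustration only.
    """
    if ms < 0:
        raise ValueError("ms must be non-negative")
    # Peel off hours, then minutes, then seconds; the remainder is milliseconds.
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"
```

Carrying the hours field explicitly keeps timestamps unambiguous for audio longer than one hour, which a mm:ss.sss layout cannot represent cleanly.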
Georgi Gerganov
9d5723435f
ref #35 : add <stdbool.h> to whisper.h
"bool" type is not implicitly defined for some compilers.
2022-10-10 08:11:18 +03:00
Georgi Gerganov
6e29d8453c
Merge pull request #34 from tazz4843/master
Add static library make target
2022-10-10 08:05:57 +03:00
Topping1
50b5fe964c
Update main.cpp 2022-10-09 23:35:10 -05:00
0/0
64752acd27
add static library make target 2022-10-09 19:16:42 -06:00
Georgi Gerganov
7edaa7da4b
Merge pull request #31 from lkwq007/master
Add MinGW support
2022-10-09 17:52:46 +03:00
lnyan
4bbb8a587b Add MinGW support 2022-10-09 22:26:37 +08:00
Georgi Gerganov
4a6bf11db3 Minor 2022-10-08 18:13:26 +03:00
Georgi Gerganov
9bbca3110f ref #9 : add API documentation in whisper.h 2022-10-08 18:09:56 +03:00
Georgi Gerganov
5e563ef635 Fix Makefile for MacBook Intel 2022-10-08 17:35:55 +03:00
Georgi Gerganov
2ca8cc77b2 ref #17 : print whisper logs to stderr
Only the transcribed/translated text is printed to stdout.
This way, one can redirect the result to a file.
2022-10-08 17:28:06 +03:00
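
The stdout/stderr split described in the commit above can be sketched in Python as follows. This is an illustration of the general pattern, not the project's code; the helper names `log` and `emit` and the optional `stream` parameter are assumptions made for the example.

```python
import sys


def log(message, stream=None):
    """Write a diagnostic line to stderr (or to the given stream)."""
    print(message, file=stream if stream is not None else sys.stderr)


def emit(text, stream=None):
    """Write result text to stdout (or to the given stream)."""
    print(text, file=stream if stream is not None else sys.stdout)


if __name__ == "__main__":
    log("loading model ...")  # diagnostics: not captured by "> file"
    emit("hello world")       # result text: captured by "> file"
```

If this were saved as a script and run with its standard output redirected to a file, only the `emit` text would land in the file, while the `log` line would still appear on the terminal.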
Georgi Gerganov
8c7c018893 ref #17 : add options to output result to file
Support for:

- plain text
- VTT
- SRT
2022-10-08 17:22:22 +03:00
Georgi Gerganov
4c4ab71d4d
Update README.md 2022-10-08 11:46:34 +03:00
Georgi Gerganov
b43b36e006 Update tests 2022-10-08 11:43:42 +03:00
Georgi Gerganov
37110d693e ci : add base model tests to GH Actions 2022-10-08 11:43:42 +03:00
Georgi Gerganov
2d47693435 Update README.md 2022-10-08 11:43:42 +03:00
Georgi Gerganov
a53e06757f Create README.md 2022-10-08 11:43:42 +03:00
Georgi Gerganov
0e3ba2f9fc Adding dummy models for testing purposes 2022-10-08 11:43:42 +03:00
Georgi Gerganov
2f069335ab Adding sanitizer tests 2022-10-08 11:43:42 +03:00
Georgi Gerganov
29b041f79b Cleanup CMakeLists.txt 2022-10-08 09:02:41 +03:00
Georgi Gerganov
4a732b2879 cmake : fixes 2022-10-08 09:02:41 +03:00
Georgi Gerganov
68f5962be6 ci : add cmake builds 2022-10-08 09:02:41 +03:00
Georgi Gerganov
332c9d77fe whisper : fix bug in token sampling logic
Could overflow buffer
2022-10-08 09:02:41 +03:00
Georgi Gerganov
877c058179 Add CMake support 2022-10-08 09:02:41 +03:00
Georgi Gerganov
481cd685d5
ref #10 : option to keep context in "stream" example
Seems the results become worse when we keep the context, so by default
this is not enabled
2022-10-07 22:30:44 +03:00
Georgi Gerganov
3f15bb8a08
ref #10 : add "step" argument for "stream" example
Controls how often we run the inference.
By default, we run it every 3 seconds.
2022-10-07 22:07:24 +03:00
Georgi Gerganov
7787b878e1
ref #16, #22 : add "offset" argument
Allows processing of the input audio to start at some offset from the
beginning. Useful for splitting a long job into multiple tasks.
2022-10-07 22:00:40 +03:00
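
One way the "offset" idea above could be used to split a long job is sketched below in Python. The function name `task_offsets`, the millisecond unit, and the fixed chunk length are all assumptions for illustration; the log does not say what unit the argument takes or how the tasks are meant to be sized.

```python
def task_offsets(total_ms: int, chunk_ms: int) -> list[int]:
    """Return the start offset of each task when a recording of
    total_ms milliseconds is processed in chunks of chunk_ms.

    Hypothetical helper; the last chunk may be shorter than chunk_ms.
    """
    if total_ms < 0 or chunk_ms <= 0:
        raise ValueError("total_ms must be >= 0 and chunk_ms must be > 0")
    return list(range(0, total_ms, chunk_ms))
```

Each returned value would be handed to a separate run as its starting offset, so the runs can proceed independently.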