Georgi Gerganov
cf67bfffa0
Fix EOT token handling
...
If it is the end of the audio, pick all sampled tokens.
Otherwise, print error message.
2022-10-18 00:53:06 +03:00
Georgi Gerganov
91632eb6ea
Revert GELU change
...
Seems it does not work on x86 for some reason
2022-10-18 00:45:08 +03:00
Georgi Gerganov
b81a81d543
Link Accelerate framework to "stream" example
2022-10-18 00:12:51 +03:00
Georgi Gerganov
d14823582d
Try to improve the sampling strategy a bit
...
It still fails sometimes when it does not sample a timestamp token for
the entire segment. We now print a message in such cases.
2022-10-18 00:12:51 +03:00
Georgi Gerganov
20d8e7a309
Fix memory sizes
2022-10-18 00:12:51 +03:00
Georgi Gerganov
72d967bce4
Use Accelerate framework on Apple silicon
...
Huge performance improvement in the Encode (almost x2 on MacBook M1 Pro)
Also various extra optimizations:
- Multi-threaded NORM operator
- Faster GELU via F16 cast
2022-10-18 00:12:51 +03:00
Georgi Gerganov
130b5c02d6
Adding helper script for converting the PT models
2022-10-18 00:12:51 +03:00
Georgi Gerganov
0e858f080d
close #56 : build on FreeBSD
...
Thanks to @abelbabel for the contribution
2022-10-17 18:10:16 +03:00
Georgi Gerganov
f24d940ca9
Merge pull request #58 from r0y6a3n0/master
...
fix decode missing token issue
2022-10-17 18:06:02 +03:00
RyanChang
949f97a8b4
fix missing token issue
2022-10-17 21:19:45 +08:00
Georgi Gerganov
0ad085f5e8
ref #48 : clear results at the start of whisper_full
...
This way, even if the input audio is empty, the previous results will be
removed.
2022-10-15 09:55:28 +03:00
Georgi Gerganov
36945162fa
Update README.md (ref #50 )
2022-10-15 09:40:08 +03:00
Georgi Gerganov
b2f1600aa3
Update README.md
2022-10-12 21:25:42 +03:00
0/0
b799226973
check if spectrogram length is <100 before doing anything else
...
fixes #39
2022-10-12 07:32:42 +03:00
Topping1
1348796a93
Update README.md ( #43 )
...
* Update README.md
Updated README.md to list new features, such as subtitle file support (VTT and SRT)
* Update README.md
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2022-10-12 07:32:14 +03:00
Georgi Gerganov
40609cb49b
Merge pull request #42 from iboB/msvc-build
...
ref #5 : MSVC build
2022-10-12 07:31:41 +03:00
Borislav Stanimirov
0b45d25151
Building with MSVC
2022-10-11 21:40:46 +03:00
Borislav Stanimirov
28252352d7
Visual Studio ignored dirs
2022-10-11 20:57:33 +03:00
Georgi Gerganov
8d94358251
Update README.md
2022-10-11 00:36:32 +03:00
Georgi Gerganov
ad6693fb64
Update README.md
2022-10-10 22:16:25 +03:00
Georgi Gerganov
01c9e96f64
stream : improve real-time transcription
2022-10-10 22:06:27 +03:00
Georgi Gerganov
63b6786767
Minor
2022-10-10 22:06:27 +03:00
Georgi Gerganov
f7ab81fe51
Update README.md
2022-10-10 22:05:37 +03:00
Georgi Gerganov
eac4f12777
Merge pull request #36 from Topping1/master
...
Fix SRT timestamp format from mm:ss.sss to hh:mm:ss.sss
2022-10-10 09:13:31 +03:00
Georgi Gerganov
9d5723435f
ref #35 : add <stdbool.h> to whisper.h
...
"bool" type is not implicitly defined for some compilers.
2022-10-10 08:11:18 +03:00
Georgi Gerganov
6e29d8453c
Merge pull request #34 from tazz4843/master
...
Add static library make target
2022-10-10 08:05:57 +03:00
Topping1
50b5fe964c
Update main.cpp
2022-10-09 23:35:10 -05:00
0/0
64752acd27
add static library make target
2022-10-09 19:16:42 -06:00
Georgi Gerganov
7edaa7da4b
Merge pull request #31 from lkwq007/master
...
Add MinGW support
2022-10-09 17:52:46 +03:00
lnyan
4bbb8a587b
Add MinGW support
2022-10-09 22:26:37 +08:00
Georgi Gerganov
4a6bf11db3
Minor
2022-10-08 18:13:26 +03:00
Georgi Gerganov
9bbca3110f
ref #9 : add API documentation in whisper.h
2022-10-08 18:09:56 +03:00
Georgi Gerganov
5e563ef635
Fix Makefile for MacBook Intel
2022-10-08 17:35:55 +03:00
Georgi Gerganov
2ca8cc77b2
ref #17 : print whisper logs to stderr
...
Only the transcribed/translated text is printed to stdout.
This way, one can redirect the result to a file.
2022-10-08 17:28:06 +03:00
Georgi Gerganov
8c7c018893
ref #17 : add options to output result to file
...
Support for:
- plain text
- VTT
- SRT
2022-10-08 17:22:22 +03:00
Georgi Gerganov
4c4ab71d4d
Update README.md
2022-10-08 11:46:34 +03:00
Georgi Gerganov
b43b36e006
Update tests
2022-10-08 11:43:42 +03:00
Georgi Gerganov
37110d693e
ci : add base model tests to GH Actions
2022-10-08 11:43:42 +03:00
Georgi Gerganov
2d47693435
Update README.md
2022-10-08 11:43:42 +03:00
Georgi Gerganov
a53e06757f
Create README.md
2022-10-08 11:43:42 +03:00
Georgi Gerganov
0e3ba2f9fc
Adding dummy models for testing purposes
2022-10-08 11:43:42 +03:00
Georgi Gerganov
2f069335ab
Adding sanitizer tests
2022-10-08 11:43:42 +03:00
Georgi Gerganov
29b041f79b
Cleanup CMakeLists.txt
2022-10-08 09:02:41 +03:00
Georgi Gerganov
4a732b2879
cmake : fixes
2022-10-08 09:02:41 +03:00
Georgi Gerganov
68f5962be6
ci : add cmake builds
2022-10-08 09:02:41 +03:00
Georgi Gerganov
332c9d77fe
whisper : fix bug in token sampling logic
...
Could overflow buffer
2022-10-08 09:02:41 +03:00
Georgi Gerganov
877c058179
Add CMake support
2022-10-08 09:02:41 +03:00
Georgi Gerganov
481cd685d5
ref #10 : option to keep context in "stream" example
...
Seems the results become worse when we keep the context, so by default
this is not enabled
2022-10-07 22:30:44 +03:00
Georgi Gerganov
3f15bb8a08
ref #10 : add "step" argument for "stream" example
...
Controls how often we run the inference.
By default, we run it every 3 seconds.
2022-10-07 22:07:24 +03:00
Georgi Gerganov
7787b878e1
ref #16 , #22 : add "offset" argument
...
Allows processing of the input audio to start at some offset from the
beginning. Useful for splitting a long job into multiple tasks.
2022-10-07 22:00:40 +03:00