* add detectlanguage flag
* renaming and help
* no idea why that last one didn't commit
* run language detection if dl is set
* help message fix
* various fixes
* fix quitting
* fix language being english on print
Whenever an `offset_ms` is provided, the value of `seek_end` is
calculated incorrectly. This causes Whisper to keep transcribing
after the end of the file.
The current behavior looks like this:
```
[00:34:40.000 --> 00:34:47.000] This is an example audio file.
[00:34:47.000 --> 00:34:49.000] The text has been redacted
[00:34:49.000 --> 00:34:51.000] This is the end of the audio.
[00:34:51.000 --> 00:34:52.000] ***
[00:34:52.000 --> 00:34:53.000] ***
[00:34:53.000 --> 00:34:54.000] ***
[00:34:55.000 --> 00:34:56.000] ***
...
```
The expected behavior is:
```
[00:34:40.000 --> 00:34:47.000] This is an example audio file.
[00:34:47.000 --> 00:34:49.000] The text has been redacted
[00:34:49.000 --> 00:34:51.000] This is the end of the audio.
- end of program -
```
This commit changes the calculation of the `seek_end` variable to
only add `seek_start` if a custom `duration_ms` is provided.
Otherwise, it defaults to the end of the file.
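
The change described above can be sketched roughly as follows. This is a
minimal Python illustration, not the actual source: the function names are
hypothetical, and it assumes that seek positions are counted in 10 ms
frames, that `duration_ms == 0` means "no custom duration", and that
`n_len` is the length of the audio in frames. The "old" variant is inferred
from the description, not copied from the code.

```python
def seek_end_old(offset_ms: int, duration_ms: int, n_len: int) -> int:
    """Inferred previous behavior: seek_start is always added."""
    seek_start = offset_ms // 10  # assumption: 10 ms per frame
    return seek_start + (n_len if duration_ms == 0 else duration_ms // 10)


def seek_end_new(offset_ms: int, duration_ms: int, n_len: int) -> int:
    """Fixed behavior: add seek_start only when a custom duration is given."""
    seek_start = offset_ms // 10  # assumption: 10 ms per frame
    if duration_ms == 0:
        # No custom duration: stop at the end of the file.
        return n_len
    return seek_start + duration_ms // 10


# 100 s of audio (10000 frames), starting 60 s in, no custom duration.
print(seek_end_old(60_000, 0, 10_000))  # → 16000 (past the end of the file)
print(seek_end_new(60_000, 0, 10_000))  # → 10000 (end of the file)
```

With an offset and no duration, the old calculation overshoots the end of
the file by exactly `seek_start` frames, which matches the trailing `***`
segments shown above.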
Signed-off-by: Thijs Raymakers <thijs@raymakers.nl>
If the Core ML model cannot be loaded, continue without Core ML instead of
returning. This allows a single build to transcribe using Core ML models
where available, and regular models where not.
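
The control flow can be pictured with a small sketch. This is only an
illustration in Python under stated assumptions: `choose_encoder` and
`load_coreml_model` are hypothetical names, and the loader is assumed to
return `None` on failure.

```python
from typing import Callable, Optional


def choose_encoder(load_coreml_model: Callable[[], Optional[object]]) -> str:
    """Return "coreml" if the Core ML model loads, otherwise "regular"."""
    model = load_coreml_model()  # hypothetical loader; None means failure
    if model is None:
        # Previously this case aborted initialization.
        # Now it only disables the Core ML path.
        return "regular"
    return "coreml"


print(choose_encoder(lambda: None))      # → regular
print(choose_encoder(lambda: object()))  # → coreml
```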
I disabled the fallbacks because there were many complaints about slow decoding.
The current implementation does not allow batching the decoders when
using the "best of" or "beam size" parameters, so the decoding time is
proportional to the number of decoders, which is obviously not great.
However, now there are even more complaints about wrong decodings and
repetition.
So this is a compromise: the fallbacks are re-enabled, but the default is
just 2 "best of" / "beam size" decoders. Also, the temperature step is
increased from 0.2 to 0.4, i.e. from a maximum of 5 fallbacks to a maximum
of 2.
Also, the stream example now has fallbacks enabled by default.
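
The fallback counts follow from the temperature step. A minimal sketch of
the arithmetic, assuming decoding starts at temperature 0.0 and the
schedule is capped at 1.0 (the cap and the helper name are assumptions,
chosen because they reproduce the 5 and 2 quoted above):

```python
from typing import List


def fallback_temperatures(step: float, max_temp: float = 1.0) -> List[float]:
    """List the temperatures tried: 0.0 first, then one per fallback."""
    temps = [0.0]
    n = 1
    # Small tolerance so that e.g. 5 * 0.2 still counts as <= 1.0.
    while n * step <= max_temp + 1e-9:
        temps.append(round(n * step, 6))
        n += 1
    return temps


old = fallback_temperatures(0.2)
new = fallback_temperatures(0.4)
print(old, len(old) - 1)  # → [0.0, 0.2, 0.4, 0.6, 0.8, 1.0] 5
print(new, len(new) - 1)  # → [0.0, 0.4, 0.8] 2
```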
close #471 #477 #508 #612 #719 #731
* examples : provide an option to also export as a JSON file (ggerganov/whisper.cpp#614)
* main : remove leftovers
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* whisper.android: Support benchmark for Android example.
* whisper.android: update screenshot in README.
* update: Make text selectable for copy & paste.
* Update whisper.h to restore API name
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* whisper.android: Restore original API names.
---------
Co-authored-by: tinoue <tinoue@xevo.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Added whisper state + default state on the whisper_context
* Fixed some examples and bindings
* Fixed whisper_n_len (which was used in some binding) and added whisper_n_len_from_state
* Fixed comments
* whisper : reuse kv_cache_free() and fix compiler warnings
* whisper : clean-up the API comments
---------
Co-authored-by: Sandro Hanea <sandrohanea@microsoft.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
This bug has been present since v1.1.0.
Effectively, previously transcribed text was not being used for subsequent
transcriptions, which has likely reduced transcription quality
significantly.
Likely related to #419