whisper : add "split_on_word" flag when using "max_len" option (#455)

* Update whisper.cpp

* fix: trim function

* feat: added flag to split on word

* fix: arguments for main
Matija Pevec
2023-02-05 13:44:23 +01:00
committed by GitHub
parent b2083c5d02
commit d012b5c7e4
3 changed files with 39 additions and 5 deletions
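
To illustrate what the new flag changes, here is a minimal, self-contained sketch of length-limited segment wrapping. It is not the code from this commit: `wrap_tokens` is a hypothetical helper, and it assumes that a token beginning a new word starts with a space. With `split_on_word` off, a segment may break before any token once `max_len` would be exceeded; with it on, a break is only allowed before a token that starts a word.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of whisper.cpp: greedily packs token texts
// into segments of at most max_len characters (0 = no limit).
// Assumption: a token that begins a new word starts with a space.
// With split_on_word set, a break is only allowed before such a token, so a
// segment may run past max_len rather than cut a word in half.
std::vector<std::string> wrap_tokens(const std::vector<std::string> & tokens,
                                     int max_len, bool split_on_word) {
    std::vector<std::string> segments;
    std::string cur;
    for (const std::string & tok : tokens) {
        // Would appending this token push the current segment past the limit?
        const bool too_long = max_len > 0 && !cur.empty() &&
                              cur.size() + tok.size() > static_cast<std::size_t>(max_len);
        // Is a break permitted immediately before this token?
        const bool can_break = !split_on_word || (!tok.empty() && tok[0] == ' ');
        if (too_long && can_break) {
            segments.push_back(cur);
            cur.clear();
        }
        cur += tok;
    }
    if (!cur.empty()) {
        segments.push_back(cur);
    }
    return segments;
}
```

For the tokens `" Hel"`, `"lo"`, `" wor"`, `"ld"` and `max_len = 5`, the sketch yields four segments with `split_on_word = false` and the two segments `" Hello"` and `" world"` with `split_on_word = true`. Leading spaces are kept as-is here; trimming is left out of the sketch.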


@@ -257,6 +257,7 @@ extern "C" {
         float thold_pt;         // timestamp token probability threshold (~0.01)
         float thold_ptsum;      // timestamp token sum probability threshold (~0.01)
         int   max_len;          // max segment length in characters
+        bool  split_on_word;    // split on word rather than on token (when used with max_len)
         int   max_tokens;       // max tokens per segment (0 = no limit)
         // [EXPERIMENTAL] speed-up techniques