whisper : restore decoder temperature fallbacks

I disabled this because there were many complaints about slow decoding. The current implementation does not allow batching the decoders when using the "best of" or "beam size" parameters, so the decoding time is proportional to the number of decoders, which is obviously not great.

However, there are now even more complaints about wrong decodings and repetition.

So, as a compromise, this re-enables the fallbacks but defaults to just 2 "best of" / "beam size" decoders. The temperature step is also increased from 0.2 to 0.4, i.e. from a maximum of 5 fallbacks to a maximum of 2.

The stream example now has fallbacks enabled by default as well.

close #471 #477 #508 #612 #719 #731
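
The fallback counts quoted in the message can be checked with a little arithmetic. The sketch below is illustrative only and is not taken from whisper.cpp: `temperature_schedule` and `count_fallbacks` are hypothetical helpers, and it assumes decoding starts at temperature 0.0 and retries in fixed steps up to and including 1.0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of whisper.cpp): list every temperature that
// would be tried, assuming the first attempt uses 0.0 and each retry adds
// `step` until the value would exceed 1.0.
std::vector<float> temperature_schedule(float step) {
    assert(step > 0.0f && "the step must be positive for the loop to terminate");
    std::vector<float> temps;
    // The small epsilon keeps 1.0 in the schedule despite float rounding.
    for (int i = 0; i * step < 1.0f + 1e-6f; ++i) {
        temps.push_back(i * step);
    }
    return temps;
}

// Fallbacks are the retries after the initial attempt at temperature 0.0.
std::size_t count_fallbacks(float step) {
    return temperature_schedule(step).size() - 1;
}
```

Under these assumptions, `count_fallbacks(0.2f)` is 5 (0.2, 0.4, 0.6, 0.8, 1.0) and `count_fallbacks(0.4f)` is 2 (0.4, 0.8). Since decoders are not batched, the worst-case cost scales roughly with the number of attempts times the number of decoders per attempt.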
2025-06-12 20:18:08 +00:00 · 2023-04-15 16:04:07 +03:00
parent ea1f8a50d4
commit f19e23fbd1
3 changed files with 25 additions and 21 deletions
--- a/examples/main/main.cpp
+++ b/examples/main/main.cpp
@@ -57,7 +57,7 @@ struct whisper_params {
    int32_t duration_ms  =  0;
    int32_t max_context  = -1;
    int32_t max_len      =  0;
-    int32_t best_of      =  5;
+    int32_t best_of      =  2;
    int32_t beam_size    = -1;

    float word_thold    =  0.01f;