whisper : restore decoder temperature fallbacks

I disabled the fallbacks because there were many complaints about slow
decoding. The current implementation cannot batch the decoders when the
"best of" or "beam size" parameters are used, so the decoding time is
proportional to the number of decoders, which is obviously not great.

However, now there are even more complaints about wrong decodings and
repetition.

So, as a compromise, the fallbacks are re-enabled, but the default is now
just 2 "best of" / "beam size" decoders. Also, the temperature step is
increased from 0.2 to 0.4, i.e. from a maximum of 5 fallbacks to a maximum
of 2.
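
To make the fallback arithmetic concrete, here is a minimal sketch. It assumes the first decode runs at temperature 0.0 and that each fallback raises the temperature by the step until it would exceed 1.0. The helper name `fallback_temperatures` is hypothetical and not part of whisper.cpp.

```cpp
#include <vector>

// Hypothetical helper (not part of whisper.cpp): list the temperatures
// tried as fallbacks after the initial decode at temperature 0.0.
// Assumption: the schedule stops once the temperature would exceed 1.0.
static std::vector<float> fallback_temperatures(float step) {
    std::vector<float> temps;
    if (step <= 0.0f) {
        return temps; // a non-positive step means no fallbacks
    }
    // Multiply by an integer counter instead of accumulating the step,
    // so floating-point error does not build up across iterations.
    for (int i = 1; i * step <= 1.0f + 1e-6f; ++i) {
        temps.push_back(i * step);
    }
    return temps;
}
```

Under these assumptions, a step of 0.2 gives 5 fallback temperatures (0.2, 0.4, 0.6, 0.8, 1.0) and a step of 0.4 gives 2 (0.4, 0.8), which matches the maxima quoted above.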

Also, the stream example now has fallbacks enabled by default.

close #471 #477 #508 #612 #719 #731
Georgi Gerganov
2023-04-15 16:04:07 +03:00
parent ea1f8a50d4
commit f19e23fbd1
3 changed files with 25 additions and 21 deletions

@@ -57,7 +57,7 @@ struct whisper_params {
     int32_t duration_ms = 0;
     int32_t max_context = -1;
     int32_t max_len = 0;
-    int32_t best_of = 5;
+    int32_t best_of = 2;
     int32_t beam_size = -1;
     float word_thold = 0.01f;