commit b6c5f49b78
Author: Georgi Gerganov
Date: 2023-11-15 16:12:52 +02:00

whisper : add batched decoding (#1486)
* whisper : add whisper_batch
* whisper : move kv_self to whisper_state
* whisper : full batched decoding support
* whisper : fix memory leak in whisper_batch
* whisper : fix mem leak again + remove obsolete function
* whisper : clear kv cache when using whisper_decode API
* whisper : speed-up sampling
* whisper : fix decoders initializer
* bench : add batch size 5 bench
* whisper : add comment about the KV cache size
* whisper : add check for max number of decoders
* whisper : avoid starting sampling threads with bs=1
* whisper : enable beam-search by default
* cuda : sync llama.cpp fixes
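
Below is a minimal sketch of how a caller might exercise the beam-search decoding that this change enables by default, using the public whisper.cpp API (whisper_init_from_file, whisper_full_default_params, whisper_full). The model path and audio buffer are placeholders, and beam_size = 5 is only an assumption mirroring the new batch-size-5 bench; the PR itself does not prescribe these values.

```c
// Sketch only: enable beam-search sampling (now the default strategy) explicitly.
// Assumes the public whisper.cpp C API; model path and PCM buffer are placeholders.
#include <stdio.h>
#include "whisper.h"

int main(void) {
    // hypothetical model path
    struct whisper_context * ctx = whisper_init_from_file("ggml-base.en.bin");
    if (!ctx) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // request beam-search sampling; beam_size = 5 mirrors the batch size 5 bench above
    struct whisper_full_params params = whisper_full_default_params(WHISPER_SAMPLING_BEAM_SEARCH);
    params.beam_search.beam_size = 5;

    // pcm must hold 16 kHz mono float samples; left empty in this sketch
    const float * pcm       = NULL;
    const int     n_samples = 0;

    if (whisper_full(ctx, params, pcm, n_samples) != 0) {
        fprintf(stderr, "decoding failed\n");
        whisper_free(ctx);
        return 1;
    }

    // print the decoded segments
    const int n_segments = whisper_full_n_segments(ctx);
    for (int i = 0; i < n_segments; ++i) {
        printf("%s\n", whisper_full_get_segment_text(ctx, i));
    }

    whisper_free(ctx);
    return 0;
}
```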