mirror of
https://github.com/ggerganov/whisper.cpp.git
synced 2025-02-22 01:36:46 +00:00
Using a Phase Vocoder for speeding up the audio tempo by scaling down the frequencies in the frequency domain. This reduces the computation in the Encoder by a factor of 2. The transcription accuracy is degraded, but for slow to normal speech - it seems to be still very good. I think this can find application for real-time transcription - i.e. the "stream" example.