33 Commits

Author SHA1 Message Date
Sacha Arbonel
4183517076
server : add no-speech threshold parameter and functionality () 2024-12-21 17:00:08 +02:00
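A minimal sketch of how a no-speech threshold is typically applied, assuming each decoded segment carries a `no_speech_prob` value. The `Segment` type, the function name, and the 0.6 default are illustrative assumptions, not taken from the server code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical decoded segment; field names are assumptions."""
    text: str
    no_speech_prob: float


def drop_silent_segments(segments: List[Segment], no_speech_thold: float = 0.6) -> List[Segment]:
    """Keep only segments whose no-speech probability does not exceed the threshold."""
    return [s for s in segments if s.no_speech_prob <= no_speech_thold]
```

Raising the threshold keeps more segments; lowering it discards segments more aggressively.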
Georgi Gerganov
f4668169a0
whisper : rename suppress_non_speech_tokens to suppress_nst () 2024-12-21 12:54:35 +02:00
Sacha Arbonel
944ce49439
server : add option to suppress non-speech tokens ()
* The parameter will suppress non-speech tokens like [LAUGH], [SIGH], etc. from the output when enabled.

* add to whisper_params_parse

* add missing param
2024-12-21 12:05:05 +02:00
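One common way to suppress specific tokens during decoding is to mask their logits before sampling. The sketch below illustrates that general technique only; the commit does not state how the server does it, and the function name and token ids are hypothetical.

```python
import math
from typing import Iterable, List


def suppress_tokens(logits: List[float], suppressed_ids: Iterable[int]) -> List[float]:
    """Return a copy of logits with the given token ids set to -inf so they can never be picked."""
    masked = list(logits)
    for token_id in suppressed_ids:
        masked[token_id] = -math.inf
    return masked
```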
Georgi Gerganov
2e59dced12
whisper : rename binaries + fix install ()
* whisper : rename binaries + fix install

* cont : try to fix ci

* cont : fix emscripten builds
2024-12-21 09:43:49 +02:00
gilbertgong
ede1718f6d
server : ffmpeg overwrite leftover temp file ()
* Remove possible leftover ffmpeg temp file from a previous failed conversion

* Revert "Remove possible leftover ffmpeg temp file from a previous failed conversion"

This reverts commit 00797403bd.

* Flag to force ffmpeg to overwrite output file if it exists
2024-10-02 15:06:40 +03:00
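A sketch of the idea behind the overwrite flag: ffmpeg's `-y` option overwrites an existing output file without prompting, so a leftover temp file from a failed conversion no longer blocks the next one. The helper name and the resampling options (16 kHz, mono, 16-bit PCM) are assumptions for illustration.

```python
from typing import List


def build_ffmpeg_command(src: str, dst: str, overwrite: bool = True) -> List[str]:
    """Build an ffmpeg argument list that converts src to a 16 kHz mono WAV at dst."""
    cmd = ["ffmpeg", "-i", src]
    if overwrite:
        # -y: overwrite the output file if it already exists, without asking
        cmd.append("-y")
    cmd += ["-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", dst]
    return cmd
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting issues with unusual file names.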
Toliver
5b1ce40fa8
server : use OS-generated temp file name for converted files () 2024-09-17 15:56:32 +03:00
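The same idea in a short Python sketch: let the operating system pick a unique temp file name instead of hard-coding one, so concurrent requests cannot collide. The prefix and helper name are hypothetical.

```python
import os
import tempfile


def make_temp_wav_path() -> str:
    """Create an empty, uniquely named .wav file in the system temp directory and return its path."""
    fd, path = tempfile.mkstemp(prefix="whisper-server-", suffix=".wav")
    os.close(fd)  # the converter reopens the file by path
    return path
```

The caller is responsible for removing the file once the request has been served.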
Emmanuel Schmidbauer
bec9836849
server : add inference path to make OAI API compatible () 2024-07-08 14:24:58 +03:00
Borislav Stanimirov
af5833e298
whisper : remove speed_up and phase_vocoder* functions ()
* whisper : fix cast warning

* whisper : remove phase_vocoder functions, ref 

* whisper : remove speed_up from whisper_full_params, closes 
2024-05-31 11:37:29 +03:00
Daniel Valdivia
a7dc2aab16
server : fix typo ()
A simple comment typo, PR can be dismissed
2024-05-25 10:46:22 +03:00
Georgi Gerganov
7094ea5e75
whisper : use flash attention ()
* whisper : use flash attention in the encoder

* whisper : add kv_pad

* whisper : remove extra backend instance (huh?)

* whisper : use FA for cross-attention

* whisper : use FA for self-attention

* whisper : simplify encoder FA

* whisper : add flash_attn runtime parameter

* scripts : add bench log

* scripts : add M1 Pro bench log
2024-05-15 09:38:19 +03:00
Georgi Gerganov
4ef8d9f44e
server : return utf-8 () 2024-05-13 15:33:46 +03:00
Emmanuel Schmidbauer
9fab28135c
server : add dtw ()
* server.cpp: add dtw

* Update examples/server/server.cpp

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-04-15 22:16:58 +03:00
Jo Liss
e7794a868f
examples : rename --audio-context to --audio-ctx per help text () 2024-03-18 17:53:33 +02:00
Felix
07d04280be
examples : clean up common code ()
move some utility functions into common.h
2024-02-19 10:50:15 +02:00
dscripka
a6fb6ab597
examples : added audio_ctx argument to main and server ()
* added audio_ctx argument to main and server examples

* Better default value

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* better default value (again)

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-02-12 09:19:07 +02:00
Georgi Gerganov
f273e66dc6
examples : initialize context params properly () 2024-02-11 16:39:12 +02:00
Valentin Gosu
80e8a2ea39
server : allow CORS request with authorization headers ()
Whisper plugin in Obsidian requires an API key which is
then sent as an authorization header.
However, the presence of an authorization header requires
a CORS Preflight, so both the OPTIONS method and
the Access-Control-Allow-Headers: authorization must be
handled.
2024-02-09 17:42:41 +02:00
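A minimal sketch of the preflight handling described in this commit message, written as a plain function rather than against any real HTTP library. The function names, status codes, and response body are assumptions; the point is that `OPTIONS` must be answered and `authorization` must be listed explicitly in `Access-Control-Allow-Headers`.

```python
from typing import Dict, Tuple


def cors_headers() -> Dict[str, str]:
    """Headers attached to every response so browsers accept cross-origin calls."""
    return {
        "Access-Control-Allow-Origin": "*",
        # A wildcard here would not cover Authorization, so it is named explicitly.
        "Access-Control-Allow-Headers": "content-type, authorization",
    }


def handle(method: str, path: str) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a request; OPTIONS is the CORS preflight."""
    headers = cors_headers()
    if method == "OPTIONS":
        return 204, headers, b""  # preflight: no body, only the CORS headers
    if method == "POST" and path == "/inference":
        return 200, headers, b"{}"  # placeholder for the real transcription result
    return 404, headers, b""
```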
JacobLinCool
baa30bacdb
server : add fields to verbose_json response ()
* server: include additional fields in the verbose_json response as OpenAI does

* server: show request examples on home page

* server: todo note for compression_ratio and no_speech_prob

* server: add simple demo form to the homepage
2024-01-30 14:15:55 +02:00
Ryan Hitchman
c0329acde8
server : implement "verbose_json" format with token details ()
* examples/server: implement "verbose_json" format with token details.

This is intended to mirror the format of openai's Python
whisper.transcribe() return values.

* server: don't write WAV to a temporary file if not converting

* server: use std::lock_guard instead of manual lock/unlock
2024-01-18 22:58:42 +02:00
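The benefit of `std::lock_guard` over manual lock/unlock is that the lock is released on every exit path, including exceptions. Python's `with` statement gives the same guarantee; the class below is a hypothetical stand-in for the server's request handler.

```python
import threading


class Transcriber:
    """Hypothetical handler that serializes access to a shared model context."""

    def __init__(self) -> None:
        self._lock = threading.Lock()

    def run(self, audio: bytes) -> int:
        # The lock is released when the block exits, whether by return or by exception.
        with self._lock:
            if not audio:
                raise ValueError("empty audio")
            return len(audio)  # placeholder for the real inference call
```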
George Hindle
fbcb52d3cd
server : add more parameters to server api ()
* feat(server): add more parameters to server api

* fix(server): reset params to original parsed values for each request
2024-01-12 13:42:52 +02:00
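A sketch of the second fix in this commit: each request starts from the parameters parsed at startup, so overrides from one request do not leak into the next. The `Params` fields and form keys are illustrative assumptions.

```python
from dataclasses import dataclass, replace
from typing import Mapping


@dataclass(frozen=True)
class Params:
    """Hypothetical subset of the server's parsed command-line defaults."""
    language: str = "en"
    temperature: float = 0.0
    translate: bool = False


def params_for_request(defaults: Params, form: Mapping[str, str]) -> Params:
    """Build per-request parameters from the defaults without mutating them."""
    overrides = {}
    if "language" in form:
        overrides["language"] = form["language"]
    if "temperature" in form:
        overrides["temperature"] = float(form["temperature"])
    if "translate" in form:
        overrides["translate"] = form["translate"].lower() == "true"
    return replace(defaults, **overrides)
```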
George Hindle
f7908f9bb8
params : don't compute timestamps when not printing them () 2024-01-12 13:24:38 +02:00
Emmanuel Schmidbauer
c46886f599
server : add request path option() 2024-01-08 22:39:51 +00:00
Georgi Gerganov
022756a872
server : fix server temperature + add temperature_inc ()
* server : fix server temperature + add temperature_inc

* server : change dashes to underscores in parameter names
2024-01-07 13:35:14 +02:00
Oleg Sidorov
3163090d89
server : pass max-len argument to the server ()
This commit fixes the missing parameter binding for max-len between the input
arguments and wparams.
2023-12-05 23:01:45 +02:00
Aleksander Andrzejewski
a0ec3fac54
Server : Add support for .vtt format to Whisper server ()
- The code comes from examples/main
- The output mimetype is set to text/vtt

Example usage:
```shell
curl 127.0.0.1:8080/inference \
-H "Content-Type: multipart/form-data" \
-F file="@samples/jfk.wav" \
-F temperature="0.2" \
-F response-format="vtt"
```
2023-11-30 23:44:26 +00:00
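For orientation, a rough sketch of what rendering segments as WebVTT involves: a `WEBVTT` header followed by cues with `hh:mm:ss.mmm` timestamps. It assumes segment times are given in units of 10 ms; the function names and the tuple layout are illustrative and not the server's actual code.

```python
from typing import Iterable, Tuple


def to_timestamp(t: int, comma: bool = False) -> str:
    """Format a time given in 10 ms units as hh:mm:ss.mmm (or hh:mm:ss,mmm for SRT)."""
    msec = t * 10
    hr, msec = divmod(msec, 3_600_000)
    mn, msec = divmod(msec, 60_000)
    sec, msec = divmod(msec, 1000)
    sep = "," if comma else "."
    return f"{hr:02d}:{mn:02d}:{sec:02d}{sep}{msec:03d}"


def to_vtt(segments: Iterable[Tuple[int, int, str]]) -> str:
    """Render (start, end, text) segments as a WebVTT document."""
    lines = ["WEBVTT", ""]
    for t0, t1, text in segments:
        lines.append(f"{to_timestamp(t0)} --> {to_timestamp(t1)}")
        lines.append(text.strip())
        lines.append("")
    return "\n".join(lines)
```

SRT output differs mainly in using a comma before the milliseconds and numbering each cue.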
Oleg Sidorov
6559b538e5
server : backport .srt output format ()
This commit adds support for the .srt format to the Whisper server. The code is
effectively backported from examples/main. The output mimetype is set to
application/x-subrip as per https://en.wikipedia.org/wiki/SubRip.

Example usage:

  curl 127.0.0.1:8080/inference \
    -H "Content-Type: multipart/form-data" \
    -F file="@<file-path>" \
    -F temperature="0.2" \
    -F response-format="srt"
2023-11-28 15:42:58 +02:00
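The curl call above can be approximated from Python with only the standard library. This sketch builds the multipart body by hand and constructs the request without sending it; the helper names are hypothetical, and the `response-format` field name follows this commit (a later commit in this log switches parameter names to underscores).

```python
import urllib.request
import uuid
from typing import Mapping, Tuple


def encode_multipart(fields: Mapping[str, str], file_field: str,
                     filename: str, file_bytes: bytes) -> Tuple[bytes, str]:
    """Encode text fields plus one file as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    parts.append(
        (f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
         f'filename="{filename}"\r\nContent-Type: application/octet-stream\r\n\r\n').encode()
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_inference_request(url: str, audio: bytes, response_format: str = "srt") -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    fields = {"temperature": "0.2", "response-format": response_format}
    body, content_type = encode_multipart(fields, "file", "audio.wav", audio)
    return urllib.request.Request(url, data=body, headers={"Content-Type": content_type}, method="POST")
```

Sending it is then `urllib.request.urlopen(build_inference_request("http://127.0.0.1:8080/inference", data))` against a running server.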
Kasumi
6b094b6dfe
server : set default CORS headers to allow all () 2023-11-28 11:55:20 +02:00
Ismatulla Mansurov
23c21e92eb
server : automatically convert audio on the server ()
* server : automatically convert audio on the server

* server : remove redundant comments

* server : automatic conversion refactor

* server : update server readme

* server : remove unnecessary comments and tabs

* server : put back remove calling

* server : apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* server : check ffmpeg before the server launch

* server : fix indentation

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* server : fix function typo calling

* server : fix function typo calling

* server : add warning in readme

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-27 11:28:34 +02:00
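A small sketch of the "check ffmpeg before launch" step, assuming the check is simply whether the executable can be found on `PATH`. The function names and error message are hypothetical.

```python
import shutil


def is_tool_available(name: str = "ffmpeg") -> bool:
    """Return True if an executable with this name can be found on PATH."""
    return shutil.which(name) is not None


def require_tool(name: str = "ffmpeg") -> None:
    """Fail fast at startup instead of on the first conversion request."""
    if not is_tool_available(name):
        raise RuntimeError(f"{name} not found on PATH; audio conversion is unavailable")
```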
ecneladis
a5881d619c
server : add --print-realtime param ()
* server : add --print-realtime param

* Fix duplicate realtime output
2023-11-24 09:35:02 +02:00
Okabintaro
8328d1900f
fix(server): typo in temperature parameter ()
Also fixed another typo in comments.
2023-11-23 20:59:36 +02:00
Felix
5c7be85fdc
Change temp file name for server application ()
Avoid issue of removing file if it exists in the current working
directory
2023-11-22 09:23:36 +01:00
Felix
9ac88f2b57
Close file after writing in server application ()
Fix a mistake that left the file open while it was read again as WAV
2023-11-21 20:36:10 +01:00
Felix
eff3570f78
server : add a REST Whisper server example with OAI-like API ()
* Add first draft of server

* Added json support and base funcs for server.cpp

* Add more user input via api-request

also some clean up

* Add request params and load post function

Also some general clean up

* Remove unused function

* Add readme

* Add exception handlers

* Update examples/server/server.cpp

* make : add server target

* Add magic curl syntax

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-20 21:40:24 +02:00