Georgi Gerganov
e940fbf283
server : fix build ( #2718 )
2025-01-13 08:57:33 +02:00
NETZkultur GmbH
45d3faf961
server : generate unique tmp filenames ( #2718 )
...
# Summary
This Merge Request adds a mechanism to generate unique filenames for FFmpeg conversions in whisper_server.cpp. Previously, a single fixed filename was used (e.g., whisper-server-tmp.wav), which could result in unexpected file overwrites under certain circumstances. By generating a unique filename per request, any risk of overwriting temporary files is eliminated.
# Background / Motivation
• Problem: Relying on a static filename for temporary audio files may lead to overwrites if multiple operations occur simultaneously or if the same file name is reused.
• Goal: Dynamically generate unique filenames, ensuring each request or operation uses an isolated temporary file.
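A minimal sketch of the per-request naming idea described above. The helper name `make_unique_tmp_name` and the timestamp-plus-counter scheme are assumptions for illustration, not the code from the Merge Request. The counter only guarantees uniqueness within one process.

```cpp
#include <atomic>
#include <chrono>
#include <string>

// Hypothetical helper (not from the actual change): builds a temp file name
// from a prefix, a microsecond timestamp, and a process-wide atomic counter.
// The counter makes two names from the same process differ even if they are
// generated within the same microsecond.
std::string make_unique_tmp_name(const std::string & prefix, const std::string & ext) {
    static std::atomic<unsigned long long> counter{0};
    const auto now   = std::chrono::steady_clock::now().time_since_epoch();
    const auto ticks = std::chrono::duration_cast<std::chrono::microseconds>(now).count();
    return prefix + "-" + std::to_string(ticks) + "-" + std::to_string(counter.fetch_add(1)) + ext;
}
```

Each request would call the helper once and pass the result to the conversion step, so concurrent requests never share an output path.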
2025-01-13 08:55:21 +02:00
Georgi Gerganov
ed09075ca0
server : fix help print
2024-12-22 15:32:05 +02:00
Sacha Arbonel
4183517076
server : add no-speech threshold parameter and functionality ( #2654 )
2024-12-21 17:00:08 +02:00
Georgi Gerganov
f4668169a0
whisper : rename suppress_non_speech_tokens to suppress_nst ( #2653 )
2024-12-21 12:54:35 +02:00
Sacha Arbonel
944ce49439
server : add option to suppress non-speech tokens ( #2649 )
...
* The parameter will suppress non-speech tokens like [LAUGH], [SIGH], etc. from the output when enabled.
* add to whisper_params_parse
* add missing param
2024-12-21 12:05:05 +02:00
Georgi Gerganov
2e59dced12
whisper : rename binaries + fix install ( #2648 )
...
* whisper : rename binaries + fix install
* cont : try to fix ci
* cont : fix emscripten builds
2024-12-21 09:43:49 +02:00
gilbertgong
ede1718f6d
server : ffmpeg overwrite leftover temp file ( #2431 )
...
* Remove possible leftover ffmpeg temp file from a previous failed conversion
* Revert "Remove possible leftover ffmpeg temp file from a previous failed conversion"
This reverts commit 00797403bd43ebcb1e0678989a4fc676d417b4af.
* Flag to force ffmpeg to overwrite output file if it exists
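A rough sketch of how an overwrite flag can be added to an ffmpeg command line. The helper name `build_ffmpeg_cmd` and the 16 kHz mono 16-bit PCM output settings are assumptions for illustration. The only part taken from the commit message is the idea of forcing an overwrite, which ffmpeg exposes as `-y`.

```cpp
#include <string>

// Hypothetical helper (not the server's actual code): assembles an ffmpeg
// command that converts `in` to a 16 kHz mono 16-bit PCM WAV file at `out`.
// -y tells ffmpeg to overwrite `out` without prompting if it already exists,
// so a leftover file from a failed conversion does not block the next one.
// Note: the simple double-quoting here is not safe for untrusted paths.
std::string build_ffmpeg_cmd(const std::string & in, const std::string & out) {
    return "ffmpeg -y -i \"" + in + "\" -ar 16000 -ac 1 -c:a pcm_s16le \"" + out + "\"";
}
```

Without `-y`, ffmpeg asks for confirmation when the output exists, which would stall a non-interactive server process.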
2024-10-02 15:06:40 +03:00
Toliver
5b1ce40fa8
server : use OS-generated temp file name for converted files ( #2419 )
2024-09-17 15:56:32 +03:00
Emmanuel Schmidbauer
bec9836849
server : add inference path to make OAI API compatible ( #2270 )
2024-07-08 14:24:58 +03:00
Borislav Stanimirov
af5833e298
whisper : remove speed_up and phase_vocoder* functions ( #2198 )
...
* whisper : fix cast warning
* whisper : remove phase_vocoder functions, ref #2195
* whisper : remove speed_up from whisper_full_params, closes #2195
2024-05-31 11:37:29 +03:00
Daniel Valdivia
a7dc2aab16
server : fix typo ( #2181 )
...
A simple comment typo, PR can be dismissed
2024-05-25 10:46:22 +03:00
Georgi Gerganov
7094ea5e75
whisper : use flash attention ( #2152 )
...
* whisper : use flash attention in the encoder
* whisper : add kv_pad
* whisper : remove extra backend instance (huh?)
* whisper : use FA for cross-attention
* whisper : use FA for self-attention
* whisper : simplify encoder FA
* whisper : add flash_attn runtime parameter
* scripts : add bench log
* scripts : add M1 Pro bench log
2024-05-15 09:38:19 +03:00
Georgi Gerganov
4ef8d9f44e
server : return utf-8 ( #2138 )
2024-05-13 15:33:46 +03:00
Emmanuel Schmidbauer
9fab28135c
server : add dtw ( #2044 )
...
* server.cpp: add dtw
* Update examples/server/server.cpp
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-04-15 22:16:58 +03:00
Jo Liss
e7794a868f
examples : rename --audio-context to --audio-ctx per help text ( #1953 )
2024-03-18 17:53:33 +02:00
Felix
07d04280be
examples : clean up common code ( #1871 )
...
move some utility functions into common.h
2024-02-19 10:50:15 +02:00
dscripka
a6fb6ab597
examples : added audio_ctx argument to main and server ( #1857 )
...
* added audio_ctx argument to main and server examples
* Better default value
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* better default value (again)
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-02-12 09:19:07 +02:00
Georgi Gerganov
f273e66dc6
examples : initialize context params properly ( #1852 )
2024-02-11 16:39:12 +02:00
Valentin Gosu
80e8a2ea39
server : allow CORS request with authorization headers ( #1850 )
...
Whisper plugin in Obsidian requires an API key which is
then sent as an authorization header.
However, the presence of an authorization header requires
a CORS Preflight, so both the OPTIONS method and
the Access-Control-Allow-Headers: authorization must be
handled.
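The preflight handling described above could look roughly like the following framework-agnostic sketch. The struct `http_reply_sketch`, the function `reply_to_preflight`, the 204 status, and the allowed-methods value are assumptions for illustration. Only the `Access-Control-Allow-Headers: authorization` header comes from the commit message.

```cpp
#include <map>
#include <string>

// Hypothetical reply type, standing in for whatever the HTTP library provides.
struct http_reply_sketch {
    int status;
    std::map<std::string, std::string> headers;
};

// Builds the reply to an OPTIONS (CORS preflight) request. The browser sends
// this request before a cross-origin POST that carries an Authorization
// header, and proceeds only if that header is listed as allowed.
http_reply_sketch reply_to_preflight() {
    http_reply_sketch r;
    r.status = 204; // no body needed for a preflight reply (assumed status)
    r.headers["Access-Control-Allow-Origin"]  = "*";
    r.headers["Access-Control-Allow-Methods"] = "POST, OPTIONS"; // assumed
    r.headers["Access-Control-Allow-Headers"] = "authorization";
    return r;
}
```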
2024-02-09 17:42:41 +02:00
JacobLinCool
baa30bacdb
server : add fields to verbose_json response ( #1802 )
...
* server: include additional fields in the verbose_json response as OpenAI does
* server: show request examples on home page
* server: todo note for compression_ratio and no_speech_prob
* server: add simple demo form to the homepage
2024-01-30 14:15:55 +02:00
Ryan Hitchman
c0329acde8
server : implement "verbose_json" format with token details ( #1781 )
...
* examples/server: implement "verbose_json" format with token details.
This is intended to mirror the format of openai's Python
whisper.transcribe() return values.
* server: don't write WAV to a temporary file if not converting
* server: use std::lock_guard instead of manual lock/unlock
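To illustrate the last point, here is a small sketch of why `std::lock_guard` is preferable to manual lock/unlock. The names `g_mutex`, `g_requests_served`, and `handle_request` are hypothetical and not taken from the server code.

```cpp
#include <mutex>
#include <stdexcept>
#include <string>

std::mutex g_mutex;            // hypothetical mutex guarding shared state
int        g_requests_served = 0;

// The guard locks the mutex on construction and unlocks it when the scope
// ends, including when an exception is thrown. With manual lock()/unlock(),
// the early exit below would leave the mutex locked forever.
std::string handle_request(const std::string & input) {
    std::lock_guard<std::mutex> lock(g_mutex);
    if (input.empty()) {
        throw std::invalid_argument("empty request");
    }
    ++g_requests_served;
    return "ok: " + input;
}
```

A request that fails partway through therefore cannot block every later request.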
2024-01-18 22:58:42 +02:00
George Hindle
fbcb52d3cd
server : add more parameters to server api ( #1754 )
...
* feat(server): add more parameters to server api
* fix(server): reset params to original parsed values for each request
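The per-request reset can be sketched as copying the parsed defaults before applying request fields. The struct `server_params_sketch`, the function `effective_params`, and the field names are assumptions for illustration, not the server's actual types.

```cpp
#include <map>
#include <string>

// Hypothetical subset of the server parameters.
struct server_params_sketch {
    float       temperature     = 0.0f;
    std::string response_format = "json";
};

// Starts every request from a fresh copy of the defaults parsed at startup,
// then applies only the fields present in this request. Values set by one
// request therefore never leak into the next.
server_params_sketch effective_params(const server_params_sketch & defaults,
                                      const std::map<std::string, std::string> & form) {
    server_params_sketch params = defaults; // per-request copy
    auto it = form.find("response_format");
    if (it != form.end()) {
        params.response_format = it->second;
    }
    it = form.find("temperature");
    if (it != form.end()) {
        params.temperature = std::stof(it->second); // throws on malformed input
    }
    return params;
}
```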
2024-01-12 13:42:52 +02:00
George Hindle
f7908f9bb8
params : don't compute timestamps when not printing them ( #1755 )
2024-01-12 13:24:38 +02:00
Emmanuel Schmidbauer
c46886f599
server : add request path option ( #1741 )
2024-01-08 22:39:51 +00:00
Georgi Gerganov
022756a872
server : fix server temperature + add temperature_inc ( #1729 )
...
* server : fix server temperature + add temperature_inc
* server : change dashes to underscores in parameter names
2024-01-07 13:35:14 +02:00
Oleg Sidorov
3163090d89
server : pass max-len argument to the server ( #1574 )
...
This commit fixes the missing parameter binding for max-len between the input
arguments and wparams.
2023-12-05 23:01:45 +02:00
Aleksander Andrzejewski
a0ec3fac54
Server : Add support for .vtt format to Whisper server ( #1578 )
...
- The code comes from examples/main
- The output mimetype is set to text/vtt
Example usage:
```shell
curl 127.0.0.1:8080/inference \
-H "Content-Type: multipart/form-data" \
-F file="@samples/jfk.wav" \
-F temperature="0.2" \
-F response-format="vtt"
```
2023-11-30 23:44:26 +00:00
Oleg Sidorov
6559b538e5
server : backport .srt output format ( #1565 )
...
This commit adds support for the .srt format to the Whisper server. The code is
effectively backported from examples/main. The output mimetype is set to
application/x-subrip as per https://en.wikipedia.org/wiki/SubRip .
Example usage:
curl 127.0.0.1:8080/inference \
-H "Content-Type: multipart/form-data" \
-F file="@<file-path>" \
-F temperature="0.2" \
-F response-format="srt"
2023-11-28 15:42:58 +02:00
Kasumi
6b094b6dfe
server : set default CORS headers to allow all ( #1567 )
2023-11-28 11:55:20 +02:00
Ismatulla Mansurov
23c21e92eb
server : automatically convert audio on the server ( #1539 )
...
* server : automatically convert audio on the server
* server : remove redundant comments
* server : automatic conversion refactor
* server : update server readme
* server : remove unnecessary comments and tabs
* server : put back remove calling
* server : apply suggestions from code review
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* server : check ffmpeg before the server launch
* server : fix indentation
* Apply suggestions from code review
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* server : fix function typo calling
* server : fix function typo calling
* server : add warning in readme
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-27 11:28:34 +02:00
ecneladis
a5881d619c
server : add --print-realtime param ( #1541 )
...
* server : add --print-realtime param
* Fix duplicate realtime output
2023-11-24 09:35:02 +02:00
Okabintaro
8328d1900f
fix(server): typo in temperature parameter ( #1545 )
...
Also fixed another typo in comments.
2023-11-23 20:59:36 +02:00
Felix
5c7be85fdc
Change temp file name for server application ( #1535 )
...
Avoid issue of removing file if it exists in the current working
directory
2023-11-22 09:23:36 +01:00
Felix
9ac88f2b57
Close file after writing in server application ( #1533 )
...
Fix a mistake that left the file open while reading it again as WAV
2023-11-21 20:36:10 +01:00
Felix
eff3570f78
server : add a REST Whisper server example with OAI-like API ( #1380 )
...
* Add first draft of server
* Added json support and base funcs for server.cpp
* Add more user input via api-request
also some clean up
* Add request params and load post function
Also some general clean up
* Remove unused function
* Add readme
* Add exception handlers
* Update examples/server/server.cpp
* make : add server target
* Add magic curl syntax
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-20 21:40:24 +02:00