Commit Graph

829 Commits

Author SHA1 Message Date
Georgi Gerganov
8171e621fc
sync : ggml (Metal fixes, new ops, tests) (#1633)
* sync : ggml (Metal fixes, new ops, tests)

* cuda : fix bin bcast when src1 and dst have different types
2023-12-13 21:55:03 +02:00
Kreijstal
ec03661b20
cmake : target windows 8 or above for prefetchVirtualMemory in llama-talk (#1617)
Since we use prefetchVirtualMemory we specify we target win 8 or above, otherwise other compilers will refuse to use the prefetchVirtualMemory api, (I understand you are loading it dynamically but the header definition has this limitation)
2023-12-12 11:35:00 +00:00
Kreijstal
6335933a5b
cmake : Fix bug in httplib.h for mingw (#1615)
Fix bug in httlib.h for mingw, please see https://github.com/yhirose/cpp-httplib/issues/1669
2023-12-10 17:47:52 +00:00
Finn Voorhees
885b5563d0
metal : fix ggml_metal_log vargs (#1606) 2023-12-08 13:50:50 +02:00
Georgi Gerganov
9521ba6801
whisper.objc : disable timestamps for real-time transcription 2023-12-08 13:43:37 +02:00
Georgi Gerganov
29511d33c7
whisper : more debug messages + fix fallback logic 2023-12-08 13:43:12 +02:00
Georgi Gerganov
7bc4d22337
metal : fix soft_max kernel src1 argument (#1602) 2023-12-08 13:39:32 +02:00
Georgi Gerganov
afce6fa113
sync : ggml (new ops, new backend, etc) (#1602)
* sync : ggml (new ops, new backend, etc)

* whisper : remove obsolete broadcasting code

* ggml : remove backend self-registers + fix ggml_concat + n_task logic

* metal : fix assert

* metal : print resource path

* whisper : fix bug if metal init fails
2023-12-07 22:27:19 +02:00
Oleg Sidorov
3163090d89
server : pass max-len argument to the server (#1574)
This commit fixes the missing parameter binding for max-len between the input
arguments and wparams.
2023-12-05 23:01:45 +02:00
Finn Voorhees
f0efd0202d
ios : Remove #if arch(arm) check for using Metal (#1561) 2023-12-05 01:14:26 +00:00
Digipom
3c28d1a571
ggml : Fix 32-bit compiler warning (#1575)
Warning about %lu on 32-bit targets. Updated to %zu.
2023-12-03 14:15:28 +00:00
Georgi Gerganov
e369243ebd
ggml : re-enable blas for src0 != F32 (#1583) 2023-12-01 23:57:52 +02:00
Aleksander Andrzejewski
a0ec3fac54
Server : Add support for .vtt format to Whisper server (#1578)
- The code comes from examples/main
- The output mimetype is set to text/vtt

Example usage:
```shell
curl 127.0.0.1:8080/inference \
-H "Content-Type: multipart/form-data" \
-F file="@samples/jfk.wav" \
-F temperature="0.2" \
-F response-format="vtt"
```
2023-11-30 23:44:26 +00:00
Oleg Sidorov
6559b538e5
server : backport .srt output format (#1565)
This commit adds a support of .srt format to Whisper server. The code is
effectively backported from examples/main. The output mimetype is set to
application/x-subrip as per https://en.wikipedia.org/wiki/SubRip.

Example usage:

  curl 127.0.0.1:8080/inference \
    -H "Content-Type: multipart/form-data" \
    -F file="@<file-path>" \
    -F temperature="0.2" \
    -F response-format="srt"
2023-11-28 15:42:58 +02:00
Gregor Jasny
73d5005880
cmake : install required ggml.h header (#1568) 2023-11-28 15:41:49 +02:00
Kasumi
6b094b6dfe
server : set default CORS headers to allow all (#1567) 2023-11-28 11:55:20 +02:00
Hang
641f2f4282
readme : update help (#1560) 2023-11-27 12:04:08 +02:00
bobqianic
bfacd9f8ce
CI : Add CUDA 11.8.0 support (#1554)
* try to fix cublas build in CI

* add multiple cuda-toolkit version

* Update build.yml

* Disable CUDA-toolkit 10.2.89
2023-11-27 12:03:16 +02:00
bobqianic
f52e74d4dc
CI : Rectify the Clang-Related workflow issues (#1551)
* fix bugs in workflow

* fix missing clang in workflow

* Update build.yml
2023-11-27 11:35:37 +02:00
Ismatulla Mansurov
23c21e92eb
server : automatically convert audio on the server (#1539)
* server : automatically convert audio on the server

* server : remove rebundant comments

* server : automatic conversion refactor

* server : update server readme

* server : remove unnecessary comments and tabs

* server : put back remove calling

* server : apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* server : check ffmpeg before the server lunch

* server : fix indentation

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* server : fix function typo calling

* server : fix function typo calling

* server : add warning in readme

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-27 11:28:34 +02:00
Georgi Gerganov
447d49530c
whisper : remove trailing whitespaces 2023-11-24 13:13:21 +02:00
Georgi Gerganov
9d6ebd877c
release : v1.5.1 2023-11-24 12:41:55 +02:00
Georgi Gerganov
0ba365f958
metal : add backend function to check device family support (#1547) 2023-11-24 12:37:08 +02:00
Georgi Gerganov
010c8ec3ab
cuda : sync some minor stuff from llama.cpp (#1548) 2023-11-24 12:36:21 +02:00
Georgi Gerganov
ffdb5c4735
whisper : fix typo 2023-11-24 09:45:10 +02:00
ecneladis
a5881d619c
server : add --print-realtime param (#1541)
* server : add --print-realtime param

* Fix duplicate realtime output
2023-11-24 09:35:02 +02:00
bradmit
34f70b3a56
whisper : add whisper_lang_str_full (#1546)
* Update whisper.h

add whisper_lang_fullstr to retrieve the full language name

* Update whisper.cpp

add whisper_lang_fullstr to return the full language name

* fullstr -> str_full

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-24 09:33:13 +02:00
Okabintaro
8328d1900f
fix(server): typo in temperature parameter (#1545)
Also fixed another typo in comments.
2023-11-23 20:59:36 +02:00
sandrohanea
d2bd5f0bdc
metal : fix build (#1544) 2023-11-23 20:20:53 +02:00
Georgi Gerganov
34209a37a2
readme : add server example 2023-11-23 17:20:33 +02:00
Gleicon Moraes
180e062eda
go : fixed Makefile for MacOS ARM 64 (#1530)
* Fixed Makefile for MacOS ARM 64 based on https://github.com/ggerganov/whisper.cpp/issues/1344 + proper ggml-metal env var setting

* conditional to fix broken non-macos compilation

* spaces -> tab

* make : fix whitespaces

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-22 18:08:11 +02:00
Felix
5c7be85fdc
Change temp file name for server application (#1535)
Avoid issue of removing file if it exists in the current working
directory
2023-11-22 09:23:36 +01:00
Georgi Gerganov
146169ec38
bench : pass memcpy threads from cli 2023-11-21 22:27:22 +02:00
Georgi Gerganov
9befab5ab9
bench : multi-thread memcpy (#1534) 2023-11-21 22:07:30 +02:00
Felix
9ac88f2b57
Close file after writing in server application (#1533)
Fix of mistake leaving file open while reading it again as wav
2023-11-21 20:36:10 +01:00
Georgi Gerganov
46f5b6cb08
server : add video to readme 2023-11-21 17:30:43 +02:00
Felix
eff3570f78
server : add a REST Whisper server example with OAI-like API (#1380)
* Add first draft of server

* Added json support and base funcs for server.cpp

* Add more user input via api-request

also some clean up

* Add reqest params and load post function

Also some general clean up

* Remove unused function

* Add readme

* Add exception handlers

* Update examples/server/server.cpp

* make : add server target

* Add magic curl syntax

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-20 21:40:24 +02:00
M. A. Ali
fa19bc4195
whisper : update example in whisper.h (#1529)
update the example in the header, previous examples deprecated.
2023-11-20 20:52:27 +02:00
Georgi Gerganov
a01b2e0971
sdl : fix audio callback (#1523) 2023-11-20 13:16:38 +02:00
Georgi Gerganov
8159a9ab99
whisper : reuse whisper_decode_with_state (#1521) 2023-11-20 13:16:11 +02:00
Tamotsu Takahashi
7516d9c16d
ci : redistribute CUDA DLLs (#1522)
see https://docs.nvidia.com/cuda/eula/index.html#attachment-a
2023-11-19 12:43:22 +02:00
sandrohanea
46cc26d1b9
whisper : fix with_state methods to use the correct state (#1519)
Co-authored-by: Sandro Hanea <sandrohanea@microsoft.com>
2023-11-19 11:25:30 +02:00
Georgi Gerganov
f784f9fa12
whisper : fix overriding the audio context 2023-11-19 10:32:32 +02:00
Georgi Gerganov
ca23f8ee6d
cuda : assert ggml_add sources to be contiguous 2023-11-19 10:32:08 +02:00
Georgi Gerganov
e2f0eba2d4
ios : sync submodule 2023-11-17 10:42:04 +02:00
Georgi Gerganov
d4353e48f7
sync : ggml (ggml-alloc + linker + gguf fixes) (#1501) 2023-11-17 10:00:07 +02:00
Georgi Gerganov
bebf0da983
quantize : add support for K-quant types 2023-11-16 16:18:24 +02:00
Georgi Gerganov
848e54f3ad
bench : fix memcpy bench size 2023-11-16 10:59:32 +02:00
Sam Pullara
7883d1cae4
talk-llama : improve quote and backtick handling (#1364)
* ISSUE-1329: replace " with ' so it doesn't try to execute code in backticks.

* Typo

* Update to keep possessives in the output

Closes the ' then puts a ' in quotes then reopens the ' to escape the ' characters.
2023-11-16 10:34:05 +02:00
Georgi Gerganov
ccc85b4ff8
talk-llama : enable GPU by default 2023-11-15 21:33:00 +02:00