Commit Graph

336 Commits

Author SHA1 Message Date
Georgi Gerganov
1711bb3881
sync : llama.cpp (ggml/0) 2024-02-28 13:00:30 +02:00
Andrew S
0d8fd8483a
stream.wasm : fix invalid memory access when no segments (#1902)
No segments may be returned when a smaller sample buffer (EG 2048 samples) is sent to the worker.
2024-02-26 10:12:35 +02:00
Georgi Gerganov
3170841ed9
talk-llama : sync llama.cpp 2024-02-25 20:00:10 +02:00
Georgi Gerganov
578e47e70c
sync : llama.cpp (ggml/0) 2024-02-25 19:58:46 +02:00
Tamotsu Takahashi
f18738f247
talk, talk-llama : pass text_to_speak as a file (#1865)
* talk-llama: pass file instead of arg

it is too hard to quote text in a portable way

* talk-llama: pass heard_ok as a file

* talk-llama: let eleven-labs.py accept options

Options: -v voice, -s savefile, -p (--play)

* talk-llama: check installed commands in "speak"

Pass "-q" to eleven-labs.py to skip checking whether elevenlabs is installed

* talk-llama: pass voice_id again

in order to sync talk with talk-llama

* talk: sync with talk-llama

Passing text_to_speak as a file is safer and more portable
cf. https://stackoverflow.com/a/59036879/45375

* talk and talk-llama: get all installed voices in speak.ps1

* talk and talk-llama: get voices from api

* talk and talk-llama: add more options to eleven-labs.py

and remove DEFAULT_VOICE because it is deprecated (https://www.reddit.com/r/ElevenLabs/comments/1830abt/what_happened_to_bella/)

```
usage: eleven-labs.py [-q] [-l] [-h] [-n NAME | -v NUMBER] [-f KEY=VAL] [-s FILE | -p] [TEXTFILE]

options:
  -q, --quick           skip checking the required library

action:
  TEXTFILE              read the text file (default: stdin)
  -l, --list            show the list of voices and exit
  -h, --help            show this help and exit

voice selection:
  -n NAME, --name NAME  get a voice object by name (default: Arnold)
  -v NUMBER, --voice NUMBER
                        get a voice object by number (see --list)
  -f KEY=VAL, --filter KEY=VAL
                        filter voices by labels (default: "use case=narration")
                        this option can be used multiple times
                        filtering will be disabled if the first -f has no "=" (e.g. -f "any")

output:
  -s FILE, --save FILE  save the TTS to a file (default: audio.mp3)
  -p, --play            play the TTS with ffplay
```

* examples: add speak_with_file()

as suggested in the review

* talk and talk-llama: ignore to_speak.txt
2024-02-24 09:24:47 +02:00
Abhilash Majumder
a0ddd8392c
whisper : add SYCL support (#1863)
* add changes from llama upstream

* add sycl abstraction

* add sycl build

* update cmake

* add sycl build config

* fix bug

* fix bug

* refactor build

* fix bug

* update build

* call build

* use sycl header

* add examples

* add target

* fix typecast in quant.c

* readd fp16 and readme

* fix quant typecast

* add sample

* add readme

* remove cxx file check
2024-02-23 09:22:24 +02:00
Georgi Gerganov
a2506909b1
talk-llama : sync llama.cpp 2024-02-22 23:30:53 +02:00
Georgi Gerganov
5fdb27ff80
ggml : 32-bit arm compat (#1891)
* ggml : 32-bit arm compat

* ggml : add ggml_vqtbl1q_s8 impl

* ggml : cont
2024-02-22 18:31:40 +02:00
Georgi Gerganov
ce411498f6
sync : llama.cpp (ggml/0)
ggml-ci
2024-02-22 15:12:36 +02:00
Davidson Francis
c56344b509
main : fix file existence check in main.cpp (#1889)
In commit dda4b0e of PR #1872, I've introduced a check for the
existence of files before loading the model. However, I haven't
considered the case where whisper.cpp might read from stdin as well,
and in such cases, the checks should ignore the "-" argument as it
does not represent a regular file.

Additionally, this commit removes the usage of 'stat()' in favor of
the recently introduced function 'is_file_exist()' in common.cpp from
PR #1871.

Apologies for the bug introduced in the previous PR and any
inconvenience it may have caused.
2024-02-22 15:01:08 +02:00
Georgi Gerganov
59119f4f20
talk-llama : sync llama.cpp 2024-02-20 12:09:57 +02:00
Georgi Gerganov
83afebe872
common : add IQ1_S (ggml/0)
ggml-ci
2024-02-19 15:53:25 +02:00
Davidson Francis
dda4b0ed06
main : check if input files exist before proceeding (#1872)
Until the most recent commit (3d42463), the main.cpp sample file does
not check whether the input files exist or not. Consequently, the
model is loaded first before reporting whether there was a failure or
not when processing a file. In environments with HDD, this can take
about 50 seconds or more, depending on the loaded model.

This commit addresses this issue by checking in advance whether the
input files exist or not.
2024-02-19 10:51:26 +02:00
Felix
07d04280be
examples : clean up common code (#1871)
move some utility functions into common.h
2024-02-19 10:50:15 +02:00
Georgi Gerganov
551529290d
talk-llama : sync llama.cpp 2024-02-12 10:39:58 +02:00
dscripka
a6fb6ab597
examples : added audio_ctx argument to main and server (#1857)
* added audio_ctx argument to main and server examples

* Better default value

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* better default value (again)

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-02-12 09:19:07 +02:00
Georgi Gerganov
f273e66dc6
examples : initialize context params properly (#1852) 2024-02-11 16:39:12 +02:00
Georgi Gerganov
02b4c52c12
talk-llama : sync llama.cpp 2024-02-10 10:10:59 +02:00
Valentin Gosu
80e8a2ea39
server : allow CORS request with authorization headers (#1850)
Whisper plugin in Obsidian requires an API key which is
then sent as an authorization header.
However, the presence of an authorization header requires
a CORS Preflight, so both the OPTIONS method and
the Access-Control-Allow-Headers: authorization must be
handled.
2024-02-09 17:42:41 +02:00
Neuman Vong
19f8048139
whisper.android : how to build with CLBlast (#1809)
* FetchContent

* OpenCL

* Documentation and make optional

* Specify GGML build options in build.gradle

* Use gradle properties

* @ggerganov

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* @gpokat

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-02-09 17:39:05 +02:00
Georgi Gerganov
434b8f3b96
talk-llama : stream response (#1121) 2024-02-06 19:56:12 +02:00
Georgi Gerganov
7a74e929c8
sync : ggml (#0) 2024-01-30 21:30:26 +02:00
JacobLinCool
ae5c4f7340
common : fix wav buffer detection (#1819) 2024-01-30 19:35:08 +02:00
JacobLinCool
baa30bacdb
server : add fields to verbose_json response (#1802)
* server: include additional fields in the verbose_json response as OpenAI does

* server: show request examples on home page

* server: todo note for compression_ratio and no_speech_prob

* server: add simple demo form to the homepage
2024-01-30 14:15:55 +02:00
Georgi Gerganov
e72e4158de
talk-llama : sync llama.cpp 2024-01-28 19:44:10 +02:00
Georgi Gerganov
52cce82493
common : fix input buffer check (#1812) 2024-01-27 17:33:09 +02:00
Georgi Gerganov
ef3c9ed9eb
talk-llama : sync llama.cpp 2024-01-27 17:24:53 +02:00
Michael Rienstra
4bbb60efce
docs : make model options / model install methods clearer (#1806)
* Make models more "discoverable"

* Clean up code block language identifiers

* make 3 options clearer

* undo Prettier formatter change

* docs: `$` shell prompt, consistently

* docs: minor changes
2024-01-26 17:39:54 +02:00
Neuman Vong
d6b9be21d7
whisper.android : return output from benchmarks (#1785)
Benchmarks are failing because JNI expects a jstring and the benchmarks
are missing a return statement (i.e., returning null). The functions
actually build a jstring but don't return it, so this seems to have been
an oversight.

This patch returns the jstring and now the benchmarks run successfully.

Fixes #1783.
2024-01-19 16:17:38 +02:00
Ryan Hitchman
c0329acde8
server : implement "verbose_json" format with token details (#1781)
* examples/server: implement "verbose_json" format with token details.

This is intended to mirror the format of openai's Python
whisper.transcribe() return values.

* server: don't write WAV to a temporary file if not converting

* server: use std::lock_guard instead of manual lock/unlock
2024-01-18 22:58:42 +02:00
Georgi Gerganov
1f50a7d29f
sync : llama.cpp 2024-01-17 21:23:33 +02:00
Benjamin Heiniger
f6614155e4
talk-llama : optional wake-up command and audio confirmation (#1765)
* talk-llama: add optional wake-word detection from command

* talk-llama: add optional audio confirmation before generating answer

* talk-llama: fix small formatting issue in output

* talk-llama.cpp: fix Windows build
2024-01-16 15:52:01 +02:00
Przemysław Pawełczyk
f5f159c320
server : fix building and simplify lib deps on Windows (#1772)
* make : fix server example building on MSYS2 environments (Windows)

It was not working since commit eff3570f78
when server was introduced.

* cmake : simplify server example lib deps on Windows

server uses httplib::Server, not httplib::SSLServer, so there is no need
to mention cryptographic libraries in target_link_libraries.
Winsock (ws2_32) suffices here.

Also use plain library names like we use in other places.
2024-01-15 15:48:13 +02:00
Georgi Gerganov
6ebba525f1
talk-llama : sync llama.cpp 2024-01-14 18:08:20 +02:00
Georgi Gerganov
2a5874441d
talk-llama : llama.cpp 2024-01-14 11:06:28 +02:00
Georgi Gerganov
d08445c9ad
sync : ggml 2024-01-14 10:55:18 +02:00
Georgi Gerganov
f001a3b7b6
talk-llama : sync llama.cpp 2024-01-14 00:13:17 +02:00
RhinoDevel
db078a9ba8
talk-llama : add optional CLI arg to set the bot name (#1764) 2024-01-13 20:51:35 +02:00
james wolf
a13a7da5ad
examples : add python example for transcription (#1744)
* rebase and add simple python interface

* moved python files to examples/python
2024-01-13 19:37:18 +02:00
Georgi Gerganov
40ae0962f4
talk-llama : sync llama.cpp 2024-01-12 22:04:51 +02:00
George Hindle
fbcb52d3cd
server : add more parameters to server api (#1754)
* feat(server): add more parameters to server api

* fix(server): reset params to original parsed values for each request
2024-01-12 13:42:52 +02:00
George Hindle
f7908f9bb8
params : don't compute timestamps when not printing them (#1755) 2024-01-12 13:24:38 +02:00
Georgi Gerganov
00b7a4be02
talk-llama : sync llama.cpp 2024-01-11 22:10:10 +02:00
Georgi Gerganov
32e71a1861
sync : ggml 2024-01-11 21:54:17 +02:00
Georgi Gerganov
9c857cf280
sync : llama.cpp 2024-01-11 21:50:01 +02:00
RhinoDevel
bcc1658cd0
talk-llama : add optional Piper TTS support (#1749)
Add commented-out command to optionally use Piper (https://github.com/rhasspy/piper) as text-to-speech solution for the talk-llama example. Piper voices sound almost like real people which is a big improvement (e.g.) from something like espeak.
2024-01-10 16:15:28 +02:00
Emmanuel Schmidbauer
c46886f599
server : add request path option(#1741) 2024-01-08 22:39:51 +00:00
Georgi Gerganov
29f78392c1
main : add cli option to disable system prints (#1740) 2024-01-08 16:41:28 +02:00
Georgi Gerganov
022756a872
server : fix server temperature + add temperature_inc (#1729)
* server : fix server temperature + add temperature_inc

* server : change dashes to underscores in parameter names
2024-01-07 13:35:14 +02:00
Georgi Gerganov
3b8c2dff57
talk-llama : sync latest llama.cpp 2024-01-06 17:22:57 +02:00