Commit Graph

17 Commits

Author SHA1 Message Date
Georgi Gerganov
794b162a46
whisper : add integer quantization support (#540)
* whisper : add integer quantization support

* examples : add common-ggml + prepare to add "quantize" tool

* whisper : quantization tool ready

* whisper : fix F32 support

* whisper : try to fix shared lib linkage

* wasm : update quantized models to Q5

* bench.wasm : remove "medium" button

* bench.wasm : fix custom model button

* ggml : add Q5_0 and Q5_1 WASM SIMD

* wasm : add quantized models to all WASM examples

* wasm : bump DB version number to 2

* talk-llama : update example to latest llama.cpp

* node : increase test timeout to 10s

* readme : add information for model quantization

* wasm : add links to other examples
2023-04-30 18:51:57 +03:00
Georgi Gerganov
5fd1bdd7fc
whisper : add GPU support via cuBLAS (#834)
* make : add WHISPER_CUBLAS

* make : fix CUBLAS build

* whisper : disable Flash Attention + adjust memory buffers

* whisper : remove old commented code

* readme : add cuBLAS instructions

* cmake : add WHISPER_CUBLAS option

* gitignore : ignore build-cublas
2023-04-30 12:14:33 +03:00
DGdev91
001083a769
talk, talk-llama : add basic example script for eleven-labs tts (#728) 2023-04-14 19:53:58 +03:00
Georgi Gerganov
4a0deb8b1e
talk-llama : add new example + sync ggml from llama.cpp (#664)
* talk-llama : talk with LLaMA AI

* talk.llama : disable EOS token

* talk-llama : add README instructions

* ggml : fix build in debug
2023-03-27 21:00:32 +03:00
Georgi Gerganov
1beff6f66d
models : change HF hosting from dataset to model 2023-03-22 20:44:56 +02:00
Georgi Gerganov
09d7d2b68e
examples : refactor in order to reuse code and reduce duplication (#482)
* examples : refactor common code into a library

* examples : refactor common SDL code into a library

* make : update Makefile to use common libs

* common : fix MSVC M_PI ..

* addon.node : link common lib
2023-02-15 19:28:10 +02:00
Syahmi Azhar
1512545149
whisper : add loader class to allow loading from buffer and others (#353)
* whisper : add loader to allow loading from other than file

* whisper : rename whisper_init to whisper_init_from_file

* whisper : add whisper_init_from_buffer

* android : Delete local.properties

* android : load models directly from assets

* whisper : adding <stddef.h> needed for size_t + code style

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-01-08 13:03:33 +02:00
Andy Maloney
84c6b42e65
cmake : update to 3.19 (#351)
- update from 3.0 (from 2014) to 3.19 (from 2020)
- move some global setting onto the targets (through a cmake include)
2023-01-05 21:22:48 +02:00
Andy Maloney
331c0bbddc
examples : fix memory leak on failure to load gpt2 model (#323) 2022-12-23 20:19:07 +02:00
Andy Maloney
dc90efd504
examples : small code cleanups (#322)
- remove unnecessary initialization of string to ""
- use empty() instead of checking size()
- use emplace_back instead of push_back
- use nullptr instead of NULL
- remove unnecessary call to .data() on string
- use character overload of find_first_of() instead of passing a string
2022-12-23 20:18:51 +02:00
Georgi Gerganov
99da1e5cc8
cmake : enable and fix -Wall -Wextra -Wpedantic C++ warnings 2022-12-19 20:45:08 +02:00
Georgi Gerganov
a613f16aec
talk : improve prompting 2022-12-12 23:44:36 +02:00
Georgi Gerganov
aa6adda26e
talk : make compatible with c++11 (part 2) 2022-12-11 20:34:04 +02:00
Georgi Gerganov
444349f4ec
talk : make compatible with c++11 2022-12-11 20:19:17 +02:00
Georgi Gerganov
85c9ac18b5
Update README.md 2022-12-10 16:54:57 +02:00
Georgi Gerganov
b7c85d1ea6 talk : fix build for MSVC 2022-12-10 16:51:58 +02:00
Georgi Gerganov
3b1aacbe6d talk : talk with AI in the terminal 2022-12-10 16:51:58 +02:00