Commit Graph

754 Commits

Author SHA1 Message Date
Georgi Gerganov
0d5e4cdc36
whisper : clean-up ggml_mul_mat_pad 2023-09-14 17:28:13 +03:00
Georgi Gerganov
2b4160af29
whisper : add description of ggml_mul_mat_pad 2023-09-14 15:37:10 +03:00
Georgi Gerganov
f36554382a
whisper : add comment for disabling mul-mat padding 2023-09-14 15:25:19 +03:00
Georgi Gerganov
c46167f8c5
bench : fix uninitialized vars 2023-09-14 15:19:27 +03:00
Georgi Gerganov
af947cb72e
whisper : add ggml_mul_mat_pad 2023-09-14 15:16:22 +03:00
Georgi Gerganov
e81c67a125
bench : start benching the decoder 2023-09-14 10:06:14 +03:00
Georgi Gerganov
f408c64564
bench : fix timings by running a pre-heat 2023-09-13 23:03:25 +03:00
Georgi Gerganov
d863f725a1
coreml : add code to toggle Core ML config (CPU, ANE, GPU) 2023-09-13 22:51:10 +03:00
Georgi Gerganov
d37f56e7a9
ios : update submodule 2023-09-13 21:31:29 +03:00
Georgi Gerganov
23277d21ce
readme : add Metal info 2023-09-13 20:54:03 +03:00
Georgi Gerganov
ecb23fb1eb
metal : sync latest llama.cpp kernels 2023-09-13 20:44:05 +03:00
Georgi Gerganov
8e8daa8451
metal : speed-up KQ multiplication 2023-09-13 19:59:16 +03:00
Georgi Gerganov
16db4da3f1
swiftui : fix build 2023-09-13 19:49:11 +03:00
Georgi Gerganov
257d7942af
ios : add Metal support 2023-09-13 19:45:12 +03:00
Georgi Gerganov
181bb8cb28
objc : fix build (no Metal yet) 2023-09-13 18:54:41 +03:00
Georgi Gerganov
796f84cd95
whisper : add <functional> header 2023-09-13 13:35:42 +03:00
Georgi Gerganov
77f4bf49c8
cmake : update to support Metal build 2023-09-13 13:34:51 +03:00
Georgi Gerganov
b6f09669a2
whisper : factor out alloc init in a function 2023-09-13 12:51:52 +03:00
Georgi Gerganov
254b687239
whisper : add whisper_allocr to wrap ggml_allocr 2023-09-13 11:58:19 +03:00
Georgi Gerganov
b19888cfb4
ggml-alloc : try to make CI happy by reducing vram to 128GB 2023-09-13 11:57:46 +03:00
Georgi Gerganov
905c944143
ggml : use simpler ggml_bytes() implementation 2023-09-13 11:39:09 +03:00
Georgi Gerganov
3074a7ff14
whisper : offload the Encoder to Metal 2023-09-13 00:09:44 +03:00
Georgi Gerganov
ec9a7db74c
whisper : remove ggml_repeat in the encoder 2023-09-12 20:34:32 +03:00
Georgi Gerganov
cd476375b4
metal : run "cross" step on the GPU 2023-09-12 20:11:13 +03:00
Georgi Gerganov
9fdd415367
ggml : fix ggml_nbytes (probably temp solution) 2023-09-12 20:10:53 +03:00
Georgi Gerganov
79a88057bd
metal : add multi-decoder support 2023-09-12 19:33:29 +03:00
Georgi Gerganov
fbc9ddc582
metal : decoder works on GPU! 2023-09-12 19:23:30 +03:00
Georgi Gerganov
3b9979a373
ci : try to debug vmem issue 2023-09-12 14:08:48 +03:00
Georgi Gerganov
de94c783ee
Merge branch 'master' into metal-and-alloc 2023-09-12 14:02:43 +03:00
Georgi Gerganov
3fec2119e6
whisper : fix bench regression + fix performance when using CPU BLAS (#1275)
* whisper : fix bench regression

* ggml : use sched_yield when using BLAS + add comment
2023-09-12 13:54:04 +03:00
Georgi Gerganov
d3b2dd4955
whisper : initial Metal version 2023-09-11 16:23:31 +03:00
Georgi Gerganov
4845b9ed09
whisper.android : try to fix build 2023-09-11 15:19:21 +03:00
Georgi Gerganov
2770d46ef5
whisper : refactor ggml-alloc init 2023-09-11 15:04:33 +03:00
Georgi Gerganov
4d9acc60c3
ci : see if this is causing the crash 2023-09-11 14:42:25 +03:00
Georgi Gerganov
06d1d2836b
extra : update sync-ggml.sh script to also sync ggml-alloc 2023-09-10 22:45:38 +03:00
Georgi Gerganov
9a78b72246
ios : update submodule 2023-09-10 22:36:50 +03:00
Georgi Gerganov
794e8fe0ea
build : fix ggml-alloc 2023-09-10 22:19:39 +03:00
Georgi Gerganov
fa672b46e6
whisper : CoreML support ggml-alloc 2023-09-10 21:57:04 +03:00
Georgi Gerganov
af6f67b251
whisper : ggml-alloc is now supported 2023-09-10 20:09:17 +03:00
Georgi Gerganov
bed5ad69dd
whisper : allocate encoder and decoder using ggml-alloc 2023-09-10 19:50:34 +03:00
Georgi Gerganov
949ab6328d
whisper : factor out graph builds 2023-09-10 19:23:06 +03:00
Georgi Gerganov
fbc3f8033e
metal : init 2023-09-10 18:38:34 +03:00
bobqianic
9b14418863
whisper : faster beam_search sampling via reduced KV cache copies (#1243)
* Faster `beam_search` sampling

Refine the KV cache update logic for more intelligent and efficient updating.

* Faster `whisper_sample_token_topk`

* Update whisper.cpp

* Update whisper.cpp

* Update whisper.cpp

* Reduce `memory allocation`

* Add `pointer swapping`

* Fixed some bugs

* Update whisper.cpp

* Apply suggestions from code review

* Updated the logic for determining `two-copy`

* Updated the logic for determining `two-copy` v2

* whisper : add debug logs + coding style

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-10 16:04:27 +03:00
Nicholas Albion
6ddc727fac
java : fixed signing of java artifact using gradle (#1267)
* --stacktrace signMavenJavaPublication

* added temporary step "Debug gradle signing"

* cd bindings/java

* use GPG_PRIVATE_KEY and GPG_PASSPHRASE

* use secrets.GPG_PRIVATE_KEY and GPG_PASSPHRASE
2023-09-09 18:55:51 +03:00
Georgi Gerganov
acb5278cc8
ci : try to fix gradle action (#1265) 2023-09-08 20:50:15 +03:00
Georgi Gerganov
0839209cab
gitignore : update 2023-09-08 19:45:28 +03:00
Georgi Gerganov
b39809668a
sync : ggml (HBM + Metal + style) (#1264) 2023-09-08 17:58:31 +03:00
Georgi Gerganov
3e9edc6845
ci : upgrade gradle to 2.4.2 (#1263)
* ci : upgrade gradle to 2.4.2

* cmake : add comment (#1129)
2023-09-08 17:58:14 +03:00
Georgi Gerganov
bfc73f1fa2
sync : ggml (CUDA faster rope) 2023-09-08 15:01:26 +03:00
Georgi Gerganov
f00c9bba33
cmake : noramlize case (#1129) 2023-09-08 14:50:03 +03:00