Georgi Gerganov
3ac0558009
ios : update SPM package
2023-09-15 12:13:33 +03:00
Georgi Gerganov
a1664574fe
bench : variable n_past
2023-09-14 22:41:41 +03:00
Georgi Gerganov
bfcb2a2ab9
metal : remove the "concurrent" flag
2023-09-14 18:04:42 +03:00
Georgi Gerganov
0d5e4cdc36
whisper : clean-up ggml_mul_mat_pad
2023-09-14 17:28:13 +03:00
Georgi Gerganov
2b4160af29
whisper : add description of ggml_mul_mat_pad
2023-09-14 15:37:10 +03:00
Georgi Gerganov
f36554382a
whisper : add comment for disabling mul-mat padding
2023-09-14 15:25:19 +03:00
Georgi Gerganov
c46167f8c5
bench : fix uninitialized vars
2023-09-14 15:19:27 +03:00
Georgi Gerganov
af947cb72e
whisper : add ggml_mul_mat_pad
2023-09-14 15:16:22 +03:00
Georgi Gerganov
e81c67a125
bench : start benching the decoder
2023-09-14 10:06:14 +03:00
Georgi Gerganov
f408c64564
bench : fix timings by running a pre-heat
2023-09-13 23:03:25 +03:00
Georgi Gerganov
d863f725a1
coreml : add code to toggle Core ML config (CPU, ANE, GPU)
2023-09-13 22:51:10 +03:00
Georgi Gerganov
d37f56e7a9
ios : update submodule
2023-09-13 21:31:29 +03:00
Georgi Gerganov
23277d21ce
readme : add Metal info
2023-09-13 20:54:03 +03:00
Georgi Gerganov
ecb23fb1eb
metal : sync latest llama.cpp kernels
2023-09-13 20:44:05 +03:00
Georgi Gerganov
8e8daa8451
metal : speed-up KQ multiplication
2023-09-13 19:59:16 +03:00
Georgi Gerganov
16db4da3f1
swiftui : fix build
2023-09-13 19:49:11 +03:00
Georgi Gerganov
257d7942af
ios : add Metal support
2023-09-13 19:45:12 +03:00
Georgi Gerganov
181bb8cb28
objc : fix build (no Metal yet)
2023-09-13 18:54:41 +03:00
Georgi Gerganov
796f84cd95
whisper : add <functional> header
2023-09-13 13:35:42 +03:00
Georgi Gerganov
77f4bf49c8
cmake : update to support Metal build
2023-09-13 13:34:51 +03:00
Georgi Gerganov
b6f09669a2
whisper : factor out alloc init in a function
2023-09-13 12:51:52 +03:00
Georgi Gerganov
254b687239
whisper : add whisper_allocr to wrap ggml_allocr
2023-09-13 11:58:19 +03:00
Georgi Gerganov
b19888cfb4
ggml-alloc : try to make CI happy by reducing vram to 128GB
2023-09-13 11:57:46 +03:00
Georgi Gerganov
905c944143
ggml : use simpler ggml_bytes() implementation
2023-09-13 11:39:09 +03:00
Georgi Gerganov
3074a7ff14
whisper : offload the Encoder to Metal
2023-09-13 00:09:44 +03:00
Georgi Gerganov
ec9a7db74c
whisper : remove ggml_repeat in the encoder
2023-09-12 20:34:32 +03:00
Georgi Gerganov
cd476375b4
metal : run "cross" step on the GPU
2023-09-12 20:11:13 +03:00
Georgi Gerganov
9fdd415367
ggml : fix ggml_nbytes (probably temp solution)
2023-09-12 20:10:53 +03:00
Georgi Gerganov
79a88057bd
metal : add multi-decoder support
2023-09-12 19:33:29 +03:00
Georgi Gerganov
fbc9ddc582
metal : decoder works on GPU!
2023-09-12 19:23:30 +03:00
Georgi Gerganov
3b9979a373
ci : try to debug vmem issue
2023-09-12 14:08:48 +03:00
Georgi Gerganov
de94c783ee
Merge branch 'master' into metal-and-alloc
2023-09-12 14:02:43 +03:00
Georgi Gerganov
3fec2119e6
whisper : fix bench regression + fix performance when using CPU BLAS (#1275)
* whisper : fix bench regression
* ggml : use sched_yield when using BLAS + add comment
2023-09-12 13:54:04 +03:00
Georgi Gerganov
d3b2dd4955
whisper : initial Metal version
2023-09-11 16:23:31 +03:00
Georgi Gerganov
4845b9ed09
whisper.android : try to fix build
2023-09-11 15:19:21 +03:00
Georgi Gerganov
2770d46ef5
whisper : refactor ggml-alloc init
2023-09-11 15:04:33 +03:00
Georgi Gerganov
4d9acc60c3
ci : see if this is causing the crash
2023-09-11 14:42:25 +03:00
Georgi Gerganov
06d1d2836b
extra : update sync-ggml.sh script to also sync ggml-alloc
2023-09-10 22:45:38 +03:00
Georgi Gerganov
9a78b72246
ios : update submodule
2023-09-10 22:36:50 +03:00
Georgi Gerganov
794e8fe0ea
build : fix ggml-alloc
2023-09-10 22:19:39 +03:00
Georgi Gerganov
fa672b46e6
whisper : CoreML support ggml-alloc
2023-09-10 21:57:04 +03:00
Georgi Gerganov
af6f67b251
whisper : ggml-alloc is now supported
2023-09-10 20:09:17 +03:00
Georgi Gerganov
bed5ad69dd
whisper : allocate encoder and decoder using ggml-alloc
2023-09-10 19:50:34 +03:00
Georgi Gerganov
949ab6328d
whisper : factor out graph builds
2023-09-10 19:23:06 +03:00
Georgi Gerganov
fbc3f8033e
metal : init
2023-09-10 18:38:34 +03:00
bobqianic
9b14418863
whisper : faster beam_search sampling via reduced KV cache copies (#1243)
* Faster `beam_search` sampling
Refine the KV cache update logic for more intelligent and efficient updating.
* Faster `whisper_sample_token_topk`
* Update whisper.cpp
* Update whisper.cpp
* Update whisper.cpp
* Reduce `memory allocation`
* Add `pointer swapping`
* Fixed some bugs
* Update whisper.cpp
* Apply suggestions from code review
* Updated the logic for determining `two-copy`
* Updated the logic for determining `two-copy` v2
* whisper : add debug logs + coding style
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-10 16:04:27 +03:00
Nicholas Albion
6ddc727fac
java : fixed signing of java artifact using gradle (#1267)
* --stacktrace signMavenJavaPublication
* added temporary step "Debug gradle signing"
* cd bindings/java
* use GPG_PRIVATE_KEY and GPG_PASSPHRASE
* use secrets.GPG_PRIVATE_KEY and GPG_PASSPHRASE
2023-09-09 18:55:51 +03:00
Georgi Gerganov
acb5278cc8
ci : try to fix gradle action (#1265)
2023-09-08 20:50:15 +03:00
Georgi Gerganov
0839209cab
gitignore : update
2023-09-08 19:45:28 +03:00
Georgi Gerganov
b39809668a
sync : ggml (HBM + Metal + style) (#1264)
2023-09-08 17:58:31 +03:00