Commit Graph

753 Commits

Author SHA1 Message Date
Georgi Gerganov
2b4160af29
whisper : add description of ggml_mul_mat_pad 2023-09-14 15:37:10 +03:00
Georgi Gerganov
f36554382a
whisper : add comment for disabling mul-mat padding 2023-09-14 15:25:19 +03:00
Georgi Gerganov
c46167f8c5
bench : fix uninitialized vars 2023-09-14 15:19:27 +03:00
Georgi Gerganov
af947cb72e
whisper : add ggml_mul_mat_pad 2023-09-14 15:16:22 +03:00
Georgi Gerganov
e81c67a125
bench : start benching the decoder 2023-09-14 10:06:14 +03:00
Georgi Gerganov
f408c64564
bench : fix timings by running a pre-heat 2023-09-13 23:03:25 +03:00
Georgi Gerganov
d863f725a1
coreml : add code to toggle Core ML config (CPU, ANE, GPU) 2023-09-13 22:51:10 +03:00
Georgi Gerganov
d37f56e7a9
ios : update submodule 2023-09-13 21:31:29 +03:00
Georgi Gerganov
23277d21ce
readme : add Metal info 2023-09-13 20:54:03 +03:00
Georgi Gerganov
ecb23fb1eb
metal : sync latest llama.cpp kernels 2023-09-13 20:44:05 +03:00
Georgi Gerganov
8e8daa8451
metal : speed-up KQ multiplication 2023-09-13 19:59:16 +03:00
Georgi Gerganov
16db4da3f1
swiftui : fix build 2023-09-13 19:49:11 +03:00
Georgi Gerganov
257d7942af
ios : add Metal support 2023-09-13 19:45:12 +03:00
Georgi Gerganov
181bb8cb28
objc : fix build (no Metal yet) 2023-09-13 18:54:41 +03:00
Georgi Gerganov
796f84cd95
whisper : add <functional> header 2023-09-13 13:35:42 +03:00
Georgi Gerganov
77f4bf49c8
cmake : update to support Metal build 2023-09-13 13:34:51 +03:00
Georgi Gerganov
b6f09669a2
whisper : factor out alloc init in a function 2023-09-13 12:51:52 +03:00
Georgi Gerganov
254b687239
whisper : add whisper_allocr to wrap ggml_allocr 2023-09-13 11:58:19 +03:00
Georgi Gerganov
b19888cfb4
ggml-alloc : try to make CI happy by reducing vram to 128GB 2023-09-13 11:57:46 +03:00
Georgi Gerganov
905c944143
ggml : use simpler ggml_bytes() implementation 2023-09-13 11:39:09 +03:00
Georgi Gerganov
3074a7ff14
whisper : offload the Encoder to Metal 2023-09-13 00:09:44 +03:00
Georgi Gerganov
ec9a7db74c
whisper : remove ggml_repeat in the encoder 2023-09-12 20:34:32 +03:00
Georgi Gerganov
cd476375b4
metal : run "cross" step on the GPU 2023-09-12 20:11:13 +03:00
Georgi Gerganov
9fdd415367
ggml : fix ggml_nbytes (probably temp solution) 2023-09-12 20:10:53 +03:00
Georgi Gerganov
79a88057bd
metal : add multi-decoder support 2023-09-12 19:33:29 +03:00
Georgi Gerganov
fbc9ddc582
metal : decoder works on GPU! 2023-09-12 19:23:30 +03:00
Georgi Gerganov
3b9979a373
ci : try to debug vmem issue 2023-09-12 14:08:48 +03:00
Georgi Gerganov
de94c783ee
Merge branch 'master' into metal-and-alloc 2023-09-12 14:02:43 +03:00
Georgi Gerganov
3fec2119e6
whisper : fix bench regression + fix performance when using CPU BLAS (#1275)
* whisper : fix bench regression

* ggml : use sched_yield when using BLAS + add comment
2023-09-12 13:54:04 +03:00
Georgi Gerganov
d3b2dd4955
whisper : initial Metal version 2023-09-11 16:23:31 +03:00
Georgi Gerganov
4845b9ed09
whisper.android : try to fix build 2023-09-11 15:19:21 +03:00
Georgi Gerganov
2770d46ef5
whisper : refactor ggml-alloc init 2023-09-11 15:04:33 +03:00
Georgi Gerganov
4d9acc60c3
ci : see if this is causing the crash 2023-09-11 14:42:25 +03:00
Georgi Gerganov
06d1d2836b
extra : update sync-ggml.sh script to also sync ggml-alloc 2023-09-10 22:45:38 +03:00
Georgi Gerganov
9a78b72246
ios : update submodule 2023-09-10 22:36:50 +03:00
Georgi Gerganov
794e8fe0ea
build : fix ggml-alloc 2023-09-10 22:19:39 +03:00
Georgi Gerganov
fa672b46e6
whisper : CoreML support ggml-alloc 2023-09-10 21:57:04 +03:00
Georgi Gerganov
af6f67b251
whisper : ggml-alloc is now supported 2023-09-10 20:09:17 +03:00
Georgi Gerganov
bed5ad69dd
whisper : allocate encoder and decoder using ggml-alloc 2023-09-10 19:50:34 +03:00
Georgi Gerganov
949ab6328d
whisper : factor out graph builds 2023-09-10 19:23:06 +03:00
Georgi Gerganov
fbc3f8033e
metal : init 2023-09-10 18:38:34 +03:00
bobqianic
9b14418863
whisper : faster beam_search sampling via reduced KV cache copies (#1243)
* Faster `beam_search` sampling

Refine the KV cache update logic for more intelligent and efficient updating.

* Faster `whisper_sample_token_topk`

* Update whisper.cpp

* Update whisper.cpp

* Update whisper.cpp

* Reduce `memory allocation`

* Add `pointer swapping`

* Fixed some bugs

* Update whisper.cpp

* Apply suggestions from code review

* Updated the logic for determining `two-copy`

* Updated the logic for determining `two-copy` v2

* whisper : add debug logs + coding style

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-10 16:04:27 +03:00
Nicholas Albion
6ddc727fac
java : fixed signing of java artifact using gradle (#1267)
* --stacktrace signMavenJavaPublication

* added temporary step "Debug gradle signing"

* cd bindings/java

* use GPG_PRIVATE_KEY and GPG_PASSPHRASE

* use secrets.GPG_PRIVATE_KEY and GPG_PASSPHRASE
2023-09-09 18:55:51 +03:00
Georgi Gerganov
acb5278cc8
ci : try to fix gradle action (#1265) 2023-09-08 20:50:15 +03:00
Georgi Gerganov
0839209cab
gitignore : update 2023-09-08 19:45:28 +03:00
Georgi Gerganov
b39809668a
sync : ggml (HBM + Metal + style) (#1264) 2023-09-08 17:58:31 +03:00
Georgi Gerganov
3e9edc6845
ci : upgrade gradle to 2.4.2 (#1263)
* ci : upgrade gradle to 2.4.2

* cmake : add comment (#1129)
2023-09-08 17:58:14 +03:00
Georgi Gerganov
bfc73f1fa2
sync : ggml (CUDA faster rope) 2023-09-08 15:01:26 +03:00
Georgi Gerganov
f00c9bba33
cmake : normalize case (#1129) 2023-09-08 14:50:03 +03:00
Przemysław Pawełczyk
b55b505690
build : do not use _GNU_SOURCE gratuitously (#1129)
* Do not use _GNU_SOURCE gratuitously.

What is needed to build whisper.cpp and examples is availability of
stuff defined in The Open Group Base Specifications Issue 6
(https://pubs.opengroup.org/onlinepubs/009695399/) known also as
Single Unix Specification v3 (SUSv3) or POSIX.1-2001 + XSI extensions,
plus some stuff from BSD that is not specified in POSIX.1.

Well, that was true until NUMA support was added recently in ggml,
so enable GNU libc extensions for Linux builds to cover that.

There is no need to penalize musl libc which simply follows standards.

Not having feature test macros in source code gives greater flexibility
to those wanting to reuse it in 3rd party app, as they can build it with
minimal FTM (_XOPEN_SOURCE=600) or other FTM depending on their needs.

It builds without issues in Alpine (musl libc), Ubuntu (glibc), MSYS2.

* examples : include SDL headers before other headers

Avoid macOS build error when _DARWIN_C_SOURCE is not defined, brought by
SDL2 relying on Darwin extension memset_pattern4/8/16 (from string.h).

* make : enable BSD extensions for DragonFlyBSD to expose RLIMIT_MEMLOCK

* make : use BSD-specific FTMs to enable alloca on BSDs

* make : fix OpenBSD build by exposing newer POSIX definitions

* cmake : follow recent FTM improvements from Makefile
2023-09-07 12:36:14 +03:00