734 Commits

Author SHA1 Message Date
Georgi Gerganov
905c944143
ggml : use simpler ggml_bytes() implementation 2023-09-13 11:39:09 +03:00
Georgi Gerganov
3074a7ff14
whisper : offload the Encoder to Metal 2023-09-13 00:09:44 +03:00
Georgi Gerganov
ec9a7db74c
whisper : remove ggml_repeat in the encoder 2023-09-12 20:34:32 +03:00
Georgi Gerganov
cd476375b4
metal : run "cross" step on the GPU 2023-09-12 20:11:13 +03:00
Georgi Gerganov
9fdd415367
ggml : fix ggml_nbytes (probably temp solution) 2023-09-12 20:10:53 +03:00
Georgi Gerganov
79a88057bd
metal : add multi-decoder support 2023-09-12 19:33:29 +03:00
Georgi Gerganov
fbc9ddc582
metal : decoder works on GPU! 2023-09-12 19:23:30 +03:00
Georgi Gerganov
3b9979a373
ci : try to debug vmem issue 2023-09-12 14:08:48 +03:00
Georgi Gerganov
de94c783ee
Merge branch 'master' into metal-and-alloc 2023-09-12 14:02:43 +03:00
Georgi Gerganov
3fec2119e6
whisper : fix bench regression + fix performance when using CPU BLAS (#1275)
* whisper : fix bench regression

* ggml : use sched_yield when using BLAS + add comment
2023-09-12 13:54:04 +03:00
Georgi Gerganov
d3b2dd4955
whisper : initial Metal version 2023-09-11 16:23:31 +03:00
Georgi Gerganov
4845b9ed09
whisper.android : try to fix build 2023-09-11 15:19:21 +03:00
Georgi Gerganov
2770d46ef5
whisper : refactor ggml-alloc init 2023-09-11 15:04:33 +03:00
Georgi Gerganov
4d9acc60c3
ci : see if this is causing the crash 2023-09-11 14:42:25 +03:00
Georgi Gerganov
06d1d2836b
extra : update sync-ggml.sh script to also sync ggml-alloc 2023-09-10 22:45:38 +03:00
Georgi Gerganov
9a78b72246
ios : update submodule 2023-09-10 22:36:50 +03:00
Georgi Gerganov
794e8fe0ea
build : fix ggml-alloc 2023-09-10 22:19:39 +03:00
Georgi Gerganov
fa672b46e6
whisper : CoreML support ggml-alloc 2023-09-10 21:57:04 +03:00
Georgi Gerganov
af6f67b251
whisper : ggml-alloc is now supported 2023-09-10 20:09:17 +03:00
Georgi Gerganov
bed5ad69dd
whisper : allocate encoder and decoder using ggml-alloc 2023-09-10 19:50:34 +03:00
Georgi Gerganov
949ab6328d
whisper : factor out graph builds 2023-09-10 19:23:06 +03:00
Georgi Gerganov
fbc3f8033e
metal : init 2023-09-10 18:38:34 +03:00
bobqianic
9b14418863
whisper : faster beam_search sampling via reduced KV cache copies (#1243)
* Faster `beam_search` sampling

Refine the KV cache update logic for more intelligent and efficient updating.

* Faster `whisper_sample_token_topk`

* Update whisper.cpp

* Update whisper.cpp

* Update whisper.cpp

* Reduce `memory allocation`

* Add `pointer swapping`

* Fixed some bugs

* Update whisper.cpp

* Apply suggestions from code review

* Updated the logic for determining `two-copy`

* Updated the logic for determining `two-copy` v2

* whisper : add debug logs + coding style

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-10 16:04:27 +03:00
Nicholas Albion
6ddc727fac
java : fixed signing of java artifact using gradle (#1267)
* --stacktrace signMavenJavaPublication

* added temporary step "Debug gradle signing"

* cd bindings/java

* use GPG_PRIVATE_KEY and GPG_PASSPHRASE

* use secrets.GPG_PRIVATE_KEY and GPG_PASSPHRASE
2023-09-09 18:55:51 +03:00
Georgi Gerganov
acb5278cc8
ci : try to fix gradle action (#1265) 2023-09-08 20:50:15 +03:00
Georgi Gerganov
0839209cab
gitignore : update 2023-09-08 19:45:28 +03:00
Georgi Gerganov
b39809668a
sync : ggml (HBM + Metal + style) (#1264) 2023-09-08 17:58:31 +03:00
Georgi Gerganov
3e9edc6845
ci : upgrade gradle to 2.4.2 (#1263)
* ci : upgrade gradle to 2.4.2

* cmake : add comment (#1129)
2023-09-08 17:58:14 +03:00
Georgi Gerganov
bfc73f1fa2
sync : ggml (CUDA faster rope) 2023-09-08 15:01:26 +03:00
Georgi Gerganov
f00c9bba33
cmake : normalize case (#1129) 2023-09-08 14:50:03 +03:00
Przemysław Pawełczyk
b55b505690
build : do not use _GNU_SOURCE gratuitously (#1129)
* Do not use _GNU_SOURCE gratuitously.

What is needed to build whisper.cpp and examples is availability of
stuff defined in The Open Group Base Specifications Issue 6
(https://pubs.opengroup.org/onlinepubs/009695399/) known also as
Single Unix Specification v3 (SUSv3) or POSIX.1-2001 + XSI extensions,
plus some stuff from BSD that is not specified in POSIX.1.

Well, that was true until NUMA support was added recently in ggml,
so enable GNU libc extensions for Linux builds to cover that.

There is no need to penalize musl libc which simply follows standards.

Not having feature test macros in source code gives greater flexibility
to those wanting to reuse it in 3rd party app, as they can build it with
minimal FTM (_XOPEN_SOURCE=600) or other FTM depending on their needs.

It builds without issues in Alpine (musl libc), Ubuntu (glibc), MSYS2.

* examples : include SDL headers before other headers

Avoid macOS build error when _DARWIN_C_SOURCE is not defined, brought by
SDL2 relying on Darwin extension memset_pattern4/8/16 (from string.h).

* make : enable BSD extensions for DragonFlyBSD to expose RLIMIT_MEMLOCK

* make : use BSD-specific FTMs to enable alloca on BSDs

* make : fix OpenBSD build by exposing newer POSIX definitions

* cmake : follow recent FTM improvements from Makefile
2023-09-07 12:36:14 +03:00
Georgi Gerganov
2818de21ff
examples : fix build + compile warnings (close #1256) 2023-09-07 12:33:12 +03:00
Neil Chudleigh
aed5d40607
models : add quantum models to download-ggml-model.sh (#1235)
* Add quantized models to download-ggml-model.sh

* Update names in download-ggml-model script to normalized
2023-09-07 12:16:58 +03:00
Digipom
afa5477d1c
whisper.android : bump gradle plugin and dependencies + a lint pass (#1255) 2023-09-07 12:15:59 +03:00
Nicholas Albion
01fcd42431
sign jar for Maven Central repo 2023-09-07 11:45:44 +10:00
Digipom
f990610776
whisper.android : address ARM's big.LITTLE arch by checking cpu info (#1254)
Addresses https://github.com/ggerganov/whisper.cpp/issues/1248
2023-09-06 18:32:30 +03:00
Didzis Gosko
64cb45fd79
make : fix detection of AVX2 on macOS (#1250) 2023-09-06 18:22:21 +03:00
Przemysław Pawełczyk
ace6c12ec6
ggml : posixify pagesize (#1251)
* ggml : use sysconf(_SC_PAGESIZE) instead of getpagesize() derived from BSD

sed -i 's,getpagesize(),sysconf(_SC_PAGESIZE),g' ggml.c

* metal : use sysconf(_SC_PAGESIZE) instead of getpagesize() derived from BSD

sed -i 's,getpagesize(),sysconf(_SC_PAGESIZE),g' ggml-metal.m
2023-09-06 18:19:36 +03:00
Nicholas Albion
cac75be05b
configured publishing.repositories 2023-09-06 13:13:36 +10:00
Georgi Gerganov
c3f319d7c2
ggml : sync latest llama.cpp (view_src + alloc improvements) (#1247)
* ggml : sync latest llama.cpp (view_src + alloc improvements)

* ggml : fix build
2023-09-05 20:57:27 +03:00
Przemysław Pawełczyk
ba3c333611
make : improve cpuinfo handling on x86 hosts (#1238)
* make : simplify and correct x86 ISA extensions detection on the host

It got broken in commit c5f9acf4b797 for Haiku and Mac OS (Intel),
which report CPU features in upper case.

Now we're finding the names in case-insensitive manner and as words.
SSE3 detection has been corrected for Linux, which uses PNI for that
(Prescott New Instructions).

* make : use dmesg.boot in FreeBSD/DragonFlyBSD to detect x86 ISA extensions on the host

* make : enable x86 ISA extensions on the host both in CFLAGS and CXXFLAGS

* make : correct AVX x86 ISA extension detection on macOS (Intel) host

It got broken in commit c5f9acf4b797.  macOS calls it AVX1.0.
2023-09-05 14:58:47 +03:00
Georgi Gerganov
59a3d0cb57
ggml : sync (ggml-alloc, GPU, eps, etc.) (#1220)
* ggml : sync (ggml-alloc, GPU, eps, etc.)

* ggml : fix build

* wasm : fix build
2023-09-05 13:54:40 +03:00
布客飞龙
6780c98e19
readme : update CMake build commands (#1231)
* Update README.md

* Update README.md: `vcpkg install opencl clblast`

* readme : update build commands

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-05 13:53:34 +03:00
Nicholas Albion
2f52783a08
OSSRH_USERNAME -> JIRA_USER 2023-08-31 14:54:02 +10:00
Nicholas Albion
7dec9d8cc4
build-root-directory: bindings/java 2023-08-31 12:04:16 +10:00
Georgi Gerganov
fb0a24fba2
ci : enable java package publishing (#1228) 2023-08-31 09:57:43 +10:00
ChangSeok Oh
8e30bf3c02
ggml : fix compilation errors incurred by -Werror (#1227)
The -Werror warning option turns all warnings into errors. This PR makes
the compiler happy to build ggml.c and whisper.cpp with the stricter option.
2023-08-30 22:09:15 +03:00
Jhen-Jie Hong
99d3c105f5
whisper.android : fix cmake multiple libraries build (#1224)
* whisper.android : fix multiple libraries build

* fix flags for default target
2023-08-30 14:45:13 +03:00
Dener Stassun
18e9889418
coreml : wrap inference call in @autoreleasepool to fix memory leak (#1218) 2023-08-29 15:44:38 +03:00
Przemysław Pawełczyk
8e46ba80d3
make : use cpuinfo in MSYS2 to enable x86 ISA extensions on the host (#1216) 2023-08-28 13:28:26 +03:00