Commit Graph

1290 Commits

Author SHA1 Message Date
74de25158e talk-llama : add an up-to-date demo video 2023-11-02 15:28:48 +02:00
bce49a260e examples : Implement JSON output for Token-Level data in main (#1358) 2023-10-31 19:54:52 +00:00
45c87b5481 models : Faster download for models on windows using BitTransfer (#1404) 2023-10-30 19:18:12 +00:00
dfe4bc6e59 README : Update README in stream to clarify where to compile from (Issue #1400)
* Clarify doc about where to compile from

* Update examples/stream/README.md

* Update examples/stream/README.md

* Update README.md

---------

Co-authored-by: AI @ Home <>
Co-authored-by: bobqianic <129547291+bobqianic@users.noreply.github.com>
2023-10-29 17:11:13 +00:00
54c978c3a3 binding : Expose the audio_ctx param through the Go binding (#1368)
* expose the audio_ctx param through the go binding

* expose the audio_ctx param to the go binding context
2023-10-15 13:35:06 +01:00
9a7074d4aa README : fix typo (#1362) 2023-10-13 16:53:23 +01:00
a0040f5d12 docker : Add dockerfile for cublas (#1286)
* Create Dockerfile

* Rename Dockerfile to cublas.Dockerfile

* Rename cublas.Dockerfile to .devops/cublas.Dockerfile

---------

Co-authored-by: bobqianic <129547291+bobqianic@users.noreply.github.com>
2023-10-11 11:00:17 +01:00
940cdb1396 whisper : abort callback improvements (#1345)
* whisper : initialize abort_callback to null

* whisper : add example how to use abort_callback
2023-10-08 17:22:24 +03:00
1b775cdd68 cmake : Abort the build if a requested feature could not be configured (#1350) 2023-10-07 20:01:18 +01:00
80bf931668 cmake : Prefer pkg-config while looking for BLAS (#1349) 2023-10-07 15:02:07 +01:00
91c0b23384 models : add conversion scripts from HuggingFace models to CoreML (#1304) 2023-10-04 12:00:25 +03:00
2f668c330e whisper : add abort callback (#1335) 2023-10-04 11:57:55 +03:00
08fa34882f examples : move wav_writer from stream.cpp to common.h (#1317)
* Allocate class on the stack instead of on the heap

* Add class wav_writer

* fix some minor issues

* fix some minor issues

* remove potential misleading API
2023-10-03 22:56:11 +03:00
4037705531 whisper : add missing speaker turn API function for whisper_state (#1330) 2023-10-03 22:55:48 +03:00
c76c11e59c examples: Update the README for Talk - fixing the gpt2 URL (#1334) 2023-10-01 04:21:32 +08:00
9edbd0a204 extra: Add benchmark script implemented in Python (#1298)
* Create bench.py

* Various benchmark results

* Update benchmark script with hardware name, and file checks

* Remove old benchmark results

* Add git shorthash

* Round to 2 digits on calculated floats

* Fix the header reference when sorting results

* FIx order of models

* Parse file name

* Simplify filecheck

* Improve print run print statement

* Use simplified model name

* Update benchmark_results.csv

* Process single or lists of processors and threads

* Ignore benchmark results, dont check in

* Move bench.py to extra folder

* Readme section on how to use

* Move command to correct location

* Use separate list for models that exist

* Handle subprocess error in git short hash check

* Fix filtered models list initialization
2023-09-25 23:45:15 +08:00
707507ff6d Examples: Add save audio to file option in stream.cpp (#1310)
* save the recorded audio to a file

* Alignment -help

* Save the correct audio

* chage to a consistent coding style

* Correct typo

* Update examples/stream/stream.cpp

* Update examples/stream/stream.cpp

* Correct variable misuse

* Update examples/stream/stream.cpp

* Update examples/stream/stream.cpp

* Update examples/stream/stream.cpp

* Update examples/stream/stream.cpp

---------

Co-authored-by: bobqianic <129547291+bobqianic@users.noreply.github.com>
2023-09-22 23:43:21 +08:00
JJ
7e1592d2cd readme: Fix spelling error (#1290)
Fixed branding error:  Javascript to JavaScript
2023-09-21 15:55:33 +08:00
903c9579b8 examples: Update README.md of main.cpp (#1306) 2023-09-18 22:14:36 +08:00
b440ef8c96 binding : fix ruby build by adding missing ggml-alloc (#1305) 2023-09-18 21:15:45 +08:00
700f63a806 bench: fix missing include <cstring> (#1303) 2023-09-18 15:51:10 +08:00
951a119926 whisper : increase tokenizer buffer (close #1259) 2023-09-15 21:11:43 +03:00
1ca4041b86 talk-llama : update to latest llama.cpp 2023-09-15 20:06:31 +03:00
80c1512fd5 sync : ggml (const correctness) 2023-09-15 14:49:56 +03:00
0ac9cefd03 metal : restore matrix x vector f16_f32 kerenls for now 2023-09-15 14:40:41 +03:00
b8432f28f4 metal : add F32 support + update bench output 2023-09-15 13:56:08 +03:00
93935980f8 whisper : Metal and ggml-alloc support (#1270)
* metal : init

* whisper : factor out graph builds

* whisper : allocate encoder and decoder using ggml-alloc

* whisper : ggml-alloc is now supported

* whisper : CoreML support ggml-alloc

* build : fix ggml-alloc

* ios : update submodule

* extra : update sync-ggml.sh script to also sync ggml-alloc

* ci : see if this is causing the crash

* whisper : refactor ggml-alloc init

* whisper.android : try to fix build

* whisper : initial Metal version

* ci : try to debug vmem issue

* metal : decoder works on GPU!

* metal : add multi-decoder support

* ggml : fix ggml_nbytes (probably temp solution)

* metal : run "cross" step on the GPU

* whisper : remove ggml_repeat in the encoder

* whisper : offload the Encoder to Metal

* ggml : use simpler ggml_bytes() implementation

* ggml-alloc : try to make CI happy by reducing vram to 128GB

* whisper : add whisper_allocr to wrap ggml_allocr

* whisper : factor out alloc init in a function

* cmake : update to support Metal build

* whisper : add <functional> header

* objc : fix build (no Metal yet)

* ios : add Metal support

* swiftui : fix build

* metal : speed-up KQ multiplication

* metal : sync latest llama.cpp kernels

* readme : add Metal info

* ios : update submodule

* coreml : add code to toggle Core ML config (CPU, ANE, GPU)

* bench : fix timings by running a pre-heat

* bench : start benching the decoder

* whisper : add ggml_mul_mat_pad

* bench : fix uninitialized vars

* whisper : add comment for disabling mul-mat padding

* whisper : add description of ggml_mul_mat_pad

* whisper : clean-up ggml_mul_mat_pad

* metal : remove the "concurrent" flag

* bench : variable n_past

* ios : update SPM package
2023-09-15 12:18:18 +03:00
3fec2119e6 whisper : fix bench regression + fix performance when using CPU BLAS (#1275)
* whisper : fix bench regression

* ggml : use sched_yield when using BLAS + add comment
2023-09-12 13:54:04 +03:00
9b14418863 whisper : faster beam_search sampling via reduced KV cache copies (#1243)
* Faster `beam_search` sampling

Refine the KV cache update logic for more intelligent and efficient updating.

* Faster `whisper_sample_token_topk`

* Update whisper.cpp

* Update whisper.cpp

* Update whisper.cpp

* Reduce `memory allocation`

* Add `pointer swapping`

* Fixed some bugs

* Update whisper.cpp

* Apply suggestions from code review

* Updated the logic for determining `two-copy`

* Updated the logic for determining `two-copy` v2

* whisper : add debug logs + coding style

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-10 16:04:27 +03:00
6ddc727fac java : fixed signing of java artifact using gradle (#1267)
* --stacktrace signMavenJavaPublication

* added temporary step "Debug gradle signing"

* cd bindings/java

* use GPG_PRIVATE_KEY and GPG_PASSPHRASE

* use secrets.GPG_PRIVATE_KEY and GPG_PASSPHRASE
2023-09-09 18:55:51 +03:00
acb5278cc8 ci : try to fix gradle action (#1265) 2023-09-08 20:50:15 +03:00
0839209cab gitignore : update 2023-09-08 19:45:28 +03:00
b39809668a sync : ggml (HBM + Metal + style) (#1264) 2023-09-08 17:58:31 +03:00
3e9edc6845 ci : upgrade gradle to 2.4.2 (#1263)
* ci : upgrade gradle to 2.4.2

* cmake : add comment (#1129)
2023-09-08 17:58:14 +03:00
bfc73f1fa2 sync : ggml (CUDA faster rope) 2023-09-08 15:01:26 +03:00
f00c9bba33 cmake : noramlize case (#1129) 2023-09-08 14:50:03 +03:00
b55b505690 build : do not use _GNU_SOURCE gratuitously (#1129)
* Do not use _GNU_SOURCE gratuitously.

What is needed to build whisper.cpp and examples is availability of
stuff defined in The Open Group Base Specifications Issue 6
(https://pubs.opengroup.org/onlinepubs/009695399/) known also as
Single Unix Specification v3 (SUSv3) or POSIX.1-2001 + XSI extensions,
plus some stuff from BSD that is not specified in POSIX.1.

Well, that was true until NUMA support was added recently in ggml,
so enable GNU libc extensions for Linux builds to cover that.

There is no need to penalize musl libc which simply follows standards.

Not having feature test macros in source code gives greater flexibility
to those wanting to reuse it in 3rd party app, as they can build it with
minimal FTM (_XOPEN_SOURCE=600) or other FTM depending on their needs.

It builds without issues in Alpine (musl libc), Ubuntu (glibc), MSYS2.

* examples : include SDL headers before other headers

Avoid macOS build error when _DARWIN_C_SOURCE is not defined, brought by
SDL2 relying on Darwin extension memset_pattern4/8/16 (from string.h).

* make : enable BSD extensions for DragonFlyBSD to expose RLIMIT_MEMLOCK

* make : use BSD-specific FTMs to enable alloca on BSDs

* make : fix OpenBSD build by exposing newer POSIX definitions

* cmake : follow recent FTM improvements from Makefile
2023-09-07 12:36:14 +03:00
2818de21ff examples : fix build + compile warnings (close #1256) 2023-09-07 12:33:12 +03:00
aed5d40607 models : add quantum models to download-ggml-model.sh (#1235)
* Add quantized models to download-ggml-model.sh

* Update names in download-ggml-model script to normalized
2023-09-07 12:16:58 +03:00
afa5477d1c whisper.android : bump gradle plugin and dependencies + a lint pass (#1255) 2023-09-07 12:15:59 +03:00
01fcd42431 sign jar for Maven Central repo 2023-09-07 11:45:44 +10:00
f990610776 whisper.android : address ARM's big.LITTLE arch by checking cpu info (#1254)
Addresses https://github.com/ggerganov/whisper.cpp/issues/1248
2023-09-06 18:32:30 +03:00
64cb45fd79 make : fix detection of AVX2 on macOS (#1250) 2023-09-06 18:22:21 +03:00
ace6c12ec6 ggml : posixify pagesize (#1251)
* ggml : use sysconf(_SC_PAGESIZE) instead of getpagesize() derived from BSD

sed -i 's,getpagesize(),sysconf(_SC_PAGESIZE),g' ggml.c

* metal : use sysconf(_SC_PAGESIZE) instead of getpagesize() derived from BSD

sed -i 's,getpagesize(),sysconf(_SC_PAGESIZE),g' ggml-metal.m
2023-09-06 18:19:36 +03:00
cac75be05b configured publishing.repositories 2023-09-06 13:13:36 +10:00
c3f319d7c2 ggml : sync latest llama.cpp (view_src + alloc improvements) (#1247)
* ggml : sync latest llama.cpp (view_src + alloc improvements)

* ggml : fix build
2023-09-05 20:57:27 +03:00
ba3c333611 make : improve cpuinfo handling on x86 hosts (#1238)
* make : simplify and correct x86 ISA extensions detection on the host

It got broken in commit c5f9acf4b7 for Haiku and Mac OS (Intel),
which report CPU features in upper case.

Now we're finding the names in case-insensitive manner and as words.
SSE3 detection has been corrected for Linux, which uses PNI for that
(Prescott New Instructions).

* make : use dmesg.boot in FreeBSD/DragonFlyBSD to detect x86 ISA extensions on the host

* make : enable x86 ISA extensions on the host both in CFLAGS and CXXFLAGS

* make : correct AVX x86 ISA extension detection on macOS (Intel) host

It got broken in commit c5f9acf4b7.  macOS calls it AVX1.0.
2023-09-05 14:58:47 +03:00
59a3d0cb57 ggml : sync (ggml-alloc, GPU, eps, etc.) (#1220)
* ggml : sync (ggml-alloc, GPU, eps, etc.)

* ggml : fix build

* wasm : fix build
2023-09-05 13:54:40 +03:00
6780c98e19 readme : update CMake build commands (#1231)
* Update README.md

* Update README.md: `vcpkg install opencl clblast`

* readme : update build commands

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-05 13:53:34 +03:00
2f52783a08 OSSRH_USERNAME -> JIRA_USER 2023-08-31 14:54:02 +10:00