Commit Graph

786 Commits

Author | SHA1 | Message | Date
Georgi Gerganov
a2f3b82db3
whisper : free backend instances in whisper_state 2023-11-12 14:31:51 +02:00
Georgi Gerganov
76c8b5235b
whisper : fix multi-state Metal 2023-11-12 14:24:45 +02:00
Georgi Gerganov
d029784fb0
whisper : try to fix the parallel whisper_state functionality 2023-11-11 18:37:14 +02:00
Georgi Gerganov
40c66036b6
whisper : fix UB with measure buffers 2023-11-11 18:35:23 +02:00
Georgi Gerganov
fc8565d0e2
whisper : fixes 2023-11-11 17:39:30 +02:00
Georgi Gerganov
b618229340
whisper : factor out graph compute in common function 2023-11-11 17:06:21 +02:00
Georgi Gerganov
b27726da93
whisper : add note that ggml_mul_mat_pad does not work with CUDA 2023-11-11 13:04:58 +02:00
Georgi Gerganov
0867e696a7
whisper : avoid whisper_model_data wrapper 2023-11-11 11:46:54 +02:00
Georgi Gerganov
66bb2e9401
ggml : im2col opts 2023-11-11 10:41:00 +02:00
Georgi Gerganov
3bfc43e3e3
quantize-all : fix 2023-11-10 23:33:40 +02:00
Georgi Gerganov
f53e1388f5
whisper : clean-up 2023-11-10 22:31:44 +02:00
Georgi Gerganov
933c5bef97
whisper : support ggml_conv with CUDA and Metal (#1473)
* ggml : add CUDA support for ggml_conv

* whisper : remove ggml_repeat for conv bias + single backend

* cuda : fix im2col kernel

* metal : add im2col support + mul mat-vec f16 x f16

* bench-all : add q4 models
2023-11-10 22:26:50 +02:00
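For background on the im2col kernels referenced in #1473: im2col copies every convolution window into a row of a temporary matrix, so the convolution reduces to a single matrix multiplication, which is the form the CUDA and Metal mat-mul paths consume. Below is a standalone 1-D illustration of the idea in plain C; it is not the ggml kernel itself, and the sizes and values are made up for the example.

```c
#include <stdio.h>

// Unroll a 1-D signal into an (n_out x k) matrix: row i holds the window
// that the i-th output element of a stride-s convolution would see.
static void im2col_1d(const float * x, int n, int k, int s, float * cols /* n_out*k */) {
    const int n_out = (n - k) / s + 1;
    for (int i = 0; i < n_out; i++) {
        for (int j = 0; j < k; j++) {
            cols[i*k + j] = x[i*s + j];
        }
    }
}

int main(void) {
    const float x[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    const float w[3] = {0.25f, 0.5f, 0.25f};      // convolution kernel
    const int   k = 3, s = 1, n_out = (8 - k)/s + 1;

    float cols[6*3];
    im2col_1d(x, 8, k, s, cols);

    // The convolution is now a plain matrix-vector product: cols (n_out x k) * w (k).
    for (int i = 0; i < n_out; i++) {
        float acc = 0.0f;
        for (int j = 0; j < k; j++) acc += cols[i*k + j] * w[j];
        printf("y[%d] = %g\n", i, acc);
    }
    return 0;
}
```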
Georgi Gerganov
c99e290a7f
talk : fix compile warning 2023-11-10 13:54:02 +02:00
Georgi Gerganov
728e1785f0
Merge branch 'master' into ggml-backend-no-sched 2023-11-10 13:51:31 +02:00
Ben Nortier
ec7a6f04f9
whisper : return with error from whisper_encode_internal and whisper_decode_internal when abort callback is true (#1456)
Co-authored-by: Ben Nortier <ben@bjnortier.com>
2023-11-10 13:51:16 +02:00
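The change above (#1456) makes the encoder and decoder return an error as soon as the abort callback reports true, rather than only stopping new work. A minimal sketch of wiring such a callback through whisper_full_params, assuming the abort_callback / abort_callback_user_data fields declared in whisper.h; the shared flag and the transcribe() helper are illustrative only:

```c
#include <stdbool.h>
#include "whisper.h"

// Shared flag that some other thread (UI, signal handler, etc.) can set.
static volatile bool g_should_abort = false;

// Returning true asks whisper.cpp to stop; with this change the encoder and
// decoder also propagate the abort as an error to the caller of whisper_full().
static bool abort_cb(void * user_data) {
    (void) user_data;
    return g_should_abort;
}

static int transcribe(struct whisper_context * ctx, const float * pcm, int n_samples) {
    struct whisper_full_params params = whisper_full_default_params(WHISPER_SAMPLING_GREEDY);
    params.abort_callback           = abort_cb;   // assumed field names, see whisper.h
    params.abort_callback_user_data = NULL;

    // A non-zero return now also covers the "aborted" case.
    return whisper_full(ctx, params, pcm, n_samples);
}
```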
Georgi Gerganov
d6dad64fbf
make : clean-up 2023-11-10 13:45:07 +02:00
Georgi Gerganov
a54d8c9dec
whisper : fix CoreML 2023-11-10 13:24:06 +02:00
Georgi Gerganov
0ab5025316
Merge branch 'master' into ggml-backend-no-sched 2023-11-10 13:21:47 +02:00
Georgi Gerganov
3f5c1b7ee0
whisper : print when CUDA is enabled 2023-11-10 13:17:02 +02:00
Georgi Gerganov
12030358ee
whisper : free backends + fix compile warning 2023-11-10 12:45:26 +02:00
Georgi Gerganov
dcf9511dbb
whisper : fix beam-search with CUDA 2023-11-10 12:41:11 +02:00
Georgi Gerganov
3dfbe64911
whisper : fix tensor allocation during load 2023-11-10 11:51:55 +02:00
Georgi Gerganov
7e01486b61
whisper : fix logit reading 2023-11-10 11:02:29 +02:00
Georgi Gerganov
659757329d
whisper : migrate to ggml-backend 2023-11-10 10:54:06 +02:00
Jakub Ráček
37947203e6
talk-llama : add language auto detect (#1467)
* Add '-l auto' to talk-llama example

* Update examples/talk-llama/talk-llama.cpp

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-09 19:21:44 +02:00
bobqianic
953419c69a
openvino : update convert-whisper-to-openvino.py to support v3 (#1459) 2023-11-09 12:42:39 +02:00
Xiao-Yong Jin
0de8582f65
coreml : use the correct n_mel value (#1458) 2023-11-08 20:01:41 +00:00
Ben Nortier
baeb733691
whisper : reset mel time when resetting timings (#1452)
Co-authored-by: Ben Nortier <ben@bjnortier.com>
2023-11-08 15:52:23 +02:00
Sindre Sorhus
d03c60dd7f
ios : add support for Swift Package Manager (#1370)
* Add support for Swift

* Make it build in Xcode

* Use the SPM package in the SwiftUI example app
2023-11-07 23:53:31 +02:00
Georgi Gerganov
6a5d195109
release : v1.4.3 2023-11-07 16:15:48 +02:00
Georgi Gerganov
0cbef75422
ggml : fix MIN / MAX macro re-definition 2023-11-07 16:08:46 +02:00
Georgi Gerganov
2cdfc4e025
whisper : add support for large v3 (#1444)
* whisper : add support for large v3

* bench : fix build + fix go bindings

* bench : fix n_mels

* models : update readme
2023-11-07 15:30:18 +02:00
Tobrun
973111088b
android : decouple example into a library and app module (#1445) 2023-11-07 14:27:33 +02:00
Ben Nortier
11b503055e
whisper : reset ctx->t_start_us when calling whisper_reset_timings() (#1434)
Co-authored-by: Ben Nortier <ben@bjnortier.com>
2023-11-07 11:04:32 +02:00
Georgi Gerganov
b629d2d4fe
cmake : fix talk-llama build 2023-11-07 11:03:21 +02:00
Georgi Gerganov
3bd7d48f51
metal : fix asserts for setThreadgroupMemoryLength (close #1435) 2023-11-07 11:02:16 +02:00
iamthad
435a6b74e3
ci : fix variable names in GitHub actions config (#1440)
* Remove _SUPPORT from variables

* Change blasdir to OPENBLAS_PATH

* Update OpenBLAS URLs
2023-11-07 10:53:24 +02:00
Jhen-Jie Hong
75dc800d21
talk-llama : fix n_gpu_layers usage again (#1442) 2023-11-07 10:51:27 +02:00
Georgi Gerganov
0c91aef2d8
whisper : add missing abort callback initializers 2023-11-07 10:49:51 +02:00
Jhen-Jie Hong
3989b29a9b
examples : fix n_gpu_layers usage in talk-llama (#1441) 2023-11-07 01:36:23 +00:00
Jhen-Jie Hong
0463028bc2
whisper : add context param to disable gpu (#1293)
* whisper : check state->ctx_metal not null

* whisper : add whisper_context_params { use_gpu }

* whisper : new API with params & deprecate old API

* examples : use no-gpu param && whisper_init_from_file_with_params

* whisper.objc : enable metal & disable on simulator

* whisper.swiftui, metal : enable metal & support load default.metallib

* whisper.android : use new API

* bindings : use new API

* addon.node : fix build & test

* bindings : update java binding

* bindings : add missing whisper_context_default_params_by_ref WHISPER_API for java

* metal : use SWIFTPM_MODULE_BUNDLE for GGML_SWIFT and reuse library load

* metal : move bundle var into block

* metal : use SWIFT_PACKAGE instead of GGML_SWIFT

* style : minor updates

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-06 11:04:24 +02:00
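The commit above (#1293) introduces whisper_context_params and the *_with_params init functions while deprecating the old entry points. A minimal sketch of the new API as the message describes it; the model path used here is purely for illustration:

```c
#include <stdbool.h>
#include <stdio.h>
#include "whisper.h"

int main(void) {
    // New-style init: start from the defaults and opt out of the GPU backend.
    struct whisper_context_params cparams = whisper_context_default_params();
    cparams.use_gpu = false;   // run CPU-only, skipping Metal/CUDA

    // Replaces the now-deprecated whisper_init_from_file().
    struct whisper_context * ctx =
        whisper_init_from_file_with_params("models/ggml-base.en.bin", cparams);
    if (ctx == NULL) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // ... run whisper_full() as before ...

    whisper_free(ctx);
    return 0;
}
```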
Georgi Gerganov
39cfad0dee
whisper : add support for new distilled Whisper models (#1424)
* whisper : add support for new distilled Whisper models

* whisper : print log when using distilled models
2023-11-05 19:43:45 +02:00
Georgi Gerganov
6d4d0b5b4b
cuda : fix HIPBLAS build 2023-11-05 19:41:15 +02:00
Georgi Gerganov
f96e1c5b78
sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) (#1422)
* sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.)

* metal : allow env metal variable to override resource path (#1415)

* Allow env variable to override resource path

* Update ggml-metal.m

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* sync : restore common / main from `master`

* sync : restore whisper from `master`

* talk-llama : update to latest llama.cpp

* ruby : fix build

* ggml : fix 32-bit ARM build

* ggml : fix MIN / MAX macro collisions + update ios bindings

* ggml : fix ifdefs and MIN / MAX again

* examples : fix Obj-C and Swift examples

* ggml : fix 32-bit ARM compatibility

* ggml : one more attempt to fix 32-bit ARM compat

* whisper : fix support for larger graphs

---------

Co-authored-by: Chris Raethke <codesoda@users.noreply.github.com>
2023-11-03 21:35:05 +02:00
bobqianic
8a2bee6717
models : use absolute paths for the converted model (#1356) 2023-11-03 10:44:27 +02:00
Asad Memon
d445098c8f
talk-llama : move up-to-date demo to top (#1417) 2023-11-02 18:50:13 +02:00
Georgi Gerganov
74de25158e
talk-llama : add an up-to-date demo video 2023-11-02 15:28:48 +02:00
Aarni Koskela
bce49a260e
examples : Implement JSON output for Token-Level data in main (#1358) 2023-10-31 19:54:52 +00:00
WhiteOlivierus
45c87b5481
models : Faster download for models on windows using BitTransfer (#1404) 2023-10-30 19:18:12 +00:00
ai-at-home
dfe4bc6e59
README : Update README in stream to clarify where to compile from (Issue #1400)
* Clarify doc about where to compile from

* Update examples/stream/README.md

* Update examples/stream/README.md

* Update README.md

---------

Co-authored-by: AI @ Home <>
Co-authored-by: bobqianic <129547291+bobqianic@users.noreply.github.com>
2023-10-29 17:11:13 +00:00