Commit Graph

23 Commits

Author SHA1 Message Date
Diego Devesa
3daeacad24 ggml : move AMX to the CPU backend (llama/10570)
ggml : automatic selection of best CPU backend (llama/10606)
2024-12-08 20:14:35 +02:00
Diego Devesa
77e3e4a090 ggml : add support for dynamic loading of backends (llama/10469)
* ggml : add support for dynamic loading of backends

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-12-08 20:14:35 +02:00
haopeng
95e8901e71 add cmake rvv support (llama/10411) 2024-12-08 20:14:35 +02:00
Georgi Gerganov
f4c1d7df39 ggml : sync resolve (skip) (#0) 2024-11-20 21:00:08 +02:00
bandoti
339b8e559c Add required ggml-base and backend libs to cmake pkg (llama/10407) 2024-11-20 21:00:08 +02:00
Romain Biessy
8ee767732f sycl : Add option to set the SYCL architecture for all targets (llama/10266)
* Add option to set the SYCL architecture for all targets
* Convert GGML_SYCL_HIP_TARGET to the more generic GGML_SYCL_ARCH option
* Document that setting GGML_SYCL_ARCH can improve the performance
2024-11-20 21:00:08 +02:00
Johannes Gäßler
dcb2922d1d CUDA: remove DMMV, consolidate F16 mult mat vec (llama/10318) 2024-11-20 21:00:08 +02:00
Charles Xu
3298916e5e backend cpu: add online flow for aarch64 Q4_0 GEMV/GEMM kernels (llama/9921)
* backend-cpu: add online flow for aarch64 Q4_0 GEMV/GEMM kernels

---------

Co-authored-by: Diego Devesa <slarengh@gmail.com>
2024-11-20 21:00:08 +02:00
Diego Devesa
746bf2596f ggml : build backends as libraries (llama/10256)
* ggml : build backends as libraries

---------

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: R0CKSTAR <xiaodong.ye@mthreads.com>
2024-11-20 21:00:08 +02:00
Georgi Gerganov
9f67aab211 metal : opt-in compile flag for BF16 (llama/10218)
* metal : opt-in compile flag for BF16

ggml-ci

* ci : use BF16

ggml-ci

* swift : switch back to v12

* metal : has_float -> use_float

ggml-ci

* metal : fix BF16 check in MSL

ggml-ci
2024-11-15 15:21:04 +02:00
Diego Devesa
1550be79f1 ggml : add ggml-cpu.h to the public headers (llama/10204) 2024-11-15 15:21:04 +02:00
Ma Mingfei
e1936eb2a5 add amx kernel for gemm (llama/8998)
add intel amx isa detection

add vnni kernel for gemv cases

add vnni and amx kernel support for block_q8_0

code cleanup

fix packing B issue

enable openmp

fine tune amx kernel

switch to aten parallel pattern

add error message for nested parallelism

code cleanup

add f16 support in ggml-amx

add amx kernels for QK_K quant formats: Q4_K, Q5_K, Q6_K and IQ4_XS

update CMakeList

update README

fix some compilation warning

fix compiler warning when amx is not enabled

minor change

ggml-ci

move ggml_amx_init from ggml.c to ggml-amx/mmq.cpp

ggml-ci

update CMakeLists with -mamx-tile, -mamx-int8 and -mamx-bf16

ggml-ci

add amx as an ggml-backend

update header file, the old path for immintrin.h has changed to ggml-cpu-impl.h

minor change

update CMakeLists.txt

minor change

apply weight prepacking in set_tensor method in ggml-backend

fix compile error

ggml-ci

minor change

ggml-ci

update CMakeLists.txt

ggml-ci

add march dependency

minor change

ggml-ci

change ggml_backend_buffer_is_host to return false for amx backend

ggml-ci

fix supports_op

use device reg for AMX backend

ggml-ci

minor change

ggml-ci

minor change

fix rebase

set .buffer_from_host_ptr to be false for AMX backend
2024-11-01 10:19:05 +02:00
Georgi Gerganov
1fd78999e8 cmake : do not hide GGML options + rename option (llama/9465)
* cmake : do not hide GGML options

ggml-ci

* build : rename flag GGML_CUDA_USE_GRAPHS -> GGML_CUDA_GRAPHS

for consistency

ggml-ci
2024-09-24 19:45:08 +03:00
Markus Tavenrath
9e715e1b96 Improve Vulkan shader build system (llama/9239)
* Improve Vulkan shader builds system

- Add dependency to vulkan-shaders-gen to rebuild shaders when changing the shader compilation utility.
- Add option to generate debug info for Vulkan shaders to provide shader source to Vulkan shader profiling tools

* remove not required self dependency
2024-09-24 19:45:08 +03:00
Georgi Gerganov
82b5c56f63 sync : vulkan (skip) (llama/0) 2024-08-28 13:22:20 +03:00
Georgi Gerganov
5a62fdb735 cmake : remove unused option GGML_CURL (llama/9011) 2024-08-28 13:22:20 +03:00
wangshuai09
19708df884 cann: update cmake (llama/8765) 2024-08-08 22:48:46 +03:00
R0CKSTAR
e471adcfa5 feat: Support Moore Threads GPU (llama/8383)
* Update doc for MUSA

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* Add GGML_MUSA in Makefile

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* Add GGML_MUSA in CMake

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* CUDA => MUSA

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* MUSA adds support for __vsubss4

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* Fix CI build failure

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

---------

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2024-08-08 22:48:46 +03:00
65a
fb6a835938 cmake : install all ggml public headers (llama/8480)
Co-authored-by: 65a <65a@65a.invalid>
2024-08-08 22:48:46 +03:00
Georgi Gerganov
db0ea7a2f2 ggml : move sgemm sources to llamafile subfolder (llama/8394)
ggml-ci
2024-08-08 22:48:46 +03:00
Borislav Stanimirov
ab8ec9e940 cmake : only enable GGML_NATIVE and x86 flags if not crosscompiling (ggml/885) 2024-08-08 22:48:46 +03:00
slaren
04e7fa6f4f
ggml : add GGML_CUDA_USE_GRAPHS option, restore GGML_CUDA_FORCE_CUBLAS (cmake) (llama/8140) 2024-06-26 23:18:11 +03:00
Georgi Gerganov
e30c679928
whisper : reorganize source code + improve CMake (#2256)
* scripts : update sync [no ci]

* files : reorganize [no ci]

* sync : llama.cpp

* cmake : link math library

* cmake : build normal ggml library

* files : move headers to include

* objc : fix path to ggml-metal.h

* ci : fix WHISPER_CUDA -> GGML_CUDA

* scripts : sync LICENSE [no ci]
2024-06-26 19:34:09 +03:00