* Add option to set the SYCL architecture for all targets
* Convert GGML_SYCL_HIP_TARGET to the more generic GGML_SYCL_ARCH option
* Document that setting GGML_SYCL_ARCH can improve the performance
* metal : opt-in compile flag for BF16
ggml-ci
* ci : use BF16
ggml-ci
* swift : switch back to v12
* metal : has_float -> use_float
ggml-ci
* metal : fix BF16 check in MSL
ggml-ci
add intel amx isa detection
add vnni kernel for gemv cases
add vnni and amx kernel support for block_q8_0
code cleanup
fix packing B issue
enable openmp
fine tune amx kernel
switch to aten parallel pattern
add error message for nested parallelism
code cleanup
add f16 support in ggml-amx
add amx kernels for QK_K quant formats: Q4_K, Q5_K, Q6_K and IQ4_XS
update CMakeList
update README
fix some compilation warning
fix compiler warning when amx is not enabled
minor change
ggml-ci
move ggml_amx_init from ggml.c to ggml-amx/mmq.cpp
ggml-ci
update CMakeLists with -mamx-tile, -mamx-int8 and -mamx-bf16
ggml-ci
add amx as an ggml-backend
update header file, the old path for immintrin.h has changed to ggml-cpu-impl.h
minor change
update CMakeLists.txt
minor change
apply weight prepacking in set_tensor method in ggml-backend
fix compile error
ggml-ci
minor change
ggml-ci
update CMakeLists.txt
ggml-ci
add march dependency
minor change
ggml-ci
change ggml_backend_buffer_is_host to return false for amx backend
ggml-ci
fix supports_op
use device reg for AMX backend
ggml-ci
minor change
ggml-ci
minor change
fix rebase
set .buffer_from_host_ptr to be false for AMX backend
* Improve Vulkan shader builds system
- Add dependency to vulkan-shaders-gen to rebuild shaders when changing the shader compilation utility.
- Add option to generate debug info for Vulkan shaders to provide shader source to Vulkan shader profiling tools
* remove not required self dependency
* Update doc for MUSA
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Add GGML_MUSA in Makefile
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Add GGML_MUSA in CMake
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* CUDA => MUSA
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* MUSA adds support for __vsubss4
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* Fix CI build failure
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>