whisper.cpp/ggml/src
Xuan Son Nguyen 8807fe608b Refactor lora adapter support (llama/8332)
* lora: load to device buft

* add patch tensor function

* correct tensor patch

* llama_lora_adapter_apply

* correct ggml_backend_tensor_copy

* add llm_build_mm

* fix auto merge

* update based on review comments

* add convert script

* no more transpose A

* add f16 convert

* add metadata check

* add sanity check

* fix ftype

* add requirements

* fix requirements

* fix outfile

* conversion: only allow selected models

* fix types

* cuda : do not use dmmv if the tensor does not have enough cols

* llama : lora fixes

* do not disable mmap with lora

Co-authored-by: slaren <slarengh@gmail.com>

* llm_build_lora_mm_id

* convert_lora : MoE LoRA conversion support

* convert_lora : prefer safetensors, similarly to convert_hf

* convert_hf : simplify modify_tensors for InternLM2

* convert_lora : lazy conversion

* llama : load and use alpha from LoRA adapters

* llama : use llm_build_lora_mm in most model graphs

* auto scale

* Revert "auto scale"

This reverts commit 42415a4874e0f963e4aca6796ea5dfb97cd17464.

* remove redundant params

* Apply suggestions from code review

Co-authored-by: slaren <slarengh@gmail.com>

* change kv metadata

* move add_type to __init__

* convert_hf : move add_type to main()

* convert_lora : use the GGUFWriter from Model instead of overwriting it

---------

Co-authored-by: slaren <slarengh@gmail.com>
Co-authored-by: Francis Couture-Harpin <git@compilade.net>
2024-08-08 22:48:46 +03:00
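The bullet points above track a refactor of how LoRA adapters are applied at matmul time (llm_build_lora_mm, per-adapter alpha, user scale). Conceptually, a LoRA-aware matmul adds a scaled low-rank correction to the base product: y = W·x + (alpha/rank)·B·(A·x). A minimal plain-Python sketch of that arithmetic — the function and argument names here are illustrative, not the actual llama.cpp API, which builds the equivalent ggml graph nodes instead:

```python
def matmul(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def lora_mm(W, x, A, B, alpha, rank, scale=1.0):
    """Sketch of a LoRA-aware matmul (names hypothetical).

    W:    base weight, n x k
    A, B: low-rank adapter factors, rank x k and n x rank
    y = W @ x + scale * (alpha / rank) * B @ (A @ x)
    """
    base = matmul(W, x)
    # low-rank update: B @ (A @ x), scaled by alpha/rank and the user scale
    delta = matmul(B, matmul(A, x))
    s = scale * alpha / rank
    return [b + s * d for b, d in zip(base, delta)]
```

The "auto scale" commit (later reverted) and the "load and use alpha" commit both concern the alpha/rank factor above: storing alpha in the adapter's GGUF metadata lets the runtime apply it without the user hand-tuning the scale.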
ggml-cuda               | cuda : suppress 'noreturn' warn in no_device_code (llama/8414)       | 2024-08-08 22:48:46 +03:00
ggml-sycl               | add concat through dim 1/2 (llama/8483)                              | 2024-08-08 22:48:46 +03:00
kompute-shaders         | whisper : reorganize source code + improve CMake (#2256)             | 2024-06-26 19:34:09 +03:00
vulkan-shaders          | whisper : reorganize source code + improve CMake (#2256)             | 2024-06-26 19:34:09 +03:00
CMakeLists.txt          | vulkan : cmake integration (llama/8119)                              | 2024-08-08 22:48:46 +03:00
ggml-alloc.c            | whisper : reorganize source code + improve CMake (#2256)             | 2024-06-26 19:34:09 +03:00
ggml-backend-impl.h     | whisper : reorganize source code + improve CMake (#2256)             | 2024-06-26 19:34:09 +03:00
ggml-backend.c          | fix the mul_mat_id ut issues (llama/8427)                            | 2024-08-08 22:48:46 +03:00
ggml-blas.cpp           | ggml : add NVPL BLAS support (ggml/8329) (llama/8425)                | 2024-08-08 22:48:46 +03:00
ggml-common.h           | ggml : add AArch64 optimized GEMV and GEMM Q4 kernels (llama/5780)   | 2024-08-08 22:48:46 +03:00
ggml-cuda.cu            | Refactor lora adapter support (llama/8332)                           | 2024-08-08 22:48:46 +03:00
ggml-impl.h             | ggml : add AArch64 optimized GEMV and GEMM Q4 kernels (llama/5780)   | 2024-08-08 22:48:46 +03:00
ggml-kompute.cpp        | whisper : reorganize source code + improve CMake (#2256)             | 2024-06-26 19:34:09 +03:00
ggml-metal.m            | metal : template-ify some of the kernels (llama/8447)                | 2024-08-08 22:48:46 +03:00
ggml-metal.metal        | metal : template-ify some of the kernels (llama/8447)                | 2024-08-08 22:48:46 +03:00
ggml-quants.c           | ggml : minor naming changes (llama/8433)                             | 2024-08-08 22:48:46 +03:00
ggml-quants.h           | ggml : minor naming changes (llama/8433)                             | 2024-08-08 22:48:46 +03:00
ggml-rpc.cpp            | whisper : reorganize source code + improve CMake (#2256)             | 2024-06-26 19:34:09 +03:00
ggml-sycl.cpp           | add concat through dim 1/2 (llama/8483)                              | 2024-08-08 22:48:46 +03:00
ggml-vulkan-shaders.hpp | whisper : reorganize source code + improve CMake (#2256)             | 2024-06-26 19:34:09 +03:00
ggml-vulkan.cpp         | Vulkan MMQ Fix (llama/8479)                                          | 2024-08-08 22:48:46 +03:00
ggml.c                  | Refactor lora adapter support (llama/8332)                           | 2024-08-08 22:48:46 +03:00
sgemm.cpp               | whisper : reorganize source code + improve CMake (#2256)             | 2024-06-26 19:34:09 +03:00
sgemm.h                 | whisper : reorganize source code + improve CMake (#2256)             | 2024-06-26 19:34:09 +03:00