Mirror of https://github.com/ggerganov/whisper.cpp.git
Synced 2024-12-22 05:57:48 +00:00

Commit 8807fe608b
* lora: load to device buft
* add patch tensor function
* correct tensor patch
* llama_lora_adapter_apply
* correct ggml_backend_tensor_copy
* add llm_build_mm
* fix auto merge
* update based on review comments
* add convert script
* no more transpose A
* add f16 convert
* add metadata check
* add sanity check
* fix ftype
* add requirements
* fix requirements
* fix outfile
* conversion: only allow selected models
* fix types
* cuda : do not use dmmv if the tensor does not have enough cols
* llama : lora fixes
* do not disable mmap with lora
  Co-authored-by: slaren <slarengh@gmail.com>
* llm_build_lora_mm_id
* convert_lora : MoE LoRA conversion support
* convert_lora : prefer safetensors, similarly to convert_hf
* convert_hf : simplify modify_tensors for InternLM2
* convert_lora : lazy conversion
* llama : load and use alpha from LoRA adapters
* llama : use llm_build_lora_mm in most model graphs
* auto scale
* Revert "auto scale"
  This reverts commit 42415a4874e0f963e4aca6796ea5dfb97cd17464.
* remove redundant params
* Apply suggestions from code review
  Co-authored-by: slaren <slarengh@gmail.com>
* change kv metadata
* move add_type to __init__
* convert_hf : move add_type to main()
* convert_lora : use the GGUFWriter from Model instead of overwriting it

---------

Co-authored-by: slaren <slarengh@gmail.com>
Co-authored-by: Francis Couture-Harpin <git@compilade.net>
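The `llm_build_lora_mm` helper referenced in the commit message wraps the base matrix multiply so that LoRA adapters can be applied on top of the frozen weights, scaled by the adapter's alpha. A minimal NumPy sketch of that idea follows; the function name, shapes, and variable names here are illustrative only and do not reflect the actual ggml/llama.cpp API:

```python
import numpy as np

def lora_mm(W, x, A, B, alpha):
    """Sketch of a LoRA-augmented matmul:
    y = W @ x + (alpha / rank) * B @ (A @ x)
    where W is the frozen base weight, and A (rank x n_in),
    B (n_out x rank) form the low-rank adapter."""
    rank = A.shape[0]
    scale = alpha / rank
    return W @ x + scale * (B @ (A @ x))

# Illustrative shapes, not taken from the repo.
rng = np.random.default_rng(0)
n_in, n_out, r = 8, 4, 2
W = rng.standard_normal((n_out, n_in))
A = rng.standard_normal((r, n_in))
B = np.zeros((n_out, r))  # a zero-initialized B leaves the base model unchanged
x = rng.standard_normal(n_in)

y = lora_mm(W, x, A, B, alpha=16.0)
```

With `B` at zero the adapter contributes nothing and the output equals the plain `W @ x`, which is why applying a freshly-initialized adapter is a no-op.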
Directories:
  ggml-cuda/
  ggml-sycl/
  kompute-shaders/
  vulkan-shaders/

Files:
  CMakeLists.txt
  ggml-alloc.c
  ggml-backend-impl.h
  ggml-backend.c
  ggml-blas.cpp
  ggml-common.h
  ggml-cuda.cu
  ggml-impl.h
  ggml-kompute.cpp
  ggml-metal.m
  ggml-metal.metal
  ggml-quants.c
  ggml-quants.h
  ggml-rpc.cpp
  ggml-sycl.cpp
  ggml-vulkan-shaders.hpp
  ggml-vulkan.cpp
  ggml.c
  sgemm.cpp
  sgemm.h