LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2024-12-23 06:22:23 +00:00

Author	SHA1	Message	Date
fakezeta	e7cbe32601	feat: Openvino runtime for transformer backend and streaming support for Openvino and CUDA (#1892) * fixes #1775 and #1774 Add BitsAndBytes Quantization and fixes embedding on CUDA devices * Manage 4bit and 8 bit quantization Manage different BitsAndBytes options with the quantization: parameter in yaml * fix compilation errors on non CUDA environment * OpenVINO draft First draft of OpenVINO integration in transformer backend * first working implementation * Streaming working * Small fix for regression on CUDA and XPU * use pip version of optimum[openvino] * Update backend/python/transformers/transformers_server.py Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> --------- Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-03-26 23:31:43 +00:00
Ettore Di Giacinto	607586e0b7	fix: downgrade torch (#1902) Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-03-26 22:56:02 +01:00
Sebastian.W	b7ffe66219	Enhance autogptq backend to support VL models (#1860) * Enhance autogptq backend to support VL models * update dependencies for autogptq * remove redundant auto-gptq dependency * Convert base64 to image_url for Qwen-VL model * implemented model inference for qwen-vl * remove user prompt from generated answer * fixed write image error --------- Co-authored-by: Binghua Wu <bingwu@estee.com>	2024-03-26 18:48:14 +01:00
TwinFin	504f2e8bf4	Update Backend Dependancies (#1797) * Update transformers.yml Signed-off-by: TwinFin <57421631+TwinFinz@users.noreply.github.com> * Update transformers-rocm.yml Signed-off-by: TwinFin <57421631+TwinFinz@users.noreply.github.com> * Update transformers-nvidia.yml Signed-off-by: TwinFin <57421631+TwinFinz@users.noreply.github.com> --------- Signed-off-by: TwinFin <57421631+TwinFinz@users.noreply.github.com>	2024-03-05 10:10:00 +00:00
Ludovic Leroux	939411300a	Bump vLLM version + more options when loading models in vLLM (#1782) * Bump vLLM version to 0.3.2 * Add vLLM model loading options * Remove transformers-exllama * Fix install exllama	2024-03-01 22:48:53 +01:00
Ettore Di Giacinto	62a02cd1fe	deps(conda): use transformers environment with autogptq (#1555)	2024-01-06 15:30:53 +01:00
Ettore Di Giacinto	949da7792d	deps(conda): use transformers-env with vllm,exllama(2) (#1554) * deps(conda): use transformers with vllm * join vllm, exllama, exllama2, split petals	2024-01-06 13:32:28 +01:00
Ettore Di Giacinto	95eb72bfd3	feat: add 🐸 coqui (#1489) * feat: add coqui * docs: update news	2023-12-24 19:38:54 +01:00
Ettore Di Giacinto	939187a129	env(conda): use transformers for vall-e-x (#1481)	2023-12-23 14:31:34 -05:00
Ettore Di Giacinto	b4b21a446b	feat(conda): share envs with transformer-based backends (#1465) * feat(conda): share env between diffusers and bark * Detect if env already exists * share diffusers and petals * tests: add petals * Use smaller model for tests with petals * test only model load on petals * tests(petals): run only load model tests * Revert "test only model load on petals" This reverts commit `111cfa97f1`. * move transformers and sentencetransformers to common env * Share also transformers-musicgen	2023-12-21 08:35:15 +01:00

10 Commits