LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2024-12-28 16:38:51 +00:00

Author	SHA1	Message	Date
fakezeta	6ef78ef7f6	bugfix: CUDA acceleration not working (#2475 ) * bugfix: CUDA acceleration not working CUDA not working after #2286. Refactored the code to be more polish * Update requirements.txt Missing imports Signed-off-by: fakezeta <fakezeta@gmail.com> * Update requirements.txt Signed-off-by: fakezeta <fakezeta@gmail.com> --------- Signed-off-by: fakezeta <fakezeta@gmail.com>	2024-06-03 22:41:42 +02:00
fakezeta	4a239a4bff	feat(transformers): various enhancements to the transformers backend (#2468 ) update transformers Handle Temperature = 0 as greedy search Handle custom works as stop words Implement KV cache Phi 3 no more requires trust_remote_code: true	2024-06-03 08:52:55 +02:00
cryptk	ba984c7097	fix: pin version of setuptools for intel builds to work around #2406 (#2414 ) Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-05-26 18:27:07 +00:00
cryptk	16433d2e8e	fix: install pytorch from proper index for hipblas builds (#2413 ) Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-05-26 18:05:52 +00:00
Ettore Di Giacinto	1a3dedece0	dependencies(grpcio): bump to fix CI issues (#2362 ) feat(grpcio): bump to fix CI issues Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-21 14:33:47 +02:00
fakezeta	5b79bd04a7	add setuptools for openvino (#2301 )	2024-05-12 19:31:43 +00:00
cryptk	88942e4761	fix: add missing openvino/optimum/etc libraries for Intel, fixes #2289 (#2292 ) Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-05-12 09:01:45 +02:00
cryptk	e2de8a88f7	feat: create bash library to handle install/run/test of python backends (#2286 ) * feat: create bash library to handle install/run/test of python backends Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * chore: minor cleanup Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: remove incorrect LIMIT_TARGETS from parler-tts Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: update runUnitests to handle running tests from a custom test file Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * chore: document runUnittests Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> --------- Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-05-11 18:32:46 +02:00
cryptk	28a421cb1d	feat: migrate python backends from conda to uv (#2215 ) * feat: migrate diffusers backend from conda to uv - replace conda with UV for diffusers install (prototype for all extras backends) - add ability to build docker with one/some/all extras backends instead of all or nothing Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate autogtpq bark coqui from conda to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: convert exllama over to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate exllama2 to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate mamba to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate parler to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate petals to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: fix tests Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate rerankers to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate sentencetransformers to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: install uv for tests-linux Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: make sure file exists before installing on intel images Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate transformers backend to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate transformers-musicgen to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate vall-e-x to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate vllm to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add uv install to the rest of test-extra.yml Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: adjust file perms on all install/run/test scripts Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add missing acclerate dependencies Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add some more missing dependencies to python backends Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: parler tests venv py dir fix Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: correct filename for transformers-musicgen tests Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: adjust the pwd for valle tests Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: cleanup and optimization work for uv migration Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add setuptools to requirements-install for mamba Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: more size optimization work Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: make installs and tests more consistent, cleanup some deps Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: cleanup Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: mamba backend is cublas only Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: uncomment lines in makefile Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> --------- Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-05-10 15:08:08 +02:00
fakezeta	fea9522982	fix: OpenVINO winograd always disabled (#2252 ) Winograd convolutions were always disabled giving error when inference device was CPU. This commit implement logic to disable Winograd convolutions only if CPU or NPU are declared.	2024-05-07 08:38:58 +02:00
fakezeta	4690b534e0	feat: user defined inference device for CUDA and OpenVINO (#2212 ) user defined inference device configuration via main_gpu parameter	2024-05-02 09:54:29 +02:00
fakezeta	e38610e521	feat: OpenVINO acceleration for embeddings in transformer backend (#2190 ) OpenVINO acceleration for embeddings New argument type: OVModelForFeatureExtraction	2024-04-30 10:13:04 +02:00
fakezeta	c9451cb604	Bump oneapi-basekit, optimum and openvino (#2139 ) * Bump oneapi-basekit, optimum and openvino * Changed PERFORMANCE HINT to CUMULATIVE_THROUGHPUT Minor latency change for first token but about 10-15% speedup on token generation.	2024-04-26 16:20:43 +02:00
fakezeta	f718a391c0	fix missing TrustRemoteCode in OpenVINO model load (#2114 )	2024-04-24 00:45:37 +00:00
fakezeta	8e36fe9b6f	Transformers Backend: max_tokens adherence to OpenAI API (#2108 ) max token adherence to OpenAI API improve adherence to OpenAI API when max tokens is omitted or equal to 0 in the request	2024-04-23 18:42:17 +02:00
fakezeta	66b002458d	Transformer Backend: Implementing use_tokenizer_template and stop_prompts options (#2090 ) * fix regression #1971 fixes regression #1971 introduced by intel_extension_for_transformers==1.4 * UseTokenizerTemplate and StopPrompt Implementation of use_tokenizer_template and stopwords options	2024-04-21 16:20:25 +00:00
cryptk	1981154f49	fix: dont commit generated files to git (#1993 ) * fix: initial work towards not committing generated files to the repository Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: improve build docs Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: remove unused folder from .dockerignore and .gitignore Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: attempt to fix extra backend tests Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: attempt to fix other tests Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: more test fixes Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: fix apple tests Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: more extras tests fixes Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add GOBIN to PATH in docker build Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: extra tests and Dockerfile corrections Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: remove build dependency checks Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add golang protobuf compilers to tests-linux action Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: ensure protogen is run for extra backend installs Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: use newer protobuf Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: more missing protoc binaries Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: missing dependencies during docker build Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: don't install grpc compilers in the final stage if they aren't needed Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: python-grpc-tools in 22.04 repos is too old Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add a couple of extra build dependencies to Makefile Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: unbreak container rebuild functionality Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> --------- Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-04-13 09:37:32 +02:00
Ludovic Leroux	12c0d9443e	feat: use tokenizer.apply_chat_template() in vLLM (#1990 ) Use tokenizer.apply_chat_template() in vLLM Signed-off-by: Ludovic LEROUX <ludovic@inpher.io>	2024-04-11 19:20:22 +02:00
fakezeta	a38618db02	fix regression #1971 (#1972 ) fixes regression #1971 introduced by intel_extension_for_transformers==1.4	2024-04-08 22:33:51 +02:00
fakezeta	8210ffcb6c	feat: Token Stream support for Transformer, fix: missing package for OpenVINO (#1908 ) * Streaming working * Small fix for regression on CUDA and XPU * use pip version of optimum[openvino] * Update backend/python/transformers/transformers_server.py Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> * Token streaming support fix optimum[openvino] package in install.sh * Token Streaming support --------- Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-03-27 17:50:35 +01:00
fakezeta	e7cbe32601	feat: Openvino runtime for transformer backend and streaming support for Openvino and CUDA (#1892 ) * fixes #1775 and #1774 Add BitsAndBytes Quantization and fixes embedding on CUDA devices * Manage 4bit and 8 bit quantization Manage different BitsAndBytes options with the quantization: parameter in yaml * fix compilation errors on non CUDA environment * OpenVINO draft First draft of OpenVINO integration in transformer backend * first working implementation * Streaming working * Small fix for regression on CUDA and XPU * use pip version of optimum[openvino] * Update backend/python/transformers/transformers_server.py Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> --------- Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-03-26 23:31:43 +00:00
Ettore Di Giacinto	20136ca8b7	feat(tts): add Elevenlabs and OpenAI TTS compatibility layer (#1834 ) * feat(elevenlabs): map elevenlabs API support to TTS This allows elevenlabs Clients to work automatically with LocalAI by supporting the elevenlabs API. The elevenlabs server endpoint is implemented such as it is wired to the TTS endpoints. Fixes: https://github.com/mudler/LocalAI/issues/1809 * feat(openai/tts): compat layer with openai tts Fixes: #1276 * fix: adapt tts CLI	2024-03-14 23:08:34 +01:00
fakezeta	3882130911	feat: Add Bitsandbytes quantization for transformer backend enhancement #1775 and fix: Transformer backend error on CUDA #1774 (#1823 ) * fixes #1775 and #1774 Add BitsAndBytes Quantization and fixes embedding on CUDA devices * Manage 4bit and 8 bit quantization Manage different BitsAndBytes options with the quantization: parameter in yaml * fix compilation errors on non CUDA environment	2024-03-14 23:06:30 +01:00
Ettore Di Giacinto	5d1018495f	feat(intel): add diffusers/transformers support (#1746 ) * feat(intel): add diffusers support * try to consume upstream container image * Debug * Manually install deps * Map transformers/hf cache dir to modelpath if not specified * fix(compel): update initialization, pass by all gRPC options * fix: add dependencies, implement transformers for xpu * base it from the oneapi image * Add pillow * set threads if specified when launching the API * Skip conda install if intel * defaults to non-intel * ci: add to pipelines * prepare compel only if enabled * Skip conda install if intel * fix cleanup * Disable compel by default * Install torch 2.1.0 with Intel * Skip conda on some setups * Detect python * Quiet output * Do not override system python with conda * Prefer python3 * Fixups * exllama2: do not install without conda (overrides pytorch version) * exllama/exllama2: do not install if not using cuda * Add missing dataset dependency * Small fixups, symlink to python, add requirements * Add neural_speed to the deps * correctly handle model offloading * fix: device_map == xpu * go back at calling python, fixed at dockerfile level * Exllama2 restricted to only nvidia gpus * Tokenizer to xpu	2024-03-07 14:37:45 +01:00
Dave	5c69dd155f	feat(autogpt/transformers): consume `trust_remote_code` (#1799 ) trusting remote code by default is a danger to our users	2024-03-05 19:47:15 +01:00
Ludovic Leroux	939411300a	Bump vLLM version + more options when loading models in vLLM (#1782 ) * Bump vLLM version to 0.3.2 * Add vLLM model loading options * Remove transformers-exllama * Fix install exllama	2024-03-01 22:48:53 +01:00
Ettore Di Giacinto	cb7512734d	transformers: correctly load automodels (#1643 ) * backends(transformers): use AutoModel with LLM types * examples: animagine-xl * Add codellama examples	2024-01-26 00:13:21 +01:00
Ettore Di Giacinto	5e335eaead	feat(transformers): support also text generation (#1630 ) * feat(transformers): support also text generation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * embedded: set seed -1 --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-01-23 23:07:31 +01:00
Ettore Di Giacinto	b4b21a446b	feat(conda): share envs with transformer-based backends (#1465 ) * feat(conda): share env between diffusers and bark * Detect if env already exists * share diffusers and petals * tests: add petals * Use smaller model for tests with petals * test only model load on petals * tests(petals): run only load model tests * Revert "test only model load on petals" This reverts commit `111cfa97f1`. * move transformers and sentencetransformers to common env * Share also transformers-musicgen	2023-12-21 08:35:15 +01:00
Ettore Di Giacinto	7641f92cde	feat(diffusers): update, add autopipeline, controlnet (#1432 ) * feat(diffusers): update, add autopipeline, controlenet * tests with AutoPipeline * simplify logic	2023-12-13 19:20:22 +01:00
Ettore Di Giacinto	48e5380e45	tests: add diffusers tests (#1419 )	2023-12-11 08:20:34 +01:00
Ettore Di Giacinto	887b3dff04	feat: cuda transformers (#1401 ) * Use cuda in transformers if available tensorflow probably needs a different check. Signed-off-by: Erich Schubert <kno10@users.noreply.github.com> * feat: expose CUDA at top level Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * tests: add to tests and create workflow for py extra backends * doc: update note on how to use core images --------- Signed-off-by: Erich Schubert <kno10@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Erich Schubert <kno10@users.noreply.github.com>	2023-12-08 15:45:04 +01:00
Dave	8b6e601405	Feat: new backend: transformers-musicgen (#1387 ) Transformers-MusicGen --------- Signed-off-by: Dave <dave@gray101.com>	2023-12-08 10:01:02 +01:00
B4ckslash	2d64d8b444	fix/docs: Python backend dependencies (#1360 ) * Update docs for new requirements.txt path Signed-off-by: Marcus Köhler <khler.marcus@gmail.com> * Fix typo (.PONY -> .PHONY) in python backend makefiles Signed-off-by: Marcus Köhler <khler.marcus@gmail.com> --------- Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>	2023-11-30 17:46:55 +01:00
Ettore Di Giacinto	c75bdd99e4	fix: rename transformers.py to avoid circular import (#1337 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-11-26 08:49:43 +01:00
B4ckslash	9dddd1134d	fix: move python header comments below shebang in some backends (#1321 ) * Fix python header comments for some extra gRPC backends When a Python script is to be executed directly via exec(3), either the platform knows how to execute the file itself (i.e. special configuration is necessary) or the first line contains a shebang (#!) specifying the interpreter to run it (similar to shell scripts). The shebang MUST be on the first line for the script to work on all platforms, so any header comments need to be in the lines following it. Otherwise executing these scripts as extra backends will yield an "exec format error" message. Changes: * Move introductory comments below the shebang line * Change header comment in transformers.py to refer to the correct python module Signed-off-by: Marcus Köhler <khler.marcus@gmail.com> * Make header comment in ttsbark.py more specific Signed-off-by: Marcus Köhler <khler.marcus@gmail.com> --------- Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>	2023-11-23 15:22:37 +01:00
Ettore Di Giacinto	92cbc4d516	feat(transformers): add embeddings with Automodel (#1308 ) * Update huggingface.py Switch SentenceTransformer for AutoModel in order to set trust_remote_code needed to use the encode method with embeddings models like jinai-v2 Signed-off-by: Lucas Hänke de Cansino <lhc@next-boss.eu> * feat(transformers): split in separate backend Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Lucas Hänke de Cansino <lhc@next-boss.eu> Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Lucas Hänke de Cansino <lhc@next-boss.eu>	2023-11-20 21:21:17 +01:00

37 Commits