Commit Graph

38 Commits

Author SHA1 Message Date
Ettore Di Giacinto
68fc014c6d
feat(vllm): add support for embeddings (#3440)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-02 21:44:32 +02:00
Ettore Di Giacinto
11d960b2a6
chore(cli): be consistent between workers and expose ExtraLLamaCPPArgs to both (#3428)
* chore(cli): be consistent between workers and expose ExtraLLamaCPPArgs to both

Fixes: https://github.com/mudler/LocalAI/issues/3427

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* bump grpcio

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-30 00:10:17 +02:00
Ettore Di Giacinto
a9c521eb41
fix(deps): bump grpcio (#3362)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-23 10:29:04 +02:00
dependabot[bot]
a7a27a5082
chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/vllm (#3301)
Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.4 to 1.65.5.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md)
- [Commits](https://github.com/grpc/grpc/compare/v1.65.4...v1.65.5)

---
updated-dependencies:
- dependency-name: grpcio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-20 05:39:54 +00:00
Ettore Di Giacinto
2c8623dbb4 fix(python): move vllm to after deps, drop diffusers main deps
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-07 23:34:37 +02:00
Ettore Di Giacinto
66cf38b0b3
feat(venv): shared env (#3195)
* feat(venv): allow to share veenvs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(vllm): add back flash-attn

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-07 19:45:14 +02:00
Ettore Di Giacinto
11b2adae0c
fix(vllm): drop flash-attn installation afterwards
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-08-07 18:08:26 +02:00
Ettore Di Giacinto
61b5602111
fix(python): move accelerate and GPU-specific libs to build-type (#3194)
Some of the dependencies in `requirements.txt`, even if generic, pulls
down the line CUDA libraries.

This changes moves mostly all GPU-specific libs to the build-type, and
tries a safer approach. In `requirements.txt` now are listed only
"first-level" dependencies, for instance, grpc, but libs-dependencies
are moved down to the respective build-type `requirements.txt` to avoid
any mixin.

This should fix #2737 and #1592.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-07 17:02:32 +02:00
dependabot[bot]
e1e221b6e5
chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/vllm (#3152)
Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.1 to 1.65.4.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md)
- [Commits](https://github.com/grpc/grpc/compare/v1.65.1...v1.65.4)

---
updated-dependencies:
- dependency-name: grpcio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-06 03:12:15 +00:00
cryptk
ed322bf59f
fix: ensure correct version of torch is always installed based on BUILD_TYPE(#2890)
* fix: ensure correct version of torch is always installed based on BUILD_TYPE

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* Move causal-conv1d installation to build_types

Signed-off-by: mudler <mudler@localai.io>

* Move mamba-ssd install to build-type requirements.txt

Signed-off-by: mudler <mudler@localai.io>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
Signed-off-by: mudler <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: mudler <mudler@localai.io>
2024-08-05 16:38:33 +00:00
Ettore Di Giacinto
f1e90575f3
Revert "chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/vllm" (#3079)
Revert "chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python…"

This reverts commit 5c747a16c4.
2024-07-30 09:21:45 +02:00
dependabot[bot]
5c747a16c4
chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/vllm (#3061)
chore(deps): Bump setuptools in /backend/python/vllm

Bumps [setuptools](https://github.com/pypa/setuptools) from 70.3.0 to 72.1.0.
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
- [Commits](https://github.com/pypa/setuptools/compare/v70.3.0...v72.1.0)

---
updated-dependencies:
- dependency-name: setuptools
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-30 00:43:12 +00:00
dependabot[bot]
a1bc2e9771
chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/vllm (#2964)
Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.0 to 1.65.1.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md)
- [Commits](https://github.com/grpc/grpc/compare/v1.65.0...v1.65.1)

---
updated-dependencies:
- dependency-name: grpcio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-23 00:08:22 +00:00
cryptk
a3eb6e04c1
fix: update grpcio version to match version used in builds (#2888)
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-07-16 01:39:10 +00:00
dependabot[bot]
2bbbfa849f
chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/vllm (#2839)
Bumps [grpcio](https://github.com/grpc/grpc) from 1.64.0 to 1.64.1.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md)
- [Commits](https://github.com/grpc/grpc/compare/v1.64.0...v1.64.1)

---
updated-dependencies:
- dependency-name: grpcio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-13 16:29:35 +00:00
dependabot[bot]
88aff0bc99
chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/vllm (#2820)
chore(deps): Bump setuptools in /backend/python/vllm

Bumps [setuptools](https://github.com/pypa/setuptools) from 69.5.1 to 70.3.0.
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
- [Commits](https://github.com/pypa/setuptools/compare/v69.5.1...v70.3.0)

---
updated-dependencies:
- dependency-name: setuptools
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-13 06:45:29 +00:00
cryptk
ba984c7097
fix: pin version of setuptools for intel builds to work around #2406 (#2414)
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-05-26 18:27:07 +00:00
cryptk
16433d2e8e
fix: install pytorch from proper index for hipblas builds (#2413)
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-05-26 18:05:52 +00:00
Ettore Di Giacinto
1a3dedece0
dependencies(grpcio): bump to fix CI issues (#2362)
feat(grpcio): bump to fix CI issues

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-21 14:33:47 +02:00
cryptk
86627b27f7
fix: add setuptools to all requirements-intel.txt files for python backends (#2333)
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-05-16 19:15:46 +02:00
cryptk
88942e4761
fix: add missing openvino/optimum/etc libraries for Intel, fixes #2289 (#2292)
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-05-12 09:01:45 +02:00
cryptk
e2de8a88f7
feat: create bash library to handle install/run/test of python backends (#2286)
* feat: create bash library to handle install/run/test of python backends

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* chore: minor cleanup

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: remove incorrect LIMIT_TARGETS from parler-tts

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: update runUnitests to handle running tests from a custom test file

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* chore: document runUnittests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-05-11 18:32:46 +02:00
cryptk
28a421cb1d
feat: migrate python backends from conda to uv (#2215)
* feat: migrate diffusers backend from conda to uv

  - replace conda with UV for diffusers install (prototype for all
    extras backends)
  - add ability to build docker with one/some/all extras backends
    instead of all or nothing

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate autogtpq bark coqui from conda to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: convert exllama over to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate exllama2 to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate mamba to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate parler to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate petals to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: fix tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate rerankers to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate sentencetransformers to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: install uv for tests-linux

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: make sure file exists before installing on intel images

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate transformers backend to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate transformers-musicgen to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate vall-e-x to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate vllm to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add uv install to the rest of test-extra.yml

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: adjust file perms on all install/run/test scripts

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add missing acclerate dependencies

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add some more missing dependencies to python backends

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: parler tests venv py dir fix

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: correct filename for transformers-musicgen tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: adjust the pwd for valle tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: cleanup and optimization work for uv migration

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add setuptools to requirements-install for mamba

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: more size optimization work

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: make installs and tests more consistent, cleanup some deps

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: cleanup

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: mamba backend is cublas only

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: uncomment lines in makefile

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-05-10 15:08:08 +02:00
Taikono-Himazin
03adc1f60d
Add tensor_parallel_size setting to vllm setting items (#2085)
Signed-off-by: Taikono-Himazin <kazu@po.harenet.ne.jp>
2024-04-20 14:37:02 +00:00
cryptk
1981154f49
fix: dont commit generated files to git (#1993)
* fix: initial work towards not committing generated files to the repository

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: improve build docs

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: remove unused folder from .dockerignore and .gitignore

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: attempt to fix extra backend tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: attempt to fix other tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: more test fixes

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: fix apple tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: more extras tests fixes

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add GOBIN to PATH in docker build

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: extra tests and Dockerfile corrections

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: remove build dependency checks

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add golang protobuf compilers to tests-linux action

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: ensure protogen is run for extra backend installs

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: use newer protobuf

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: more missing protoc binaries

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: missing dependencies during docker build

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: don't install grpc compilers in the final stage if they aren't needed

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: python-grpc-tools in 22.04 repos is too old

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add a couple of extra build dependencies to Makefile

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: unbreak container rebuild functionality

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-13 09:37:32 +02:00
Ludovic Leroux
12c0d9443e
feat: use tokenizer.apply_chat_template() in vLLM (#1990)
Use tokenizer.apply_chat_template() in vLLM

Signed-off-by: Ludovic LEROUX <ludovic@inpher.io>
2024-04-11 19:20:22 +02:00
Ettore Di Giacinto
20136ca8b7
feat(tts): add Elevenlabs and OpenAI TTS compatibility layer (#1834)
* feat(elevenlabs): map elevenlabs API support to TTS

This allows elevenlabs Clients to work automatically with LocalAI by
supporting the elevenlabs API.

The elevenlabs server endpoint is implemented such as it is wired to the
TTS endpoints.

Fixes: https://github.com/mudler/LocalAI/issues/1809

* feat(openai/tts): compat layer with openai tts

Fixes: #1276

* fix: adapt tts CLI
2024-03-14 23:08:34 +01:00
Ludovic Leroux
939411300a
Bump vLLM version + more options when loading models in vLLM (#1782)
* Bump vLLM version to 0.3.2

* Add vLLM model loading options

* Remove transformers-exllama

* Fix install exllama
2024-03-01 22:48:53 +01:00
Ludovic Leroux
0135e1e3b9
fix: vllm - use AsyncLLMEngine to allow true streaming mode (#1749)
* fix: use vllm AsyncLLMEngine to bring true stream

Current vLLM implementation uses the LLMEngine, which was designed for offline batch inference, which results in the streaming mode outputing all blobs at once at the end of the inference.

This PR reworks the gRPC server to use asyncio and gRPC.aio, in combination with vLLM's AsyncLLMEngine to bring true stream mode.

This PR also passes more parameters to vLLM during inference (presence_penalty, frequency_penalty, stop, ignore_eos, seed, ...).

* Remove unused import
2024-02-24 11:48:45 +01:00
Ettore Di Giacinto
cb7512734d
transformers: correctly load automodels (#1643)
* backends(transformers): use AutoModel with LLM types

* examples: animagine-xl

* Add codellama examples
2024-01-26 00:13:21 +01:00
Ettore Di Giacinto
06cd9ef98d
feat(extra-backends): Improvements, adding mamba example (#1618)
* feat(extra-backends): Improvements

vllm: add max_tokens, wire up stream event
mamba: fixups, adding examples for mamba-chat

* examples(mamba-chat): add

* docs: update
2024-01-20 17:56:08 +01:00
Ettore Di Giacinto
949da7792d
deps(conda): use transformers-env with vllm,exllama(2) (#1554)
* deps(conda): use transformers with vllm

* join vllm, exllama, exllama2, split petals
2024-01-06 13:32:28 +01:00
Ettore Di Giacinto
b4b21a446b
feat(conda): share envs with transformer-based backends (#1465)
* feat(conda): share env between diffusers and bark

* Detect if env already exists

* share diffusers and petals

* tests: add petals

* Use smaller model for tests with petals

* test only model load on petals

* tests(petals): run only load model tests

* Revert "test only model load on petals"

This reverts commit 111cfa97f1.

* move transformers and sentencetransformers to common env

* Share also transformers-musicgen
2023-12-21 08:35:15 +01:00
Ettore Di Giacinto
7641f92cde
feat(diffusers): update, add autopipeline, controlnet (#1432)
* feat(diffusers): update, add autopipeline, controlenet

* tests with AutoPipeline

* simplify logic
2023-12-13 19:20:22 +01:00
Ettore Di Giacinto
9aa2a7ca13
extras: add vllm,bark,vall-e-x tests, bump diffusers (#1422)
* tests: add vllm

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* tests: Add vall-e-x tests

* Add bark tests

* bump diffusers

---------

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-12-12 00:39:26 +01:00
Dave
8b6e601405
Feat: new backend: transformers-musicgen (#1387)
Transformers-MusicGen
---------

Signed-off-by: Dave <dave@gray101.com>
2023-12-08 10:01:02 +01:00
B4ckslash
2d64d8b444
fix/docs: Python backend dependencies (#1360)
* Update docs for new requirements.txt path

Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>

* Fix typo (.PONY -> .PHONY) in python backend makefiles

Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>

---------

Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>
2023-11-30 17:46:55 +01:00
Ettore Di Giacinto
ad0e30bca5
refactor: move backends into the backends directory (#1279)
* refactor: move backends into the backends directory

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactor: move main close to implementation for every backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-13 22:40:16 +01:00