7ee93a8b5c
chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/common/template ( #2802 )
...
chore(deps): Bump grpcio in /backend/python/common/template
Bumps [grpcio](https://github.com/grpc/grpc ) from 1.64.0 to 1.64.1.
- [Release notes](https://github.com/grpc/grpc/releases )
- [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md )
- [Commits](https://github.com/grpc/grpc/compare/v1.64.0...v1.64.1 )
---
updated-dependencies:
- dependency-name: grpcio
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-13 04:06:56 +00:00
6b59f79364
chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/exllama2 ( #2809 )
...
chore(deps): Bump grpcio in /backend/python/exllama2
Bumps [grpcio](https://github.com/grpc/grpc ) from 1.64.0 to 1.64.1.
- [Release notes](https://github.com/grpc/grpc/releases )
- [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md )
- [Commits](https://github.com/grpc/grpc/compare/v1.64.0...v1.64.1 )
---
updated-dependencies:
- dependency-name: grpcio
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-13 03:46:28 +00:00
ffad7890fe
chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/diffusers ( #2807 )
...
chore(deps): Bump setuptools in /backend/python/diffusers
Bumps [setuptools](https://github.com/pypa/setuptools ) from 69.5.1 to 70.3.0.
- [Release notes](https://github.com/pypa/setuptools/releases )
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst )
- [Commits](https://github.com/pypa/setuptools/compare/v69.5.1...v70.3.0 )
---
updated-dependencies:
- dependency-name: setuptools
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-13 03:26:48 +00:00
d08a963d1c
chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/bark ( #2805 )
...
chore(deps): Bump setuptools in /backend/python/bark
Bumps [setuptools](https://github.com/pypa/setuptools ) from 69.5.1 to 70.3.0.
- [Release notes](https://github.com/pypa/setuptools/releases )
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst )
- [Commits](https://github.com/pypa/setuptools/compare/v69.5.1...v70.3.0 )
---
updated-dependencies:
- dependency-name: setuptools
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-13 02:54:42 +00:00
b3b8010930
chore(deps): Bump causal-conv1d from 1.2.0.post2 to 1.4.0 in /backend/python/mamba ( #2792 )
...
chore(deps): Bump causal-conv1d in /backend/python/mamba
Bumps [causal-conv1d](https://github.com/Dao-AILab/causal-conv1d ) from 1.2.0.post2 to 1.4.0.
- [Release notes](https://github.com/Dao-AILab/causal-conv1d/releases )
- [Commits](https://github.com/Dao-AILab/causal-conv1d/compare/v1.2.0.post2...v1.4.0 )
---
updated-dependencies:
- dependency-name: causal-conv1d
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-13 01:34:59 +00:00
30861f49a8
chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/petals ( #2799 )
...
chore(deps): Bump setuptools in /backend/python/petals
Bumps [setuptools](https://github.com/pypa/setuptools ) from 69.5.1 to 70.3.0.
- [Release notes](https://github.com/pypa/setuptools/releases )
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst )
- [Commits](https://github.com/pypa/setuptools/compare/v69.5.1...v70.3.0 )
---
updated-dependencies:
- dependency-name: setuptools
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-13 01:20:56 +00:00
5345f30a33
chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/parler-tts ( #2797 )
...
chore(deps): Bump setuptools in /backend/python/parler-tts
Bumps [setuptools](https://github.com/pypa/setuptools ) from 69.5.1 to 70.3.0.
- [Release notes](https://github.com/pypa/setuptools/releases )
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst )
- [Commits](https://github.com/pypa/setuptools/compare/v69.5.1...v70.3.0 )
---
updated-dependencies:
- dependency-name: setuptools
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-13 01:12:19 +00:00
de2bf82e09
chore(deps): Bump inflect from 7.0.0 to 7.3.1 in /backend/python/openvoice ( #2796 )
...
chore(deps): Bump inflect in /backend/python/openvoice
Bumps [inflect](https://github.com/jaraco/inflect ) from 7.0.0 to 7.3.1.
- [Release notes](https://github.com/jaraco/inflect/releases )
- [Changelog](https://github.com/jaraco/inflect/blob/main/NEWS.rst )
- [Commits](https://github.com/jaraco/inflect/compare/v7.0.0...v7.3.1 )
---
updated-dependencies:
- dependency-name: inflect
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-13 01:08:38 +00:00
67b20a7147
chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/coqui ( #2798 )
...
chore(deps): Bump setuptools in /backend/python/coqui
Bumps [setuptools](https://github.com/pypa/setuptools ) from 69.5.1 to 70.3.0.
- [Release notes](https://github.com/pypa/setuptools/releases )
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst )
- [Commits](https://github.com/pypa/setuptools/compare/v69.5.1...v70.3.0 )
---
updated-dependencies:
- dependency-name: setuptools
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-13 00:43:54 +00:00
fc87507012
chore(deps): Update Dependencies ( #2538 )
...
* chore(deps): Update dependencies
Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com >
* chore(deps): Upgrade github.com/imdario/mergo to dario.cat/mergo
Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com >
* remove version identifiers for MeloTTS
Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com >
---------
Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com >
Signed-off-by: Dave <dave@gray101.com >
Co-authored-by: Dave <dave@gray101.com >
2024-07-12 19:54:08 +00:00
17608ea6aa
Using exec when starting a backend instead of spawning a new process ( #2720 )
...
Co-authored-by: Simon Siebert <ansiebert@deloitte.de >
2024-07-05 16:59:18 +00:00
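A minimal sketch of the idea behind the commit above, assuming a POSIX system: `exec` replaces the current process image instead of creating a child, so the started program keeps the same PID. The child script here is purely illustrative and is not the project's launcher code.

```python
import subprocess
import sys

# Illustrative script: print the PID, then replace the process image via exec
# and print the PID again from the newly exec'd interpreter.
CHILD = r'''
import os, sys
print(os.getpid(), flush=True)
os.execv(sys.executable, [sys.executable, "-c", "import os; print(os.getpid())"])
'''


def pids_before_and_after_exec():
    """Run CHILD in a subprocess and return the PIDs printed before and after exec."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    before, after = result.stdout.split()
    return int(before), int(after)


if __name__ == "__main__":
    before, after = pids_before_and_after_exec()
    print("same process:", before == after)
```

Because no extra process sits between the caller and the exec'd program, signals sent to that PID reach the program directly.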
ecbb61cbf4
feat(sd-3): add stablediffusion 3 support ( #2591 )
...
* feat(sd-3): add stablediffusion 3 support
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* deps(diffusers): add sentencepiece
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* models(gallery): add stablediffusion-3
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-06-18 15:09:39 +02:00
6ef78ef7f6
bugfix: CUDA acceleration not working ( #2475 )
...
* bugfix: CUDA acceleration not working
CUDA not working after #2286 .
Refactored the code to be more polished
* Update requirements.txt
Missing imports
Signed-off-by: fakezeta <fakezeta@gmail.com >
* Update requirements.txt
Signed-off-by: fakezeta <fakezeta@gmail.com >
---------
Signed-off-by: fakezeta <fakezeta@gmail.com >
2024-06-03 22:41:42 +02:00
4a239a4bff
feat(transformers): various enhancements to the transformers backend ( #2468 )
...
update transformers
*Handle Temperature = 0 as greedy search
*Handle custom words as stop words
*Implement KV cache
*Phi 3 no longer requires trust_remote_code: true
2024-06-03 08:52:55 +02:00
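One way "Temperature = 0 as greedy search" could look, as a sketch only: the helper name `build_generation_kwargs` and its defaults are hypothetical, while `do_sample`, `temperature`, `top_p`, and `max_new_tokens` are standard Hugging Face `generate()` arguments.

```python
def build_generation_kwargs(temperature, top_p=1.0, max_new_tokens=128):
    """Build keyword arguments for a Hugging Face-style generate() call.

    Hypothetical helper: a temperature of 0 (or unset) selects greedy decoding
    rather than being passed through as a sampling temperature.
    """
    kwargs = {"max_new_tokens": max_new_tokens}
    if temperature is None or temperature <= 0:
        # Greedy search: disable sampling and omit the sampling parameters.
        kwargs["do_sample"] = False
    else:
        kwargs["do_sample"] = True
        kwargs["temperature"] = temperature
        kwargs["top_p"] = top_p
    return kwargs


if __name__ == "__main__":
    print(build_generation_kwargs(0))
    print(build_generation_kwargs(0.7, top_p=0.9))
```

Leaving `temperature` out of the greedy case avoids handing the sampler a zero temperature, which would otherwise mean dividing the logits by zero.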
b99182c8d4
TTS API improvements ( #2308 )
...
* update doc on COQUI_LANGUAGE env variable
Signed-off-by: blob42 <contact@blob42.xyz >
* return errors from tts gRPC backend
Signed-off-by: blob42 <contact@blob42.xyz >
* handle speaker_id and language in coqui TTS backend
Signed-off-by: blob42 <contact@blob42.xyz >
* TTS endpoint: add optional language parameter
Signed-off-by: blob42 <contact@blob42.xyz >
* tts fix: empty language string breaks non-multilingual models
Signed-off-by: blob42 <contact@blob42.xyz >
* allow tts param definition in config file
- consolidate TTS options under `tts` config entry
Signed-off-by: blob42 <contact@blob42.xyz >
* tts: update doc
Signed-off-by: blob42 <contact@blob42.xyz >
---------
Signed-off-by: blob42 <contact@blob42.xyz >
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-06-01 18:26:27 +00:00
ba984c7097
fix: pin version of setuptools for intel builds to work around #2406 ( #2414 )
...
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
2024-05-26 18:27:07 +00:00
16433d2e8e
fix: install pytorch from proper index for hipblas builds ( #2413 )
...
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
2024-05-26 18:05:52 +00:00
1a3dedece0
dependencies(grpcio): bump to fix CI issues ( #2362 )
...
feat(grpcio): bump to fix CI issues
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-21 14:33:47 +02:00
8ad669339e
add openvoice backend ( #2334 )
...
Wip openvoice
2024-05-19 16:27:08 +02:00
86627b27f7
fix: add setuptools to all requirements-intel.txt files for python backends ( #2333 )
...
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
2024-05-16 19:15:46 +02:00
5b79bd04a7
add setuptools for openvino ( #2301 )
2024-05-12 19:31:43 +00:00
88942e4761
fix: add missing openvino/optimum/etc libraries for Intel, fixes #2289 ( #2292 )
...
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
2024-05-12 09:01:45 +02:00
e2de8a88f7
feat: create bash library to handle install/run/test of python backends ( #2286 )
...
* feat: create bash library to handle install/run/test of python backends
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* chore: minor cleanup
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: remove incorrect LIMIT_TARGETS from parler-tts
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: update runUnittests to handle running tests from a custom test file
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* chore: document runUnittests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
---------
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
2024-05-11 18:32:46 +02:00
28a421cb1d
feat: migrate python backends from conda to uv ( #2215 )
...
* feat: migrate diffusers backend from conda to uv
- replace conda with UV for diffusers install (prototype for all
extras backends)
- add ability to build docker with one/some/all extras backends
instead of all or nothing
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate autogptq bark coqui from conda to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: convert exllama over to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate exllama2 to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate mamba to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate parler to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate petals to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: fix tests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate rerankers to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate sentencetransformers to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: install uv for tests-linux
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: make sure file exists before installing on intel images
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate transformers backend to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate transformers-musicgen to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate vall-e-x to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate vllm to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: add uv install to the rest of test-extra.yml
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: adjust file perms on all install/run/test scripts
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: add missing accelerate dependencies
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: add some more missing dependencies to python backends
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: parler tests venv py dir fix
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: correct filename for transformers-musicgen tests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: adjust the pwd for valle tests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: cleanup and optimization work for uv migration
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: add setuptools to requirements-install for mamba
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: more size optimization work
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: make installs and tests more consistent, cleanup some deps
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: cleanup
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: mamba backend is cublas only
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: uncomment lines in makefile
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
---------
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
2024-05-10 15:08:08 +02:00
fea9522982
fix: OpenVINO winograd always disabled ( #2252 )
...
Winograd convolutions were always disabled, giving an error when the inference device was CPU.
This commit implements logic to disable Winograd convolutions only if CPU or NPU are declared.
2024-05-07 08:38:58 +02:00
4690b534e0
feat: user defined inference device for CUDA and OpenVINO ( #2212 )
...
user defined inference device
configuration via main_gpu parameter
2024-05-02 09:54:29 +02:00
f7aabf1b50
fix: bring everything onto the same GRPC version to fix tests ( #2199 )
...
fix: more places where we are installing grpc that need a version specified
fix: attempt to fix metal tests
fix: metal/brew is forcing an update, they don't have 1.58 available anymore
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
2024-04-30 19:12:15 +00:00
e38610e521
feat: OpenVINO acceleration for embeddings in transformer backend ( #2190 )
...
OpenVINO acceleration for embeddings
New argument type: OVModelForFeatureExtraction
2024-04-30 10:13:04 +02:00
b7ea9602f5
fix: undefined symbol: iJIT_NotifyEvent in import torch #2153 ( #2179 )
...
* add extra index to Intel repository
* Update install.sh
2024-04-29 15:11:09 +02:00
c9451cb604
Bump oneapi-basekit, optimum and openvino ( #2139 )
...
* Bump oneapi-basekit, optimum and openvino
* Changed PERFORMANCE HINT to CUMULATIVE_THROUGHPUT
Minor latency change for first token but about 10-15% speedup on token generation.
2024-04-26 16:20:43 +02:00
b664edde29
feat(rerankers): Add new backend, support jina rerankers API ( #2121 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-04-25 00:19:02 +02:00
2fb34b00b5
Incl ocv pkg for diffusers utils ( #2115 )
...
* Update diffusers.yml
Signed-off-by: jtwolfe <jamie.t.wolfe@gmail.com >
* Update diffusers-rocm.yml
Signed-off-by: jtwolfe <jamie.t.wolfe@gmail.com >
---------
Signed-off-by: jtwolfe <jamie.t.wolfe@gmail.com >
2024-04-24 09:17:49 +02:00
f718a391c0
fix missing TrustRemoteCode in OpenVINO model load ( #2114 )
2024-04-24 00:45:37 +00:00
8e36fe9b6f
Transformers Backend: max_tokens adherence to OpenAI API ( #2108 )
...
max token adherence to OpenAI API
improve adherence to OpenAI API when max tokens is omitted or equal to 0 in the request
2024-04-23 18:42:17 +02:00
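The commit does not spell out the resulting behaviour, so the following is only a sketch under an assumption: an omitted or zero `max_tokens` is treated as "no explicit limit" and falls back to the remaining context window. The function name and parameters are hypothetical.

```python
def resolve_max_new_tokens(requested, context_size, prompt_tokens):
    """Pick the number of new tokens to generate for a request.

    Assumption (not confirmed by the commit): a missing or zero max_tokens
    means the client set no limit, so the remaining context window is used.
    """
    remaining = max(context_size - prompt_tokens, 0)
    if not requested:  # covers both None and 0
        return remaining
    # Never promise more tokens than the context window can hold.
    return min(requested, remaining)


if __name__ == "__main__":
    print(resolve_max_new_tokens(None, 2048, 100))
    print(resolve_max_new_tokens(50, 2048, 100))
```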
66b002458d
Transformer Backend: Implementing use_tokenizer_template and stop_prompts options ( #2090 )
...
* fix regression #1971
fixes regression #1971 introduced by intel_extension_for_transformers==1.4
* UseTokenizerTemplate and StopPrompt
Implementation of use_tokenizer_template and stopwords options
2024-04-21 16:20:25 +00:00
03adc1f60d
Add tensor_parallel_size setting to vllm setting items ( #2085 )
...
Signed-off-by: Taikono-Himazin <kazu@po.harenet.ne.jp >
2024-04-20 14:37:02 +00:00
0fdff26924
feat(parler-tts): Add new backend ( #2027 )
...
* feat(parler-tts): Add new backend
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(parler-tts): try downgrade protobuf
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(parler-tts): add parler conda env
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Revert "feat(parler-tts): try downgrade protobuf"
This reverts commit bd5941d5cfc00676b45a99f71debf3c34249cf3c.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* deps: add grpc
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix: try to gen proto with same environment
* workaround
* Revert "fix: try to gen proto with same environment"
This reverts commit 998c745e2f.
* Workaround fixup
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
Co-authored-by: Dave <dave@gray101.com >
2024-04-13 18:59:21 +02:00
1981154f49
fix: don't commit generated files to git ( #1993 )
...
* fix: initial work towards not committing generated files to the repository
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: improve build docs
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: remove unused folder from .dockerignore and .gitignore
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: attempt to fix extra backend tests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: attempt to fix other tests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: more test fixes
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: fix apple tests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: more extras tests fixes
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: add GOBIN to PATH in docker build
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: extra tests and Dockerfile corrections
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: remove build dependency checks
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: add golang protobuf compilers to tests-linux action
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: ensure protogen is run for extra backend installs
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: use newer protobuf
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: more missing protoc binaries
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: missing dependencies during docker build
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: don't install grpc compilers in the final stage if they aren't needed
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: python-grpc-tools in 22.04 repos is too old
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: add a couple of extra build dependencies to Makefile
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: unbreak container rebuild functionality
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
---------
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
2024-04-13 09:37:32 +02:00
12c0d9443e
feat: use tokenizer.apply_chat_template() in vLLM ( #1990 )
...
Use tokenizer.apply_chat_template() in vLLM
Signed-off-by: Ludovic LEROUX <ludovic@inpher.io >
2024-04-11 19:20:22 +02:00
b4548ad72d
feat: add flash-attn in nvidia and rocm envs ( #1995 )
...
Signed-off-by: Ludovic LEROUX <ludovic@inpher.io >
2024-04-11 09:44:39 +02:00
36da11a0ee
deps: Update version of vLLM to add support of Cohere Command_R model in vLLM inference ( #1975 )
...
* Update vLLM version to add support of Command_R
Signed-off-by: Koen Farell <hellios.dt@gmail.com >
* fix: Fixed vllm version from requirements
Signed-off-by: Koen Farell <hellios.dt@gmail.com >
* chore: Update transformers-rocm.yml
Signed-off-by: Koen Farell <hellios.dt@gmail.com >
* chore: Update transformers.yml version of vllm
Signed-off-by: Koen Farell <hellios.dt@gmail.com >
---------
Signed-off-by: Koen Farell <hellios.dt@gmail.com >
2024-04-10 11:25:26 +00:00
d23e73b118
fix(autogptq): do not use_triton with qwen-vl ( #1985 )
...
* Enhance autogptq backend to support VL models
* update dependencies for autogptq
* remove redundant auto-gptq dependency
* Convert base64 to image_url for Qwen-VL model
* implemented model inference for qwen-vl
* remove user prompt from generated answer
* fixed write image error
* fixed use_triton issue when loading Qwen-VL model
---------
Co-authored-by: Binghua Wu <bingwu@estee.com >
2024-04-10 10:36:10 +00:00
a38618db02
fix regression #1971 ( #1972 )
...
fixes regression #1971 introduced by intel_extension_for_transformers==1.4
2024-04-08 22:33:51 +02:00
8210ffcb6c
feat: Token Stream support for Transformer, fix: missing package for OpenVINO ( #1908 )
...
* Streaming working
* Small fix for regression on CUDA and XPU
* use pip version of optimum[openvino]
* Update backend/python/transformers/transformers_server.py
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
* Token streaming support
fix optimum[openvino] package in install.sh
* Token Streaming support
---------
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-03-27 17:50:35 +01:00
e7cbe32601
feat: Openvino runtime for transformer backend and streaming support for Openvino and CUDA ( #1892 )
...
* fixes #1775 and #1774
Add BitsAndBytes Quantization and fixes embedding on CUDA devices
* Manage 4bit and 8 bit quantization
Manage different BitsAndBytes options with the quantization: parameter in yaml
* fix compilation errors on non CUDA environment
* OpenVINO draft
First draft of OpenVINO integration in transformer backend
* first working implementation
* Streaming working
* Small fix for regression on CUDA and XPU
* use pip version of optimum[openvino]
* Update backend/python/transformers/transformers_server.py
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
---------
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-03-26 23:31:43 +00:00
607586e0b7
fix: downgrade torch ( #1902 )
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-03-26 22:56:02 +01:00
b7ffe66219
Enhance autogptq backend to support VL models ( #1860 )
...
* Enhance autogptq backend to support VL models
* update dependencies for autogptq
* remove redundant auto-gptq dependency
* Convert base64 to image_url for Qwen-VL model
* implemented model inference for qwen-vl
* remove user prompt from generated answer
* fixed write image error
---------
Co-authored-by: Binghua Wu <bingwu@estee.com >
2024-03-26 18:48:14 +01:00
20136ca8b7
feat(tts): add Elevenlabs and OpenAI TTS compatibility layer ( #1834 )
...
* feat(elevenlabs): map elevenlabs API support to TTS
This allows elevenlabs Clients to work automatically with LocalAI by
supporting the elevenlabs API.
The elevenlabs server endpoint is implemented so that it is wired to the
TTS endpoints.
Fixes: https://github.com/mudler/LocalAI/issues/1809
* feat(openai/tts): compat layer with openai tts
Fixes: #1276
* fix: adapt tts CLI
2024-03-14 23:08:34 +01:00
3882130911
feat: Add Bitsandbytes quantization for transformer backend enhancement #1775 and fix: Transformer backend error on CUDA #1774 ( #1823 )
...
* fixes #1775 and #1774
Add BitsAndBytes Quantization and fixes embedding on CUDA devices
* Manage 4bit and 8 bit quantization
Manage different BitsAndBytes options with the quantization: parameter in yaml
* fix compilation errors on non CUDA environment
2024-03-14 23:06:30 +01:00
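A sketch of how a `quantization:` value from the YAML config might be mapped to BitsAndBytes options. The accepted strings (`bnb_4bit`, `bnb_8bit`) and the helper name are hypothetical; `load_in_4bit` and `load_in_8bit` are real `BitsAndBytesConfig` arguments in the transformers library.

```python
def bnb_kwargs_from_quantization(quantization):
    """Translate a quantization setting into BitsAndBytesConfig keyword arguments.

    The setting names below are illustrative assumptions, not the backend's
    confirmed values. Returns None when no BitsAndBytes quantization applies.
    """
    mapping = {
        "bnb_4bit": {"load_in_4bit": True},
        "bnb_8bit": {"load_in_8bit": True},
    }
    if not quantization:
        return None
    kwargs = mapping.get(quantization.strip().lower())
    # Return a copy so callers cannot mutate the shared mapping.
    return dict(kwargs) if kwargs else None


if __name__ == "__main__":
    print(bnb_kwargs_from_quantization("bnb_4bit"))
    print(bnb_kwargs_from_quantization(""))
```

The returned dictionary could then be unpacked into `BitsAndBytesConfig(**kwargs)` on a CUDA machine, with `None` meaning the model is loaded unquantized.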
5d1018495f
feat(intel): add diffusers/transformers support ( #1746 )
...
* feat(intel): add diffusers support
* try to consume upstream container image
* Debug
* Manually install deps
* Map transformers/hf cache dir to modelpath if not specified
* fix(compel): update initialization, pass by all gRPC options
* fix: add dependencies, implement transformers for xpu
* base it from the oneapi image
* Add pillow
* set threads if specified when launching the API
* Skip conda install if intel
* defaults to non-intel
* ci: add to pipelines
* prepare compel only if enabled
* Skip conda install if intel
* fix cleanup
* Disable compel by default
* Install torch 2.1.0 with Intel
* Skip conda on some setups
* Detect python
* Quiet output
* Do not override system python with conda
* Prefer python3
* Fixups
* exllama2: do not install without conda (overrides pytorch version)
* exllama/exllama2: do not install if not using cuda
* Add missing dataset dependency
* Small fixups, symlink to python, add requirements
* Add neural_speed to the deps
* correctly handle model offloading
* fix: device_map == xpu
* go back to calling python, fixed at dockerfile level
* Exllama2 restricted to only nvidia gpus
* Tokenizer to xpu
2024-03-07 14:37:45 +01:00