2d64269763
feat: Add backend gallery ( #5607 )
...
build python backend container images / backend-jobs (rerankers, rocm/dev-ubuntu-22.04:6.1, hipblas, , , latest-gpu-rocm-hipblas-rerankers, linux/amd64, arc-runner-set, true, -gpu-rocm-hipblas-rerankers) (push) Has been cancelled
build python backend container images / backend-jobs (rerankers, ubuntu:22.04, cublas, 11, 7, latest-gpu-nvidia-cuda-11-rerankers, linux/amd64, arc-runner-set, true, -gpu-nvidia-cuda-11-rerankers) (push) Has been cancelled
build python backend container images / backend-jobs (rerankers, ubuntu:22.04, cublas, 12, 0, latest-gpu-nvidia-cuda-12-rerankers, linux/amd64, arc-runner-set, true, -gpu-nvidia-cuda-12-rerankers) (push) Has been cancelled
build python backend container images / backend-jobs (transformers, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, , , latest-gpu-intel-sycl-f16-transformers, linux/amd64, arc-runner-set, true, -gpu-intel-sycl-f16-transformers) (push) Has been cancelled
build python backend container images / backend-jobs (transformers, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, , , latest-gpu-intel-sycl-f32-transformers, linux/amd64, arc-runner-set, true, -gpu-intel-sycl-f32-transformers) (push) Has been cancelled
build python backend container images / backend-jobs (transformers, rocm/dev-ubuntu-22.04:6.1, hipblas, , , latest-gpu-rocm-hipblas-transformers, linux/amd64, arc-runner-set, true, -gpu-rocm-hipblas-transformers) (push) Has been cancelled
build python backend container images / backend-jobs (transformers, ubuntu:22.04, cublas, 11, 7, latest-gpu-nvidia-cuda-11-transformers, linux/amd64, arc-runner-set, true, -gpu-nvidia-cuda-11-transformers) (push) Has been cancelled
build python backend container images / backend-jobs (transformers, ubuntu:22.04, cublas, 12, 0, latest-gpu-nvidia-cuda-12-transformers, linux/amd64, arc-runner-set, true, -gpu-nvidia-cuda-12-transformers) (push) Has been cancelled
build python backend container images / backend-jobs (vllm, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, , , latest-gpu-intel-sycl-f16-vllm, linux/amd64, arc-runner-set, true, -gpu-intel-sycl-f16-vllm) (push) Has been cancelled
build python backend container images / backend-jobs (vllm, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, , , latest-gpu-intel-sycl-f32-vllm, linux/amd64, arc-runner-set, true, -gpu-intel-sycl-f32-vllm) (push) Has been cancelled
build python backend container images / backend-jobs (vllm, rocm/dev-ubuntu-22.04:6.1, hipblas, , , latest-gpu-rocm-hipblas-vllm, linux/amd64, arc-runner-set, true, -gpu-rocm-hipblas-vllm) (push) Has been cancelled
build python backend container images / backend-jobs (vllm, ubuntu:22.04, cublas, 11, 7, latest-gpu-nvidia-cuda-11-vllm, linux/amd64, arc-runner-set, true, -gpu-nvidia-cuda-11-vllm) (push) Has been cancelled
build python backend container images / backend-jobs (vllm, ubuntu:22.04, cublas, 12, 0, latest-gpu-nvidia-cuda-12-vllm, linux/amd64, arc-runner-set, true, -gpu-nvidia-cuda-12-vllm) (push) Has been cancelled
Security Scan / tests (push) Has been cancelled
Tests extras backends / tests-transformers (push) Has been cancelled
Tests extras backends / tests-rerankers (push) Has been cancelled
Tests extras backends / tests-diffusers (push) Has been cancelled
Tests extras backends / tests-coqui (push) Has been cancelled
tests / tests-linux (1.21.x) (push) Has been cancelled
tests / tests-aio-container (push) Has been cancelled
tests / tests-apple (1.21.x) (push) Has been cancelled
Update swagger / swagger (push) Has been cancelled
Check if checksums are up-to-date / checksum_check (push) Has been cancelled
Bump dependencies / bump (mudler/LocalAI) (push) Has been cancelled
Bump dependencies / bump (main, PABannier/bark.cpp, BARKCPP_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, ggml-org/llama.cpp, CPPLLAMA_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, ggml-org/whisper.cpp, WHISPER_CPP_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, leejet/stable-diffusion.cpp, STABLEDIFFUSION_GGML_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, mudler/go-piper, PIPER_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, mudler/go-stable-diffusion, STABLEDIFFUSION_VERSION) (push) Has been cancelled
* feat: Add backend gallery
This PR add support to manage backends as similar to models. There is
now available a backend gallery which can be used to install and remove
extra backends.
The backend gallery can be configured similarly as a model gallery, and
API calls allows to install and remove new backends in runtime, and as
well during the startup phase of LocalAI.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Add backends docs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* wip: Backend Dockerfile for python backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat: drop extras images, build python backends separately
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fixup on all backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* test CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Tweaks
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Drop old backends leftovers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Fixup CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Move dockerfile upper
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Fix proto
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Feature dropped for consistency - we prefer model galleries
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Add missing packages in the build image
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* exllama is ponly available on cublas
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* pin torch on chatterbox
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Fixups to index
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Debug CI
* Install accellerators deps
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Add target arch
* Add cuda minor version
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Use self-hosted runners
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* ci: use quay for test images
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fixups for vllm and chatterbox
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Small fixups on CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* chatterbox is only available for nvidia
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Simplify CI builds
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Adapt test, use qwen3
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* chore(model gallery): add jina-reranker-v1-tiny-en-gguf
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Use reranker from llama.cpp in AIO images
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Limit concurrent jobs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2025-06-15 14:56:52 +02:00
45c58752e5
feat(ui): add audio upload button in chat view ( #5526 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2025-05-30 16:47:31 +02:00
bf6426aef2
feat: Realtime API support reboot ( #5392 )
...
Explorer deployment / build-linux (push) Has been cancelled
GPU tests / ubuntu-latest (1.21.x) (push) Has been cancelled
generate and publish intel docker caches / generate_caches (intel/oneapi-basekit:2025.1.0-0-devel-ubuntu22.04, linux/amd64, ubuntu-latest) (push) Has been cancelled
build container images / hipblas-jobs (-aio-gpu-hipblas, rocm/dev-ubuntu-22.04:6.1, hipblas, true, ubuntu:22.04, extras, latest-gpu-hipblas-extras, latest-aio-gpu-hipblas, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -hipblas-extras) (push) Has been cancelled
build container images / hipblas-jobs (rocm/dev-ubuntu-22.04:6.1, hipblas, true, ubuntu:22.04, core, latest-gpu-hipblas, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -hipblas) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-intel-f16, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, extras, latest-gpu-intel-f16-extras, latest-aio-gpu-intel-f16, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-… (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-intel-f32, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, extras, latest-gpu-intel-f32-extras, latest-aio-gpu-intel-f32, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-… (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-11, ubuntu:22.04, cublas, 11, 7, true, extras, latest-gpu-nvidia-cuda-11-extras, latest-aio-gpu-nvidia-cuda-11, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda11-extras) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-12, ubuntu:22.04, cublas, 12, 0, true, extras, latest-gpu-nvidia-cuda-12-extras, latest-aio-gpu-nvidia-cuda-12, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda12-extras) (push) Has been cancelled
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, core, latest-gpu-intel-f16, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16) (push) Has been cancelled
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, core, latest-gpu-intel-f32, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32) (push) Has been cancelled
build container images / core-image-build (-aio-cpu, ubuntu:22.04, , true, core, latest-cpu, latest-aio-cpu, --jobs=4 --output-sync=target, linux/amd64,linux/arm64, arc-runner-set, false, auto, ) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, true, core, latest-gpu-nvidia-cuda-12, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, true, core, latest-gpu-nvidia-cuda-12, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, vulkan, true, core, latest-gpu-vulkan, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -vulkan) (push) Has been cancelled
build container images / gh-runner (nvcr.io/nvidia/l4t-jetpack:r36.4.0, cublas, 12, 0, true, core, latest-nvidia-l4t-arm64, --jobs=4 --output-sync=target, linux/arm64, ubuntu-24.04-arm, true, false, -nvidia-l4t-arm64) (push) Has been cancelled
Security Scan / tests (push) Has been cancelled
Tests extras backends / tests-transformers (push) Has been cancelled
Tests extras backends / tests-rerankers (push) Has been cancelled
Tests extras backends / tests-diffusers (push) Has been cancelled
Tests extras backends / tests-coqui (push) Has been cancelled
tests / tests-linux (1.21.x) (push) Has been cancelled
tests / tests-aio-container (push) Has been cancelled
tests / tests-apple (1.21.x) (push) Has been cancelled
* feat(realtime): Initial Realtime API implementation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* chore: go mod tidy
Signed-off-by: Richard Palethorpe <io@richiejp.com >
* feat: Implement transcription only mode for realtime API
Reduce the scope of the real time API for the initial realease and make
transcription only mode functional.
Signed-off-by: Richard Palethorpe <io@richiejp.com >
* chore(build): Build backends on a separate layer to speed up core only changes
Signed-off-by: Richard Palethorpe <io@richiejp.com >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
Signed-off-by: Richard Palethorpe <io@richiejp.com >
Co-authored-by: Ettore Di Giacinto <mudler@localai.io >
2025-05-25 22:25:05 +02:00
21bdfe5fa4
fix: use rice when embedding large binaries ( #5309 )
...
build container images / self-hosted-jobs (-aio-gpu-intel-f16, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, extras, latest-gpu-intel-f16, latest-aio-gpu-intel-f16, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f16) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-intel-f32, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, extras, latest-gpu-intel-f32, latest-aio-gpu-intel-f32, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f32) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-11, ubuntu:22.04, cublas, 11, 7, true, extras, latest-gpu-nvidia-cuda-11, latest-aio-gpu-nvidia-cuda-11, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda11) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-12, ubuntu:22.04, cublas, 12, 0, true, extras, latest-gpu-nvidia-cuda-12, latest-aio-gpu-nvidia-cuda-12, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda12) (push) Has been cancelled
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, core, latest-gpu-intel-f16-core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-core) (push) Has been cancelled
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, core, latest-gpu-intel-f32-core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-core) (push) Has been cancelled
build container images / self-hosted-jobs (ubuntu:22.04, , true, extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, ) (push) Has been cancelled
build container images / core-image-build (-aio-cpu, ubuntu:22.04, , true, core, latest-cpu, latest-aio-cpu, --jobs=4 --output-sync=target, linux/amd64,linux/arm64, arc-runner-set, false, auto, -core) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, true, core, latest-gpu-nvidia-cuda-12-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-core) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, true, core, latest-gpu-nvidia-cuda-12-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-core) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, vulkan, true, core, latest-gpu-vulkan-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -vulkan-core) (push) Has been cancelled
build container images / gh-runner (nvcr.io/nvidia/l4t-jetpack:r36.4.0, cublas, 12, 0, true, core, latest-nvidia-l4t-arm64-core, --jobs=4 --output-sync=target, linux/arm64, ubuntu-24.04-arm, true, false, -nvidia-l4t-arm64-core) (push) Has been cancelled
Security Scan / tests (push) Has been cancelled
Tests extras backends / tests-transformers (push) Has been cancelled
Tests extras backends / tests-rerankers (push) Has been cancelled
Tests extras backends / tests-diffusers (push) Has been cancelled
Tests extras backends / tests-coqui (push) Has been cancelled
tests / tests-linux (1.21.x) (push) Has been cancelled
tests / tests-aio-container (push) Has been cancelled
tests / tests-apple (1.21.x) (push) Has been cancelled
Update swagger / swagger (push) Has been cancelled
Check if checksums are up-to-date / checksum_check (push) Has been cancelled
Bump dependencies / bump (mudler/LocalAI) (push) Has been cancelled
Bump dependencies / bump (main, PABannier/bark.cpp, BARKCPP_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, ggerganov/whisper.cpp, WHISPER_CPP_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, ggml-org/llama.cpp, CPPLLAMA_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, leejet/stable-diffusion.cpp, STABLEDIFFUSION_GGML_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, mudler/go-piper, PIPER_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, mudler/go-stable-diffusion, STABLEDIFFUSION_VERSION) (push) Has been cancelled
generate and publish GRPC docker caches / generate_caches (ubuntu:22.04, linux/amd64,linux/arm64, arc-runner-set) (push) Has been cancelled
* fix(embed): use go-rice for large backend assets
Golang embed FS has a hard limit that we might exceed when providing
many binary alternatives.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* simplify golang deps
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* chore(tests): switch to testcontainers and print logs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix(tests): do not build a test binary
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* small fixup
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2025-05-04 16:42:42 +02:00
5c6cd50ed6
feat(llama.cpp): estimate vram usage ( #5299 )
...
build container images / self-hosted-jobs (-aio-gpu-intel-f16, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, extras, latest-gpu-intel-f16, latest-aio-gpu-intel-f16, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f16) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-intel-f32, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, extras, latest-gpu-intel-f32, latest-aio-gpu-intel-f32, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f32) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-11, ubuntu:22.04, cublas, 11, 7, true, extras, latest-gpu-nvidia-cuda-11, latest-aio-gpu-nvidia-cuda-11, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda11) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-12, ubuntu:22.04, cublas, 12, 0, true, extras, latest-gpu-nvidia-cuda-12, latest-aio-gpu-nvidia-cuda-12, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda12) (push) Has been cancelled
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, core, latest-gpu-intel-f16-core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-core) (push) Has been cancelled
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, core, latest-gpu-intel-f32-core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-core) (push) Has been cancelled
build container images / self-hosted-jobs (ubuntu:22.04, , true, extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, ) (push) Has been cancelled
build container images / core-image-build (-aio-cpu, ubuntu:22.04, , true, core, latest-cpu, latest-aio-cpu, --jobs=4 --output-sync=target, linux/amd64,linux/arm64, arc-runner-set, false, auto, -core) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, true, core, latest-gpu-nvidia-cuda-12-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-core) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, true, core, latest-gpu-nvidia-cuda-12-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-core) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, vulkan, true, core, latest-gpu-vulkan-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -vulkan-core) (push) Has been cancelled
build container images / gh-runner (nvcr.io/nvidia/l4t-jetpack:r36.4.0, cublas, 12, 0, true, core, latest-nvidia-l4t-arm64-core, --jobs=4 --output-sync=target, linux/arm64, ubuntu-24.04-arm, true, false, -nvidia-l4t-arm64-core) (push) Has been cancelled
Security Scan / tests (push) Has been cancelled
Tests extras backends / tests-transformers (push) Has been cancelled
Tests extras backends / tests-rerankers (push) Has been cancelled
Tests extras backends / tests-diffusers (push) Has been cancelled
Tests extras backends / tests-coqui (push) Has been cancelled
tests / tests-linux (1.21.x) (push) Has been cancelled
tests / tests-aio-container (push) Has been cancelled
tests / tests-apple (1.21.x) (push) Has been cancelled
Update swagger / swagger (push) Has been cancelled
Check if checksums are up-to-date / checksum_check (push) Has been cancelled
Bump dependencies / bump (mudler/LocalAI) (push) Has been cancelled
Bump dependencies / bump (main, PABannier/bark.cpp, BARKCPP_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, ggerganov/whisper.cpp, WHISPER_CPP_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, ggml-org/llama.cpp, CPPLLAMA_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, leejet/stable-diffusion.cpp, STABLEDIFFUSION_GGML_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, mudler/go-piper, PIPER_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, mudler/go-stable-diffusion, STABLEDIFFUSION_VERSION) (push) Has been cancelled
generate and publish GRPC docker caches / generate_caches (ubuntu:22.04, linux/amd64,linux/arm64, arc-runner-set) (push) Has been cancelled
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2025-05-02 17:40:26 +02:00
6424f0666d
chore(deps): Bump edgevpn to v0.30.1 ( #4840 )
...
build container images / hipblas-jobs (rocm/dev-ubuntu-22.04:6.1, hipblas, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -hipblas-core) (push) Waiting to run
build container images / hipblas-jobs (rocm/dev-ubuntu-22.04:6.1, hipblas, false, ubuntu:22.04, extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -hipblas) (push) Waiting to run
build container images / hipblas-jobs (rocm/dev-ubuntu-22.04:6.1, hipblas, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -hipblas-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-intel-f16, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, extras, latest-gpu-intel-f16, latest-aio-gpu-intel-f16, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f16-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-intel-f32, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, extras, latest-gpu-intel-f32, latest-aio-gpu-intel-f32, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f32-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-11, ubuntu:22.04, cublas, 11, 7, true, extras, latest-gpu-nvidia-cuda-11, latest-aio-gpu-nvidia-cuda-11, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda11-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-12, ubuntu:22.04, cublas, 12, 0, true, extras, latest-gpu-nvidia-cuda-12, latest-aio-gpu-nvidia-cuda-12, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda12-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, , , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, ) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, , true, extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, cublas, 11, 7, , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda11) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, cublas, 12, 0, , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda12) (push) Waiting to run
build container images / core-image-build (-aio-cpu, ubuntu:22.04, , true, core, latest-cpu, latest-aio-cpu, --jobs=4 --output-sync=target, linux/amd64,linux/arm64, arc-runner-set, false, auto, -ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, , core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, true, core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, , core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, true, core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, vulkan, true, core, latest-vulkan-ffmpeg-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -vulkan-ffmpeg-core) (push) Waiting to run
build container images / gh-runner (nvcr.io/nvidia/l4t-jetpack:r36.4.0, cublas, 12, 0, true, core, latest-nvidia-l4t-arm64-core, --jobs=4 --output-sync=target, linux/arm64, ubuntu-24.04-arm, true, false, -nvidia-l4t-arm64-core) (push) Waiting to run
Security Scan / tests (push) Waiting to run
Tests extras backends / tests-transformers (push) Waiting to run
Tests extras backends / tests-rerankers (push) Waiting to run
Tests extras backends / tests-diffusers (push) Waiting to run
Tests extras backends / tests-coqui (push) Waiting to run
tests / tests-linux (1.21.x) (push) Waiting to run
tests / tests-aio-container (push) Waiting to run
tests / tests-apple (1.21.x) (push) Waiting to run
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2025-02-17 16:51:22 +01:00
3cddf24747
feat: Centralized Request Processing middleware ( #3847 )
...
* squash past, centralize request middleware PR
Signed-off-by: Dave Lee <dave@gray101.com >
* migrate bruno request files to examples repo
Signed-off-by: Dave Lee <dave@gray101.com >
* fix
Signed-off-by: Dave Lee <dave@gray101.com >
* Update tests/e2e-aio/e2e_test.go
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
---------
Signed-off-by: Dave Lee <dave@gray101.com >
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2025-02-10 12:06:16 +01:00
032a33de49
chore: remove deprecated tinydream backend ( #4631 )
...
Signed-off-by: Gianluca Boiano <morf3089@gmail.com >
2025-01-18 18:35:30 +01:00
4426efab05
chore(deps): bump edgevpn to v0.29.0 ( #4564 )
...
build container images / self-hosted-jobs (-aio-gpu-intel-f32, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, extras, latest-gpu-intel-f32, latest-aio-gpu-intel-f32, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f32-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-11, ubuntu:22.04, cublas, 11, 7, true, extras, latest-gpu-nvidia-cuda-11, latest-aio-gpu-nvidia-cuda-11, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda11-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-12, ubuntu:22.04, cublas, 12, 0, true, extras, latest-gpu-nvidia-cuda-12, latest-aio-gpu-nvidia-cuda-12, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda12-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, , , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, ) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, , true, extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, cublas, 11, 7, , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda11) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, cublas, 12, 0, , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda12) (push) Waiting to run
build container images / core-image-build (-aio-cpu, ubuntu:22.04, , true, core, latest-cpu, latest-aio-cpu, --jobs=4 --output-sync=target, linux/amd64,linux/arm64, arc-runner-set, false, auto, -ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, , core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, true, core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, , core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, true, core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, vulkan, true, core, latest-vulkan-ffmpeg-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -vulkan-ffmpeg-core) (push) Waiting to run
Security Scan / tests (push) Waiting to run
Tests extras backends / tests-transformers (push) Waiting to run
Tests extras backends / tests-sentencetransformers (push) Waiting to run
Tests extras backends / tests-rerankers (push) Waiting to run
Tests extras backends / tests-diffusers (push) Waiting to run
Tests extras backends / tests-parler-tts (push) Waiting to run
Tests extras backends / tests-openvoice (push) Waiting to run
Tests extras backends / tests-transformers-musicgen (push) Waiting to run
Tests extras backends / tests-vallex (push) Waiting to run
Tests extras backends / tests-coqui (push) Waiting to run
tests / tests-linux (1.21.x) (push) Waiting to run
tests / tests-aio-container (push) Waiting to run
tests / tests-apple (1.21.x) (push) Waiting to run
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2025-01-09 09:33:22 +01:00
cea5a0ea42
feat(template): read jinja templates from gguf files ( #4332 )
...
* Read jinja templates as fallback
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Move templating out of model loader
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Test TemplateMessages
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Set role and content from transformers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Tests: be more flexible
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* More jinja
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Small refactoring and adaptations
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-12-08 13:50:33 +01:00
2b62260b6d
feat(models): use rwkv from llama.cpp ( #4264 )
...
feat(rwkv): use rwkv from llama.cpp
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-11-26 14:22:55 +01:00
76d813ed1c
chore(go.mod): tidy ( #4209 )
...
chore(go.mod): tidy
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-11-20 18:13:42 +01:00
b1ea9318e6
feat(silero): add Silero-vad backend ( #4204 )
...
* feat(vad): add silero-vad backend (WIP)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(vad): add API endpoint
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix(vad): correctly place the onnxruntime libs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* chore(vad): hook silero-vad to binary and container builds
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(gRPC): register VAD Server
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix(Makefile): consume ONNX_OS consistently
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix(Makefile): handle macOS
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-11-20 14:48:40 +01:00
9892d7d584
feat(p2p): add support for configuration of edgevpn listen_maddrs, dht_announce_maddrs and bootstrap_peers ( #4200 )
...
* add support for edgevpn listen_maddrs, dht_announce_maddrs, dht_bootstrap_peers
* upd docs for libp2p loglevel
2024-11-20 14:18:52 +01:00
6fd0341eca
chore: update go-piper to latest ( #3939 )
...
Signed-off-by: Dave Lee <dave@gray101.com >
2024-10-23 11:16:38 +02:00
90cacb9692
test: preliminary tests and merge fix for authv2 ( #3584 )
...
* add api key to existing app tests, add preliminary auth test
Signed-off-by: Dave Lee <dave@gray101.com >
* small fix, run test
Signed-off-by: Dave Lee <dave@gray101.com >
* status on non-opaque
Signed-off-by: Dave Lee <dave@gray101.com >
* tweak auth error
Signed-off-by: Dave Lee <dave@gray101.com >
* exp
Signed-off-by: Dave Lee <dave@gray101.com >
* quick fix on real laptop
Signed-off-by: Dave Lee <dave@gray101.com >
* add downloader version that allows providing an auth header
Signed-off-by: Dave Lee <dave@gray101.com >
* stash some devcontainer fixes during testing
Signed-off-by: Dave Lee <dave@gray101.com >
* s2
Signed-off-by: Dave Lee <dave@gray101.com >
* s
Signed-off-by: Dave Lee <dave@gray101.com >
* done with experiment
Signed-off-by: Dave Lee <dave@gray101.com >
* done with experiment
Signed-off-by: Dave Lee <dave@gray101.com >
* after merge fix
Signed-off-by: Dave Lee <dave@gray101.com >
* rename and fix
Signed-off-by: Dave Lee <dave@gray101.com >
---------
Signed-off-by: Dave Lee <dave@gray101.com >
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-09-24 09:32:48 +02:00
db1159b651
feat: auth v2 - supersedes #2894 ( #3476 )
...
feat: auth v2 - supercedes #2894 , metrics to follow later
Signed-off-by: Dave Lee <dave@gray101.com >
2024-09-16 23:29:07 -04:00
6a6094a58d
chore(deps): update edgevpn to v0.28.3
2024-08-27 17:29:32 +02:00
8369614b6e
chore(deps): update edgevpn to v0.28.2
2024-08-27 13:03:16 +02:00
cac472d4a1
chore(deps): update edgevpn to v0.28 ( #3412 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-27 10:48:55 +02:00
18dddc1ae0
chore(deps): update edgevpn ( #3385 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-26 20:19:27 +02:00
ac5f6f210b
feat: external backend launching log improvements and relative path support ( #3348 )
...
* specify workdir when launching external backend for safety / relative paths, bump version, logs
Signed-off-by: Dave Lee <dave@gray101.com >
* sneak in a devcontainer fix
Signed-off-by: Dave Lee <dave@gray101.com >
---------
Signed-off-by: Dave Lee <dave@gray101.com >
2024-08-24 00:27:14 +02:00
70e53bc191
chore(deps): update edgevpn
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-21 10:18:43 +02:00
16f7140461
chore(deps): update edgevpn ( #3346 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-20 22:54:16 +02:00
a28b3771a7
chore(deps): update edgevpn ( #3340 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-20 19:17:35 +02:00
9729d2ae37
feat(explorer): make possible to run sync in a separate process ( #3224 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-12 19:25:44 +02:00
c4534cd908
chore(deps): update edgevpn ( #3214 )
...
* chore(deps): update edgevpn
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix: initialize failure map
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-11 10:46:17 +02:00
1f7cedf5ee
build: fix go.mod - don't import ourself ( #2896 )
...
* minor cleanup to go.mod - importing ourself?
Signed-off-by: Dave Lee <dave@gray101.com >
* figured out why we were importing ourself and fixed it
Signed-off-by: Dave Lee <dave@gray101.com >
* set pull_request_target
Signed-off-by: Dave Lee <dave@gray101.com >
---------
Signed-off-by: Dave Lee <dave@gray101.com >
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-07-16 22:49:43 +02:00
fc60031ac1
chore: update edgevpn dependency ( #2855 )
...
deps: update edgevpn dependency
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-07-13 23:26:17 +00:00
fc87507012
chore(deps): Update Dependencies ( #2538 )
...
* chore(deps): Update dependencies
Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com >
* chore(deps): Upgrade github.com/imdario/mergo to dario.cat/mergo
Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com >
* remove version identifiers for MeloTTS
Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com >
---------
Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com >
Signed-off-by: Dave <dave@gray101.com >
Co-authored-by: Dave <dave@gray101.com >
2024-07-12 19:54:08 +00:00
5866fc8ded
chore: fix go.mod module ( #2635 )
...
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
2024-06-23 08:24:36 +00:00
f569237a50
feat(oci): support OCI images and Ollama models ( #2628 )
...
* Support specifying oci:// and ollama:// for model URLs
Fixes: https://github.com/mudler/LocalAI/issues/2527
Fixes: https://github.com/mudler/LocalAI/issues/1028
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Lower watcher warnings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Allow to install ollama models from CLI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fixup tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Do not keep file ownership
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Skip test on darwin
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-06-22 08:17:41 +02:00
aae7ad9d73
feat(llama.cpp): guess model defaults from file ( #2522 )
...
* wip: guess informations from gguf file
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* update go mod
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Small fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Identify llama3
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Do not try to guess the name, as reading gguf files can be expensive
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Allow to disable guessing
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-06-08 22:13:02 +02:00
4e1463fec2
feat: fiber CSRF ( #2482 )
...
new config option - enables or disables the fiber csrf middleware
Signed-off-by: Dave Lee <dave@gray101.com >
2024-06-04 19:43:46 +00:00
34ab442ce9
toil: bump grpc version ( #2480 )
...
bump the grpc package version
---------
Signed-off-by: Dave Lee <dave@gray101.com >
2024-06-04 08:39:19 +02:00
fdb45153fe
feat(llama.cpp): Totally decentralized, private, distributed, p2p inference ( #2343 )
...
* feat(llama.cpp): Enable decentralized, distributed inference
As https://github.com/mudler/LocalAI/pull/2324 introduced distributed inferencing thanks to
@rgerganov implementation in https://github.com/ggerganov/llama.cpp/pull/6829 in upstream llama.cpp, now
it is possible to distribute the workload to remote llama.cpp gRPC server.
This changeset now uses mudler/edgevpn to establish a secure, distributed network between the nodes using a shared token.
The token is generated automatically when starting the server with the `--p2p` flag, and can be used by starting the workers
with `local-ai worker p2p-llama-cpp-rpc` by passing the token via environment variable (TOKEN) or with args (--token).
As per how mudler/edgevpn works, a network is established between the server and the workers with dht and mdns discovery protocols,
the llama.cpp rpc server is automatically started and exposed to the underlying p2p network so the API server can connect on.
When the HTTP server is started, it will discover the workers in the network and automatically create the port-forwards to the service locally.
Then llama.cpp is configured to use the services.
This feature is behind the "p2p" GO_FLAGS
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* go mod tidy
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* ci: add p2p tag
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* better message
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-20 19:17:59 +02:00
e2c3ffb09b
feat: auto select llama-cpp cpu variant ( #2305 )
...
* auto select cpu variant
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
* remove cuda target for now
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
* fix metal
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
* fix path
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
---------
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
2024-05-13 11:37:52 +02:00
b69ff46c7e
feat(startup): show CPU/GPU information with --debug ( #2241 )
...
Signed-off-by: mudler <mudler@localai.io >
2024-05-05 09:10:23 +02:00
006306b183
fix: use bluemonday as recommended by blackfriday ( #2142 )
...
use bluemonday as recommended by blackfriday
Signed-off-by: Dave Lee <dave@gray101.com >
2024-04-26 10:34:50 +02:00
0d8bf91699
feat: Galleries UI ( #2104 )
...
* WIP: add models to webui
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Register routes
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix: don't cache models
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* small fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix: fixup multiple installs (strings.Clone)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-04-23 09:22:58 +02:00
8d30b39811
feat: fiber logs with zerlog and add trace level ( #2082 )
...
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
2024-04-20 10:43:37 +02:00
da82ce81b5
build(deps): bump github.com/opencontainers/runc from 1.1.5 to 1.1.12 ( #2000 )
...
Bumps [github.com/opencontainers/runc](https://github.com/opencontainers/runc ) from 1.1.5 to 1.1.12.
- [Release notes](https://github.com/opencontainers/runc/releases )
- [Changelog](https://github.com/opencontainers/runc/blob/main/CHANGELOG.md )
- [Commits](https://github.com/opencontainers/runc/compare/v1.1.5...v1.1.12 )
---
updated-dependencies:
- dependency-name: github.com/opencontainers/runc
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-11 18:57:33 +00:00
cbda06fb96
build(deps): bump github.com/gofiber/fiber/v2 from 2.52.0 to 2.52.4 ( #2008 )
...
Bumps [github.com/gofiber/fiber/v2](https://github.com/gofiber/fiber ) from 2.52.0 to 2.52.4.
- [Release notes](https://github.com/gofiber/fiber/releases )
- [Commits](https://github.com/gofiber/fiber/compare/v2.52.0...v2.52.4 )
---
updated-dependencies:
- dependency-name: github.com/gofiber/fiber/v2
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-11 16:52:54 +00:00
fce606fc0f
build(deps): bump github.com/charmbracelet/glamour from 0.6.0 to 0.7.0 ( #2004 )
...
Bumps [github.com/charmbracelet/glamour](https://github.com/charmbracelet/glamour ) from 0.6.0 to 0.7.0.
- [Release notes](https://github.com/charmbracelet/glamour/releases )
- [Commits](https://github.com/charmbracelet/glamour/compare/v0.6.0...v0.7.0 )
---
updated-dependencies:
- dependency-name: github.com/charmbracelet/glamour
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-11 15:41:58 +00:00
fdfd868953
build(deps): bump github.com/gofiber/fiber/v2 from 2.52.0 to 2.52.1 ( #2001 )
...
Bumps [github.com/gofiber/fiber/v2](https://github.com/gofiber/fiber ) from 2.52.0 to 2.52.1.
- [Release notes](https://github.com/gofiber/fiber/releases )
- [Commits](https://github.com/gofiber/fiber/compare/v2.52.0...v2.52.1 )
---
updated-dependencies:
- dependency-name: github.com/gofiber/fiber/v2
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-11 12:21:52 +00:00
0795975486
build(deps): bump github.com/docker/docker from 20.10.7+incompatible to 24.0.9+incompatible ( #1999 )
...
build(deps): bump github.com/docker/docker
Bumps [github.com/docker/docker](https://github.com/docker/docker ) from 20.10.7+incompatible to 24.0.9+incompatible.
- [Release notes](https://github.com/docker/docker/releases )
- [Commits](https://github.com/docker/docker/compare/v20.10.7...v24.0.9 )
---
updated-dependencies:
- dependency-name: github.com/docker/docker
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-11 11:44:34 +00:00
a49248d29f
build(deps): bump google.golang.org/protobuf from 1.31.0 to 1.33.0 ( #1998 )
...
Bumps google.golang.org/protobuf from 1.31.0 to 1.33.0.
---
updated-dependencies:
- dependency-name: google.golang.org/protobuf
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-11 11:07:45 +00:00
24d7dadfed
feat: kong cli refactor fixes #1955 ( #1974 )
...
* feat: migrate to alecthomas/kong for CLI
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: bring in new flag for granular log levels
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* chore: go mod tidy
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: allow loading cli flag values from ["./localai.yaml", "~/.config/localai.yaml", "/etc/localai.yaml"] in that order
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: load from .env file instead of a yaml file
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: better loading for environment files
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat(doc): add initial documentation about configuration
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: remove test log lines
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: integrate new documentation into existing pages
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: add documentation on .env files
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: cleanup some documentation table errors
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: refactor CLI logic out to it's own package under core/cli
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
---------
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
2024-04-11 09:19:24 +02:00
123a5a2e16
feat(swagger): Add swagger API doc ( #1926 )
...
* makefile(build): add minimal and api build target
* feat(swagger): Add swagger
2024-03-29 22:29:33 +01:00
bf65ed6eb8
feat(webui): add partials, show backends associated to models ( #1922 )
...
* feat(webui): add partials, show backends associated to models
* fix(auth): put assistant and backend under auth
2024-03-28 21:52:52 +01:00