21bdfe5fa4
fix: use rice when embedding large binaries ( #5309 )
...
build container images / self-hosted-jobs (-aio-gpu-intel-f16, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, extras, latest-gpu-intel-f16, latest-aio-gpu-intel-f16, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f16) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-intel-f32, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, extras, latest-gpu-intel-f32, latest-aio-gpu-intel-f32, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f32) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-11, ubuntu:22.04, cublas, 11, 7, true, extras, latest-gpu-nvidia-cuda-11, latest-aio-gpu-nvidia-cuda-11, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda11) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-12, ubuntu:22.04, cublas, 12, 0, true, extras, latest-gpu-nvidia-cuda-12, latest-aio-gpu-nvidia-cuda-12, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda12) (push) Has been cancelled
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, core, latest-gpu-intel-f16-core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-core) (push) Has been cancelled
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, core, latest-gpu-intel-f32-core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-core) (push) Has been cancelled
build container images / self-hosted-jobs (ubuntu:22.04, , true, extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, ) (push) Has been cancelled
build container images / core-image-build (-aio-cpu, ubuntu:22.04, , true, core, latest-cpu, latest-aio-cpu, --jobs=4 --output-sync=target, linux/amd64,linux/arm64, arc-runner-set, false, auto, -core) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, true, core, latest-gpu-nvidia-cuda-12-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-core) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, true, core, latest-gpu-nvidia-cuda-12-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-core) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, vulkan, true, core, latest-gpu-vulkan-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -vulkan-core) (push) Has been cancelled
build container images / gh-runner (nvcr.io/nvidia/l4t-jetpack:r36.4.0, cublas, 12, 0, true, core, latest-nvidia-l4t-arm64-core, --jobs=4 --output-sync=target, linux/arm64, ubuntu-24.04-arm, true, false, -nvidia-l4t-arm64-core) (push) Has been cancelled
Security Scan / tests (push) Has been cancelled
Tests extras backends / tests-transformers (push) Has been cancelled
Tests extras backends / tests-rerankers (push) Has been cancelled
Tests extras backends / tests-diffusers (push) Has been cancelled
Tests extras backends / tests-coqui (push) Has been cancelled
tests / tests-linux (1.21.x) (push) Has been cancelled
tests / tests-aio-container (push) Has been cancelled
tests / tests-apple (1.21.x) (push) Has been cancelled
Update swagger / swagger (push) Has been cancelled
Check if checksums are up-to-date / checksum_check (push) Has been cancelled
Bump dependencies / bump (mudler/LocalAI) (push) Has been cancelled
Bump dependencies / bump (main, PABannier/bark.cpp, BARKCPP_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, ggerganov/whisper.cpp, WHISPER_CPP_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, ggml-org/llama.cpp, CPPLLAMA_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, leejet/stable-diffusion.cpp, STABLEDIFFUSION_GGML_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, mudler/go-piper, PIPER_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, mudler/go-stable-diffusion, STABLEDIFFUSION_VERSION) (push) Has been cancelled
generate and publish GRPC docker caches / generate_caches (ubuntu:22.04, linux/amd64,linux/arm64, arc-runner-set) (push) Has been cancelled
* fix(embed): use go-rice for large backend assets
Golang embed FS has a hard limit that we might exceed when providing
many binary alternatives.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* simplify golang deps
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* chore(tests): switch to testcontainers and print logs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix(tests): do not build a test binary
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* small fixup
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2025-05-04 16:42:42 +02:00
5c6cd50ed6
feat(llama.cpp): estimate vram usage ( #5299 )
...
build container images / self-hosted-jobs (-aio-gpu-intel-f16, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, extras, latest-gpu-intel-f16, latest-aio-gpu-intel-f16, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f16) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-intel-f32, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, extras, latest-gpu-intel-f32, latest-aio-gpu-intel-f32, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f32) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-11, ubuntu:22.04, cublas, 11, 7, true, extras, latest-gpu-nvidia-cuda-11, latest-aio-gpu-nvidia-cuda-11, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda11) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-12, ubuntu:22.04, cublas, 12, 0, true, extras, latest-gpu-nvidia-cuda-12, latest-aio-gpu-nvidia-cuda-12, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda12) (push) Has been cancelled
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, core, latest-gpu-intel-f16-core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-core) (push) Has been cancelled
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, core, latest-gpu-intel-f32-core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-core) (push) Has been cancelled
build container images / self-hosted-jobs (ubuntu:22.04, , true, extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, ) (push) Has been cancelled
build container images / core-image-build (-aio-cpu, ubuntu:22.04, , true, core, latest-cpu, latest-aio-cpu, --jobs=4 --output-sync=target, linux/amd64,linux/arm64, arc-runner-set, false, auto, -core) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, true, core, latest-gpu-nvidia-cuda-12-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-core) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, true, core, latest-gpu-nvidia-cuda-12-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-core) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, vulkan, true, core, latest-gpu-vulkan-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -vulkan-core) (push) Has been cancelled
build container images / gh-runner (nvcr.io/nvidia/l4t-jetpack:r36.4.0, cublas, 12, 0, true, core, latest-nvidia-l4t-arm64-core, --jobs=4 --output-sync=target, linux/arm64, ubuntu-24.04-arm, true, false, -nvidia-l4t-arm64-core) (push) Has been cancelled
Security Scan / tests (push) Has been cancelled
Tests extras backends / tests-transformers (push) Has been cancelled
Tests extras backends / tests-rerankers (push) Has been cancelled
Tests extras backends / tests-diffusers (push) Has been cancelled
Tests extras backends / tests-coqui (push) Has been cancelled
tests / tests-linux (1.21.x) (push) Has been cancelled
tests / tests-aio-container (push) Has been cancelled
tests / tests-apple (1.21.x) (push) Has been cancelled
Update swagger / swagger (push) Has been cancelled
Check if checksums are up-to-date / checksum_check (push) Has been cancelled
Bump dependencies / bump (mudler/LocalAI) (push) Has been cancelled
Bump dependencies / bump (main, PABannier/bark.cpp, BARKCPP_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, ggerganov/whisper.cpp, WHISPER_CPP_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, ggml-org/llama.cpp, CPPLLAMA_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, leejet/stable-diffusion.cpp, STABLEDIFFUSION_GGML_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, mudler/go-piper, PIPER_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, mudler/go-stable-diffusion, STABLEDIFFUSION_VERSION) (push) Has been cancelled
generate and publish GRPC docker caches / generate_caches (ubuntu:22.04, linux/amd64,linux/arm64, arc-runner-set) (push) Has been cancelled
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2025-05-02 17:40:26 +02:00
b6e3dc5f02
docs: update docs for DisableWebUI flag ( #5256 )
...
build container images / hipblas-jobs (rocm/dev-ubuntu-22.04:6.1, hipblas, true, ubuntu:22.04, core, latest-gpu-hipblas-core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -hipblas-core) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-intel-f16, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, extras, latest-gpu-intel-f16, latest-aio-gpu-intel-f16, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f16) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-intel-f32, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, extras, latest-gpu-intel-f32, latest-aio-gpu-intel-f32, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f32) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-11, ubuntu:22.04, cublas, 11, 7, true, extras, latest-gpu-nvidia-cuda-11, latest-aio-gpu-nvidia-cuda-11, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda11) (push) Has been cancelled
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-12, ubuntu:22.04, cublas, 12, 0, true, extras, latest-gpu-nvidia-cuda-12, latest-aio-gpu-nvidia-cuda-12, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda12) (push) Has been cancelled
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, core, latest-gpu-intel-f16-core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-core) (push) Has been cancelled
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, core, latest-gpu-intel-f32-core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-core) (push) Has been cancelled
build container images / self-hosted-jobs (ubuntu:22.04, , true, extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, ) (push) Has been cancelled
build container images / core-image-build (-aio-cpu, ubuntu:22.04, , true, core, latest-cpu, latest-aio-cpu, --jobs=4 --output-sync=target, linux/amd64,linux/arm64, arc-runner-set, false, auto, -core) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, true, core, latest-gpu-nvidia-cuda-12-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-core) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, true, core, latest-gpu-nvidia-cuda-12-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-core) (push) Has been cancelled
build container images / core-image-build (ubuntu:22.04, vulkan, true, core, latest-gpu-vulkan-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -vulkan-core) (push) Has been cancelled
build container images / gh-runner (nvcr.io/nvidia/l4t-jetpack:r36.4.0, cublas, 12, 0, true, core, latest-nvidia-l4t-arm64-core, --jobs=4 --output-sync=target, linux/arm64, ubuntu-24.04-arm, true, false, -nvidia-l4t-arm64-core) (push) Has been cancelled
Security Scan / tests (push) Has been cancelled
Tests extras backends / tests-transformers (push) Has been cancelled
Tests extras backends / tests-rerankers (push) Has been cancelled
Tests extras backends / tests-diffusers (push) Has been cancelled
Tests extras backends / tests-coqui (push) Has been cancelled
tests / tests-linux (1.21.x) (push) Has been cancelled
tests / tests-aio-container (push) Has been cancelled
tests / tests-apple (1.21.x) (push) Has been cancelled
Update swagger / swagger (push) Has been cancelled
Check if checksums are up-to-date / checksum_check (push) Has been cancelled
Bump dependencies / bump (mudler/LocalAI) (push) Has been cancelled
Bump dependencies / bump (main, PABannier/bark.cpp, BARKCPP_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, ggerganov/whisper.cpp, WHISPER_CPP_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, ggml-org/llama.cpp, CPPLLAMA_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, leejet/stable-diffusion.cpp, STABLEDIFFUSION_GGML_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, mudler/go-piper, PIPER_VERSION) (push) Has been cancelled
Bump dependencies / bump (master, mudler/go-stable-diffusion, STABLEDIFFUSION_VERSION) (push) Has been cancelled
Signed-off-by: Mohit Gaur <56885276+Mohit-Gaur@users.noreply.github.com >
2025-04-27 16:02:02 +02:00
2c9279a542
feat(video-gen): add endpoint for video generation ( #5247 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2025-04-26 18:05:01 +02:00
2c425e9c69
feat(loader): enhance single active backend by treating as singleton ( #5107 )
...
build container images / hipblas-jobs (rocm/dev-ubuntu-22.04:6.1, hipblas, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -hipblas-core) (push) Waiting to run
build container images / hipblas-jobs (rocm/dev-ubuntu-22.04:6.1, hipblas, false, ubuntu:22.04, extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -hipblas) (push) Waiting to run
build container images / hipblas-jobs (rocm/dev-ubuntu-22.04:6.1, hipblas, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -hipblas-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-intel-f16, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, extras, latest-gpu-intel-f16, latest-aio-gpu-intel-f16, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f16-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-intel-f32, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, extras, latest-gpu-intel-f32, latest-aio-gpu-intel-f32, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f32-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-11, ubuntu:22.04, cublas, 11, 7, true, extras, latest-gpu-nvidia-cuda-11, latest-aio-gpu-nvidia-cuda-11, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda11-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-12, ubuntu:22.04, cublas, 12, 0, true, extras, latest-gpu-nvidia-cuda-12, latest-aio-gpu-nvidia-cuda-12, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda12-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, , , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, ) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, , true, extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, cublas, 11, 7, , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda11) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, cublas, 12, 0, , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda12) (push) Waiting to run
build container images / core-image-build (-aio-cpu, ubuntu:22.04, , true, core, latest-cpu, latest-aio-cpu, --jobs=4 --output-sync=target, linux/amd64,linux/arm64, arc-runner-set, false, auto, -ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, , core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, true, core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, , core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, true, core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, vulkan, true, core, latest-vulkan-ffmpeg-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -vulkan-ffmpeg-core) (push) Waiting to run
build container images / gh-runner (nvcr.io/nvidia/l4t-jetpack:r36.4.0, cublas, 12, 0, true, core, latest-nvidia-l4t-arm64-core, --jobs=4 --output-sync=target, linux/arm64, ubuntu-24.04-arm, true, false, -nvidia-l4t-arm64-core) (push) Waiting to run
Security Scan / tests (push) Waiting to run
Tests extras backends / tests-transformers (push) Waiting to run
Tests extras backends / tests-rerankers (push) Waiting to run
Tests extras backends / tests-diffusers (push) Waiting to run
Tests extras backends / tests-coqui (push) Waiting to run
tests / tests-linux (1.21.x) (push) Waiting to run
tests / tests-aio-container (push) Waiting to run
tests / tests-apple (1.21.x) (push) Waiting to run
feat(loader): enhance single active backend by treating at singleton
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2025-04-01 20:58:11 +02:00
9c74d74f7b
feat(gguf): guess default context size from file ( #5089 )
...
build container images / hipblas-jobs (rocm/dev-ubuntu-22.04:6.1, hipblas, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -hipblas-core) (push) Waiting to run
build container images / hipblas-jobs (rocm/dev-ubuntu-22.04:6.1, hipblas, false, ubuntu:22.04, extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -hipblas) (push) Waiting to run
build container images / hipblas-jobs (rocm/dev-ubuntu-22.04:6.1, hipblas, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -hipblas-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-intel-f16, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, extras, latest-gpu-intel-f16, latest-aio-gpu-intel-f16, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f16-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-intel-f32, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, extras, latest-gpu-intel-f32, latest-aio-gpu-intel-f32, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f32-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-11, ubuntu:22.04, cublas, 11, 7, true, extras, latest-gpu-nvidia-cuda-11, latest-aio-gpu-nvidia-cuda-11, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda11-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-12, ubuntu:22.04, cublas, 12, 0, true, extras, latest-gpu-nvidia-cuda-12, latest-aio-gpu-nvidia-cuda-12, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda12-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, , , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, ) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, , true, extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, cublas, 11, 7, , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda11) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, cublas, 12, 0, , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda12) (push) Waiting to run
build container images / core-image-build (-aio-cpu, ubuntu:22.04, , true, core, latest-cpu, latest-aio-cpu, --jobs=4 --output-sync=target, linux/amd64,linux/arm64, arc-runner-set, false, auto, -ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, , core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, true, core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, , core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, true, core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, vulkan, true, core, latest-vulkan-ffmpeg-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -vulkan-ffmpeg-core) (push) Waiting to run
build container images / gh-runner (nvcr.io/nvidia/l4t-jetpack:r36.4.0, cublas, 12, 0, true, core, latest-nvidia-l4t-arm64-core, --jobs=4 --output-sync=target, linux/arm64, ubuntu-24.04-arm, true, false, -nvidia-l4t-arm64-core) (push) Waiting to run
Security Scan / tests (push) Waiting to run
Tests extras backends / tests-transformers (push) Waiting to run
Tests extras backends / tests-rerankers (push) Waiting to run
Tests extras backends / tests-diffusers (push) Waiting to run
Tests extras backends / tests-coqui (push) Waiting to run
tests / tests-linux (1.21.x) (push) Waiting to run
tests / tests-aio-container (push) Waiting to run
tests / tests-apple (1.21.x) (push) Waiting to run
feat(gguf): guess default config file from files
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2025-03-29 14:42:14 +01:00
3cddf24747
feat: Centralized Request Processing middleware ( #3847 )
...
* squash past, centralize request middleware PR
Signed-off-by: Dave Lee <dave@gray101.com >
* migrate bruno request files to examples repo
Signed-off-by: Dave Lee <dave@gray101.com >
* fix
Signed-off-by: Dave Lee <dave@gray101.com >
* Update tests/e2e-aio/e2e_test.go
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
---------
Signed-off-by: Dave Lee <dave@gray101.com >
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2025-02-10 12:06:16 +01:00
72e52c4f6a
chore: drop embedded models ( #4715 )
...
build container images / hipblas-jobs (rocm/dev-ubuntu-22.04:6.1, hipblas, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -hipblas-core) (push) Waiting to run
build container images / hipblas-jobs (rocm/dev-ubuntu-22.04:6.1, hipblas, false, ubuntu:22.04, extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -hipblas) (push) Waiting to run
build container images / hipblas-jobs (rocm/dev-ubuntu-22.04:6.1, hipblas, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -hipblas-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-intel-f16, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, extras, latest-gpu-intel-f16, latest-aio-gpu-intel-f16, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f16-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-intel-f32, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, extras, latest-gpu-intel-f32, latest-aio-gpu-intel-f32, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f32-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-11, ubuntu:22.04, cublas, 11, 7, true, extras, latest-gpu-nvidia-cuda-11, latest-aio-gpu-nvidia-cuda-11, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda11-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-12, ubuntu:22.04, cublas, 12, 0, true, extras, latest-gpu-nvidia-cuda-12, latest-aio-gpu-nvidia-cuda-12, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda12-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, , , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, ) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, , true, extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, cublas, 11, 7, , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda11) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, cublas, 12, 0, , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda12) (push) Waiting to run
build container images / core-image-build (-aio-cpu, ubuntu:22.04, , true, core, latest-cpu, latest-aio-cpu, --jobs=4 --output-sync=target, linux/amd64,linux/arm64, arc-runner-set, false, auto, -ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, , core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, true, core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, , core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, true, core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, vulkan, true, core, latest-vulkan-ffmpeg-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -vulkan-ffmpeg-core) (push) Waiting to run
build container images / gh-runner (nvcr.io/nvidia/l4t-jetpack:r36.4.0, cublas, 12, 0, true, core, latest-nvidia-l4t-arm64-core, --jobs=4 --output-sync=target, linux/arm64, ubuntu-24.04-arm, true, false, -nvidia-l4t-arm64-core) (push) Waiting to run
Security Scan / tests (push) Waiting to run
Tests extras backends / tests-transformers (push) Waiting to run
Tests extras backends / tests-rerankers (push) Waiting to run
Tests extras backends / tests-diffusers (push) Waiting to run
Tests extras backends / tests-coqui (push) Waiting to run
tests / tests-linux (1.21.x) (push) Waiting to run
tests / tests-aio-container (push) Waiting to run
tests / tests-apple (1.21.x) (push) Waiting to run
Since the remote gallery was introduced this is now completely
superseded by it. In order to keep the code clean and remove redudant
parts let's simplify the usage.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2025-01-30 00:03:01 +01:00
96306a39a0
chore(docs): extra-Usage and Machine-Tag docs ( #4627 )
...
build container images / self-hosted-jobs (-aio-gpu-intel-f16, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, extras, latest-gpu-intel-f16, latest-aio-gpu-intel-f16, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f16-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-intel-f32, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, extras, latest-gpu-intel-f32, latest-aio-gpu-intel-f32, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -sycl-f32-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-11, ubuntu:22.04, cublas, 11, 7, true, extras, latest-gpu-nvidia-cuda-11, latest-aio-gpu-nvidia-cuda-11, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda11-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-12, ubuntu:22.04, cublas, 12, 0, true, extras, latest-gpu-nvidia-cuda-12, latest-aio-gpu-nvidia-cuda-12, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -cublas-cuda12-ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, false, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-core) (push) Waiting to run
build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, core, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-ffmpeg-core) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, , , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, ) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, , true, extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -ffmpeg) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, cublas, 11, 7, , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda11) (push) Waiting to run
build container images / self-hosted-jobs (ubuntu:22.04, cublas, 12, 0, , extras, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda12) (push) Waiting to run
build container images / core-image-build (-aio-cpu, ubuntu:22.04, , true, core, latest-cpu, latest-aio-cpu, --jobs=4 --output-sync=target, linux/amd64,linux/arm64, arc-runner-set, false, auto, -ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, , core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, true, core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11-ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, , core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, true, core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12-ffmpeg-core) (push) Waiting to run
build container images / core-image-build (ubuntu:22.04, vulkan, true, core, latest-vulkan-ffmpeg-core, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -vulkan-ffmpeg-core) (push) Waiting to run
build container images / gh-runner (nvcr.io/nvidia/l4t-jetpack:r36.4.0, cublas, 12, 0, true, core, latest-nvidia-l4t-arm64-core, --jobs=4 --output-sync=target, linux/arm64, ubuntu-24.04-arm, true, false, -nvidia-l4t-arm64-core) (push) Waiting to run
Security Scan / tests (push) Waiting to run
Tests extras backends / tests-transformers (push) Waiting to run
Tests extras backends / tests-sentencetransformers (push) Waiting to run
Tests extras backends / tests-rerankers (push) Waiting to run
Tests extras backends / tests-diffusers (push) Waiting to run
Tests extras backends / tests-parler-tts (push) Waiting to run
Tests extras backends / tests-openvoice (push) Waiting to run
Tests extras backends / tests-coqui (push) Waiting to run
tests / tests-linux (1.21.x) (push) Waiting to run
tests / tests-aio-container (push) Waiting to run
tests / tests-apple (1.21.x) (push) Waiting to run
Rename LocalAI-Extra-Usage -> Extra-Usage, add MACHINE_TAG as cli flag option, add docs about extra-usage and machine-tag
Signed-off-by: mintyleaf <mintyleafdev@gmail.com >
2025-01-18 08:58:38 +01:00
96f8ec0402
feat: add machine tag and inference timings ( #4577 )
...
* Add machine tag option, add extraUsage option, grpc-server -> proto -> endpoint extraUsage data is broken for now
Signed-off-by: mintyleaf <mintyleafdev@gmail.com >
* remove redurant timing fields, fix not working timings output
Signed-off-by: mintyleaf <mintyleafdev@gmail.com >
* use middleware for Machine-Tag only if tag is specified
Signed-off-by: mintyleaf <mintyleafdev@gmail.com >
---------
Signed-off-by: mintyleaf <mintyleafdev@gmail.com >
2025-01-17 17:05:58 +01:00
cea5a0ea42
feat(template): read jinja templates from gguf files ( #4332 )
...
* Read jinja templates as fallback
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Move templating out of model loader
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Test TemplateMessages
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Set role and content from transformers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Tests: be more flexible
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* More jinja
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Small refactoring and adaptations
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-12-08 13:50:33 +01:00
f028ee8a26
fix(p2p): parse correctly ExtraLLamaCPPArgs ( #4220 )
...
Previously we were sensible when args aren't defined and we would clash
parsing extra args.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-11-21 15:17:48 +01:00
8737a65760
feat: allow to disable '/metrics' endpoints for local stats ( #3945 )
...
Seem the "/metrics" endpoint that is source of confusion as people tends
to believe we collect telemetry data just because we import
"opentelemetry", however it is still a good idea to allow to disable
even local metrics if not really required.
See also: https://github.com/mudler/LocalAI/issues/3942
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-10-23 15:34:32 +02:00
0965c6cd68
feat: track internally started models by ID ( #3693 )
...
* chore(refactor): track internally started models by ID
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Just extend options, no need to copy
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Improve debugging for rerankers failures
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Simplify model loading with rerankers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Be more consistent when generating model options
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Uncommitted code
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Make deleteProcess more idiomatic
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Adapt CLI for sound generation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Fixup threads definition
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Handle corner case where c.Seed is nil
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Consistently use ModelOptions
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Adapt new code to refactoring
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
Co-authored-by: Dave <dave@gray101.com >
2024-10-02 08:55:58 +02:00
307a835199
groundwork: ListModels Filtering Upgrade ( #2773 )
...
* seperate the filtering from the middleware changes
---------
Signed-off-by: Dave Lee <dave@gray101.com >
2024-10-01 18:55:46 +00:00
ee21b00a8d
feat: auto load into memory on startup ( #3627 )
...
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
2024-09-22 10:03:30 +02:00
db1159b651
feat: auth v2 - supersedes #2894 ( #3476 )
...
feat: auth v2 - supercedes #2894 , metrics to follow later
Signed-off-by: Dave Lee <dave@gray101.com >
2024-09-16 23:29:07 -04:00
11d960b2a6
chore(cli): be consistent between workers and expose ExtraLLamaCPPArgs to both ( #3428 )
...
* chore(cli): be consistent between workers and expose ExtraLLamaCPPArgs to both
Fixes: https://github.com/mudler/LocalAI/issues/3427
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* bump grpcio
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-30 00:10:17 +02:00
ce827139bb
fix(p2p): correctly allow to pass extra args to llama.cpp ( #3368 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-24 10:30:24 +02:00
81ae92f017
feat: elevenlabs sound-generation
api ( #3355 )
...
* initial version of elevenlabs compatible soundgeneration api and cli command
Signed-off-by: Dave Lee <dave@gray101.com >
* minor cleanup
Signed-off-by: Dave Lee <dave@gray101.com >
* restore TTS, add test
Signed-off-by: Dave Lee <dave@gray101.com >
* remove stray s
Signed-off-by: Dave Lee <dave@gray101.com >
* fix
Signed-off-by: Dave Lee <dave@gray101.com >
---------
Signed-off-by: Dave Lee <dave@gray101.com >
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-08-24 00:20:28 +00:00
023ce59d44
feat(p2p): allow to set intervals ( #3353 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-21 18:23:51 +02:00
7822d944b5
chore(p2p): single-node when sharing federated instance ( #3354 )
...
* chore(p2p): single-node when sharing federated instance
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* chore: refactor out and extract into functions
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-21 18:23:42 +02:00
af095204fa
fix(p2p): avoid starting the node twice ( #3349 )
...
* fix(p2p): avoid starting the node twice
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix(p2p): keep exposing service if we don't start the llama.cpp runner
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-21 10:30:56 +02:00
2669f4738a
fix(p2p): re-use p2p host when running federated mode ( #3341 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-20 20:14:17 +02:00
27b03a52f3
fix(p2p): allocate tunnels only when needed ( #3259 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-17 15:03:55 +02:00
7278bf3de8
chore: allow to disable gallery endpoints, improve p2p connection handling ( #3256 )
...
* Add more debug messages
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat: allow to disable gallery endpoints
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* improve p2p messaging
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* improve error handling
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Make sure to close the listening socket when context is exhausted
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-17 08:28:52 +02:00
02de274e00
feat(federated): allow to pickup a specific worker, improve loadbalancing ( #3243 )
...
* feat(explorer): allow to specify a worker target
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(explorer): correctly load balance requests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(explorer): mark load balanced by default
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix: make sure to delete tunnels that might not exist anymore
If a worker goes off and on might change tunnel address, and we want to
load balance only on the active tunnels.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-13 16:17:18 +02:00
9729d2ae37
feat(explorer): make possible to run sync in a separate process ( #3224 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-12 19:25:44 +02:00
8627bc2dd4
feat(explorer): relax token deletion with error threshold ( #3211 )
...
feat(explorer): relax token deletion with error threashold
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-08-10 20:50:57 +02:00
9e3e892ac7
feat(p2p): add network explorer and community pools ( #3125 )
...
* WIP
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Wire up a simple explorer DB
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* wip
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* WIP
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* refactor: group services id so can be identified easily in the ledger table
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(discovery): discovery service now gather worker informations correctly
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(explorer): display network token
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(explorer): display form to add new networks
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(explorer): stop from overwriting networks
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(explorer): display only networks with active workers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(explorer): list only clusters in a network if it has online workers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* remove invalid and inactive networks
if networks have no workers delete them from the database, similarly,
if invalid.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* ci: add workflow to deploy new explorer versions automatically
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* build-api: build with p2p tag
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Allow to specify a connection timeout
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* logging
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Better p2p defaults
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Set loglevel
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Fix dht enable
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Default to info for loglevel
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Add navbar
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Slightly improve rendering
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Allow to copy the token easily
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* ci fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-09 20:12:01 +02:00
8814b31805
chore: drop gpt4all.cpp ( #3106 )
...
chore: drop gpt4all
gpt4all is already supported in llama.cpp - the backend was kept for
keeping compatibility with old gpt4all models (prior to gguf format).
It is good time now to clean up and remove it to slim the compilation
process.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-07 23:35:55 +02:00
36e185ba63
feat(p2p): allow to run multiple clusters in the same p2p network ( #3128 )
...
feat(p2p): allow to run multiple clusters in the same network
Allow to specify a network ID via CLI which allows to run multiple
clusters, logically separated within the same network (by using the same
shared token).
Note: This segregation is not "secure" by any means, anyone having the
network token can see the services available in all the network,
however, this provides a way to separate the inference endpoints.
This allows for instance to have a node which is both federated and
having attached a set of llama.cpp workers.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-07 23:35:44 +02:00
a36b721ca6
fix: be consistent in downloading files, check for scanner errors ( #3108 )
...
* fix(downloader): be consistent in downloading files
This PR puts some order in the downloader such as functions are re-used
across several places.
This fixes an issue with having uri's inside the model YAML file, it
would resolve to MD5 rather then using the filename
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix(scanner): do raise error only if unsafeFiles are found
Fixes: https://github.com/mudler/LocalAI/issues/3114
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-02 20:06:25 +02:00
252961751c
feat(federation): add load balanced option ( #2915 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-07-18 23:18:53 +02:00
24a8eebcef
refactor: move federated server logic to its own service ( #2914 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-07-18 19:15:15 +02:00
c7357a9872
fix: short-circuit when nodes aren't detected ( #2909 )
...
Fixes:
```
panic: invalid argument to IntN
goroutine 401 [running]:
math/rand/v2.(*Rand).IntN(...)
/home/mudler/_git/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.4.linux-amd64/src/math/rand/v2/rand.go:190
math/rand/v2.IntN(...)
/home/mudler/_git/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.4.linux-amd64/src/math/rand/v2/rand.go:307
github.com/mudler/LocalAI/core/cli.Proxy.func2()
/home/mudler/_git/LocalAI/core/cli/federated.go:104 +0x76e
created by github.com/mudler/LocalAI/core/cli.Proxy in goroutine 1
/home/mudler/_git/LocalAI/core/cli/federated.go:91 +0x3c5
```
When no nodes are found and something is trying to hit the federated
endpoint (and no tunnels are ready yet).
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-07-18 14:44:31 +02:00
133987b1fb
feat: HF /scan
endpoint ( #2566 )
...
* start by checking /scan during the checksum update
Signed-off-by: Dave Lee <dave@gray101.com >
* add back in golang side features: downloader/uri gets struct and scan function, gallery uses it, and secscan/models calls it.
Signed-off-by: Dave Lee <dave@gray101.com >
* add a param to scan specific urls - useful for debugging
Signed-off-by: Dave Lee <dave@gray101.com >
* helpful printouts
Signed-off-by: Dave Lee <dave@gray101.com >
* fix offsets
Signed-off-by: Dave Lee <dave@gray101.com >
* fix error and naming
Signed-off-by: Dave Lee <dave@gray101.com >
* expose error
Signed-off-by: Dave Lee <dave@gray101.com >
* fix json tags
Signed-off-by: Dave Lee <dave@gray101.com >
* slight wording change
Signed-off-by: Dave Lee <dave@gray101.com >
* go mod tidy - getting warnings
Signed-off-by: Dave Lee <dave@gray101.com >
* split out python to make editing easier, add some simple code to delete contaminated entries from gallery
Signed-off-by: Dave Lee <dave@gray101.com >
* o7 to my favorite part of our old name, go-skynet
Signed-off-by: Dave Lee <dave@gray101.com >
* merge fix
Signed-off-by: Dave Lee <dave@gray101.com >
* merge fix
Signed-off-by: Dave Lee <dave@gray101.com >
* merge fix
Signed-off-by: Dave Lee <dave@gray101.com >
* address review comments
Signed-off-by: Dave Lee <dave@gray101.com >
* forgot secscan could accept multiple URL at once
Signed-off-by: Dave Lee <dave@gray101.com >
* invert naming and actually use it
Signed-off-by: Dave Lee <dave@gray101.com >
* missed cli/models.go
Signed-off-by: Dave Lee <dave@gray101.com >
* Update .github/check_and_update.py
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
Signed-off-by: Dave <dave@gray101.com >
---------
Signed-off-by: Dave Lee <dave@gray101.com >
Signed-off-by: Dave <dave@gray101.com >
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-07-10 13:18:32 +02:00
cca881ec49
feat(p2p): Federation and AI swarms ( #2723 )
...
* Wip p2p enhancements
* get online state
* Pass-by token to show in the dashboard
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Style
* Minor fixups
* parametrize SearchID
* Refactoring
* Allow to expose/bind more services
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Add federation
* Display federated mode in the WebUI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Small fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* make federated nodes visible from the WebUI
* Fix version display
* improve web page
* live page update
* visual enhancements
* enhancements
* visual enhancements
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-07-08 22:04:06 +02:00
f072cb3cd0
fix(cli): remove duplicate alias ( #2654 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-06-25 10:08:13 +02:00
03b1cf51fd
feat(whisper): add translate option ( #2649 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-06-24 19:21:22 +02:00
a181dd0ebc
refactor: gallery inconsistencies ( #2647 )
...
* refactor(gallery): move under core/
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix(unarchive): do not allow symlinks
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-06-24 17:32:12 +02:00
5866fc8ded
chore: fix go.mod module ( #2635 )
...
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
2024-06-23 08:24:36 +00:00
8d84dd4f88
fix(worker): use dynaload for single binaries ( #2620 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-06-22 09:33:18 +02:00
f569237a50
feat(oci): support OCI images and Ollama models ( #2628 )
...
* Support specifying oci:// and ollama:// for model URLs
Fixes: https://github.com/mudler/LocalAI/issues/2527
Fixes: https://github.com/mudler/LocalAI/issues/1028
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Lower watcher warnings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Allow to install ollama models from CLI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fixup tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Do not keep file ownership
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Skip test on darwin
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-06-22 08:17:41 +02:00
89a11e15e7
fix(single-binary): bundle ld.so ( #2602 )
...
* debug
* fix copy command/silly muscle memory
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* remove tmate
* Debugging
* Start binary with ld.so if present in libdir
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* small refactor
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-06-18 22:43:43 +02:00
94cfaad7f4
feat(libpath): refactor and expose functions for external library paths ( #2578 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-06-16 13:58:28 +02:00
7b205510f9
feat(gallery): uniform download from CLI ( #2559 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-06-13 16:12:46 +02:00
882556d4db
feat(gallery): show available models in website, allow local-ai models install
to install from galleries ( #2555 )
...
* WIP
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* gen a static page instead (we force DNS redirects to it)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(gallery): install models from CLI, unify install
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Uniform graphic of model page
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Makefile: update targets
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Slightly enhance gallery view
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-06-13 00:47:16 +02:00
14b41be057
feat(detection): detect by template in gguf file, add qwen2, phi, mistral and chatml ( #2536 )
...
feat(detection): detect by template in gguf file, add qwen and chatml
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-06-10 22:58:04 +02:00
d7e137295a
feat(util): add util command to print GGUF informations ( #2528 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-06-09 19:27:42 +02:00