2e1dc8deef
Fix Typos in Comments and Error Messages ( #5637 )
...
* Update initializers.go
Signed-off-by: kilavvy <140459108+kilavvy@users.noreply.github.com >
* Update base.go
Signed-off-by: kilavvy <140459108+kilavvy@users.noreply.github.com >
---------
Signed-off-by: kilavvy <140459108+kilavvy@users.noreply.github.com >
2025-06-12 18:34:32 +02:00
bf6426aef2
feat: Realtime API support reboot ( #5392 )
...
* feat(realtime): Initial Realtime API implementation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* chore: go mod tidy
Signed-off-by: Richard Palethorpe <io@richiejp.com >
* feat: Implement transcription only mode for realtime API
Reduce the scope of the Realtime API for the initial release and make transcription-only mode functional.
Signed-off-by: Richard Palethorpe <io@richiejp.com >
* chore(build): Build backends on a separate layer to speed up core only changes
Signed-off-by: Richard Palethorpe <io@richiejp.com >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
Signed-off-by: Richard Palethorpe <io@richiejp.com >
Co-authored-by: Ettore Di Giacinto <mudler@localai.io >
2025-05-25 22:25:05 +02:00
2c9279a542
feat(video-gen): add endpoint for video generation ( #5247 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2025-04-26 18:05:01 +02:00
8abecb4a18
chore: bump grpc limits to 50MB ( #5212 )
...
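For context, raising the gRPC message-size caps in Go typically looks like the sketch below (grpc-go defaults to 4MB per message; this is an illustration, not the actual LocalAI diff, and the package and function names are made up for the example):
```go
// Sketch: lift grpc-go's default 4MB message caps to 50MB on both the server
// and the client side (package and function names are illustrative).
package grpclimits

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

const maxMsgSize = 50 * 1024 * 1024 // 50MB

// NewServer returns a gRPC server that accepts and sends messages up to 50MB.
func NewServer() *grpc.Server {
	return grpc.NewServer(
		grpc.MaxRecvMsgSize(maxMsgSize),
		grpc.MaxSendMsgSize(maxMsgSize),
	)
}

// Dial connects to a local backend with matching 50MB per-call limits.
func Dial(address string) (*grpc.ClientConn, error) {
	return grpc.Dial(address,
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultCallOptions(
			grpc.MaxCallRecvMsgSize(maxMsgSize),
			grpc.MaxCallSendMsgSize(maxMsgSize),
		),
	)
}
```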
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2025-04-19 08:53:24 +02:00
2bc4b56a79
feat: stream tokens usage ( #4415 )
...
* Use pb.Reply instead of []byte with Reply.GetMessage() in the llama gRPC backend to get the proper usage data in reply streaming mode at the last [DONE] frame (see the sketch below)
* Fix 'hang' on empty message from the start
It seems the empty-message marker trick was unnecessary
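A self-contained sketch of the shape of that change, using stand-in types (the real pb.Reply is generated from LocalAI's backend.proto, so the field names here are assumptions):
```go
// Stand-in sketch: a streaming callback that receives the whole reply instead
// of a raw []byte, so the last frame before [DONE] can also carry token usage.
package main

import "fmt"

// Reply mirrors the fields relevant here; the real message is generated from
// backend.proto, so names are assumptions.
type Reply struct {
	Message      []byte
	Tokens       int32 // completion tokens counted so far
	PromptTokens int32 // prompt tokens
}

// TokenUsage accumulates usage reported by the backend while streaming.
type TokenUsage struct {
	Prompt     int
	Completion int
}

func onStreamReply(r *Reply, usage *TokenUsage) {
	fmt.Print(string(r.Message))
	if r.PromptTokens > 0 {
		usage.Prompt = int(r.PromptTokens)
	}
	if r.Tokens > 0 {
		usage.Completion = int(r.Tokens)
	}
}

func main() {
	usage := &TokenUsage{}
	onStreamReply(&Reply{Message: []byte("Hello"), Tokens: 1, PromptTokens: 5}, usage)
	onStreamReply(&Reply{Message: []byte(" world"), Tokens: 2, PromptTokens: 5}, usage)
	fmt.Printf("\nusage: prompt=%d completion=%d\n", usage.Prompt, usage.Completion)
}
```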
---------
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-12-18 09:48:50 +01:00
f943c4b803
Revert "feat: include tokens usage for streamed output" ( #4336 )
...
Revert "feat: include tokens usage for streamed output (#4282 )"
This reverts commit 0d6c3a7d57.
2024-12-08 17:53:36 +01:00
0d6c3a7d57
feat: include tokens usage for streamed output ( #4282 )
...
Use pb.Reply instead of []byte with Reply.GetMessage() in the llama gRPC backend to get the proper usage data in reply streaming mode at the last [DONE] frame
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-11-28 14:47:56 +01:00
b1ea9318e6
feat(silero): add Silero-vad backend ( #4204 )
...
* feat(vad): add silero-vad backend (WIP)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(vad): add API endpoint
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix(vad): correctly place the onnxruntime libs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* chore(vad): hook silero-vad to binary and container builds
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(gRPC): register VAD Server
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix(Makefile): consume ONNX_OS consistently
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix(Makefile): handle macOS
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-11-20 14:48:40 +01:00
092bb0bd6b
fix(base-grpc): close channel in base grpc server ( #3734 )
...
If the LLM does not implement any logic for PredictStream, we close the channel immediately so the process is not left hanging.
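A minimal sketch of the behaviour, with stand-in types rather than the actual pkg/grpc base server signatures:
```go
// Stand-in sketch: the base backend's default PredictStream closes the results
// channel right away, so callers do not block on a stream nobody writes to.
package main

import "fmt"

type Base struct{}

// PredictStream is the fallback for backends that do not implement streaming.
func (b *Base) PredictStream(prompt string, results chan string) error {
	close(results)
	return fmt.Errorf("PredictStream is not implemented for this backend")
}

func main() {
	results := make(chan string)
	if err := (&Base{}).PredictStream("hello", results); err != nil {
		fmt.Println("backend:", err)
	}
	// Without the close above, ranging over the channel would hang forever.
	for chunk := range results {
		fmt.Print(chunk)
	}
	fmt.Println("stream finished")
}
```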
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-10-05 15:14:27 +02:00
f84b55d1ef
feat: Add Get Token Metrics to GRPC server ( #3687 )
...
* Add Get Token Metrics to GRPC server
Signed-off-by: Siddharth More <siddimore@gmail.com >
* Expose LocalAI endpoint
Signed-off-by: Siddharth More <siddimore@gmail.com >
---------
Signed-off-by: Siddharth More <siddimore@gmail.com >
2024-10-01 14:41:20 +02:00
c2804c42fe
fix: untangle pkg/grpc and core/schema for Transcription ( #3419 )
...
untangle pkg/grpc and core/schema in Transcribe
Signed-off-by: Dave Lee <dave@gray101.com >
2024-09-02 15:48:53 +02:00
7f06954425
fix(model-loading): keep track of open GRPC Clients ( #3377 )
...
Due to a previous refactor we tied the client constructor to the model address; however, that was just a string which we used to rebuild the client each time.
With this change the loader returns a *Model which carries a constructor for the client and caches the client on the first connection.
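Conceptually, the loader change looks like the sketch below (illustrative names; the real *Model lives in LocalAI's model loader package and differs in detail):
```go
// Package model sketches the idea: keep the client constructor next to the
// model address and cache the client after the first connection
// (illustrative names, not the real pkg/model API).
package model

import "sync"

// Client is a stand-in for the backend gRPC client interface.
type Client interface {
	Predict(prompt string) (string, error)
}

type Model struct {
	address     string
	constructor func(address string) Client

	once   sync.Once
	client Client
}

func New(address string, constructor func(string) Client) *Model {
	return &Model{address: address, constructor: constructor}
}

// GRPC builds the client on first use and returns the cached instance on
// subsequent calls, instead of rebuilding it from the bare address each time.
func (m *Model) GRPC() Client {
	m.once.Do(func() { m.client = m.constructor(m.address) })
	return m.client
}
```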
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-08-25 14:36:09 +02:00
81ae92f017
feat: elevenlabs sound-generation api ( #3355 )
...
* initial version of elevenlabs-compatible sound-generation api and cli command
Signed-off-by: Dave Lee <dave@gray101.com >
* minor cleanup
Signed-off-by: Dave Lee <dave@gray101.com >
* restore TTS, add test
Signed-off-by: Dave Lee <dave@gray101.com >
* remove stray s
Signed-off-by: Dave Lee <dave@gray101.com >
* fix
Signed-off-by: Dave Lee <dave@gray101.com >
---------
Signed-off-by: Dave Lee <dave@gray101.com >
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-08-24 00:20:28 +00:00
5866fc8ded
chore: fix go.mod module ( #2635 )
...
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
2024-06-23 08:24:36 +00:00
c4f958e11b
refactor(application): introduce application global state ( #2072 )
...
* start breaking up the giant channel refactor now that it's better understood - easier-to-merge bites
Signed-off-by: Dave Lee <dave@gray101.com >
* add concurrency and base64 back in, along with new base64 tests.
Signed-off-by: Dave Lee <dave@gray101.com >
* Automatic rename of whisper.go's Result to TranscriptResult
Signed-off-by: Dave Lee <dave@gray101.com >
* remove pkg/concurrency - significant changes coming in split 2
Signed-off-by: Dave Lee <dave@gray101.com >
* fix comments
Signed-off-by: Dave Lee <dave@gray101.com >
* add list_model service as another low-risk service to get it out of the way
Signed-off-by: Dave Lee <dave@gray101.com >
* split backend config loader into a separate file from the actual config struct. No changes yet, just reduce cognitive load with smaller files of logical blocks
Signed-off-by: Dave Lee <dave@gray101.com >
* rename state.go ==> application.go
Signed-off-by: Dave Lee <dave@gray101.com >
* fix lost import?
Signed-off-by: Dave Lee <dave@gray101.com >
---------
Signed-off-by: Dave Lee <dave@gray101.com >
2024-04-29 17:42:37 +00:00
2cd4936c99
fix: security scanner warning noise: error handlers part 1 ( #2141 )
...
first group of error handlers to reduce security scanner warning noise level
Signed-off-by: Dave Lee <dave@gray101.com >
2024-04-26 10:34:31 +02:00
b664edde29
feat(rerankers): Add new backend, support jina rerankers API ( #2121 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-04-25 00:19:02 +02:00
af9e5a2d05
Revert #1963 ( #2056 )
...
* Revert "fix(fncall): fix regression introduced in #1963 (#2048 )"
This reverts commit 6b06d4e0af.
* Revert "fix: action-tmate back to upstream, dead code removal (#2038 )"
This reverts commit fdec8a9d00.
* Revert "feat(grpc): return consumed token count and update response accordingly (#2035 )"
This reverts commit e843d7df0e.
* Revert "refactor: backend/service split, channel-based llm flow (#1963 )"
This reverts commit eed5706994.
* feat(grpc): return consumed token count and update response accordingly
Fixes: #1920
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-04-17 23:33:49 +02:00
eed5706994
refactor: backend/service split, channel-based llm flow ( #1963 )
...
Refactor: channel based llm flow and services split
---------
Signed-off-by: Dave Lee <dave@gray101.com >
2024-04-13 09:45:34 +02:00
1981154f49
fix: dont commit generated files to git ( #1993 )
...
* fix: initial work towards not committing generated files to the repository
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: improve build docs
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: remove unused folder from .dockerignore and .gitignore
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: attempt to fix extra backend tests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: attempt to fix other tests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: more test fixes
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: fix apple tests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: more extras tests fixes
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: add GOBIN to PATH in docker build
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: extra tests and Dockerfile corrections
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: remove build dependency checks
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: add golang protobuf compilers to tests-linux action
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: ensure protogen is run for extra backend installs
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: use newer protobuf
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: more missing protoc binaries
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: missing dependencies during docker build
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: don't install grpc compilers in the final stage if they aren't needed
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: python-grpc-tools in 22.04 repos is too old
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: add a couple of extra build dependencies to Makefile
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: unbreak container rebuild functionality
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
---------
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
2024-04-13 09:37:32 +02:00
12c0d9443e
feat: use tokenizer.apply_chat_template() in vLLM ( #1990 )
...
Use tokenizer.apply_chat_template() in vLLM
Signed-off-by: Ludovic LEROUX <ludovic@inpher.io >
2024-04-11 19:20:22 +02:00
643d85d2cc
feat(stores): Vector store backend ( #1795 )
...
Add simple vector store backend
Signed-off-by: Richard Palethorpe <io@richiejp.com >
2024-03-22 21:14:04 +01:00
20136ca8b7
feat(tts): add Elevenlabs and OpenAI TTS compatibility layer ( #1834 )
...
* feat(elevenlabs): map elevenlabs API support to TTS
This allows elevenlabs clients to work automatically with LocalAI by supporting the elevenlabs API.
The elevenlabs server endpoint is implemented so that it is wired to the existing TTS endpoints (see the sketch after this list).
Fixes: https://github.com/mudler/LocalAI/issues/1809
* feat(openai/tts): compat layer with openai tts
Fixes: #1276
* fix: adapt tts CLI
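A minimal sketch of that wiring (LocalAI's HTTP layer actually uses Fiber; net/http and the route names below are stand-ins for illustration):
```go
// Stand-in sketch: expose an ElevenLabs-style route that reuses the handler
// already serving the native TTS endpoint, so ElevenLabs clients work unchanged.
package routes

import "net/http"

// RegisterTTS wires both paths to the same handler; the ElevenLabs path carries
// the voice ID in the URL, which the handler maps onto the configured TTS model.
func RegisterTTS(mux *http.ServeMux, tts http.HandlerFunc) {
	mux.HandleFunc("/tts", tts)                // native LocalAI endpoint
	mux.HandleFunc("/v1/text-to-speech/", tts) // ElevenLabs-compatible path
}
```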
2024-03-14 23:08:34 +01:00
939411300a
Bump vLLM version + more options when loading models in vLLM ( #1782 )
...
* Bump vLLM version to 0.3.2
* Add vLLM model loading options
* Remove transformers-exllama
* Fix install exllama
2024-03-01 22:48:53 +01:00
255748bcba
MQTT Startup Refactoring Part 1: core/ packages part 1 ( #1728 )
...
This PR specifically introduces a `core` folder and moves the following packages over, without any other changes:
- `api/backend`
- `api/config`
- `api/options`
- `api/schema`
Once this is merged and we confirm there are no regressions, I can migrate the remaining changes over piece by piece to split up application startup, backend services, HTTP, and MQTT, as was the goal of the earlier PRs!
2024-02-21 01:21:19 +00:00
cb7512734d
transformers: correctly load automodels ( #1643 )
...
* backends(transformers): use AutoModel with LLM types
* examples: animagine-xl
* Add codellama examples
2024-01-26 00:13:21 +01:00
d5d82ba344
feat(grpc): backend SPI pluggable in embedding mode ( #1621 )
...
* run server
* grpc backend embedded support
* backend providable
2024-01-23 08:56:36 +01:00
e19d7226f8
feat: more embedded models, coqui fixes, add model usage and description ( #1556 )
...
* feat: add model descriptions and usage
* remove default model gallery
* models: add embeddings and tts
* docs: update table
* docs: updates
* images: cleanup pip cache after install
* images: always run apt-get clean
* ux: improve gRPC connection errors
* ux: improve some messages
* fix: fix coqui when no AudioPath is passed by
* embedded: add more models
* Add usage
* Reorder table
2024-01-08 00:37:02 +01:00
db926896bd
Revert "[Refactor]: Core/API Split" ( #1550 )
...
Revert "[Refactor]: Core/API Split (#1506 )"
This reverts commit ab7b4d5ee9.
2024-01-05 18:04:46 +01:00
ab7b4d5ee9
[Refactor]: Core/API Split ( #1506 )
...
Refactors api folder to core, creates firm split between backend code and api frontend.
2024-01-05 15:34:56 +01:00
7641f92cde
feat(diffusers): update, add autopipeline, controlnet ( #1432 )
...
* feat(diffusers): update, add autopipeline, controlnet
* tests with AutoPipeline
* simplify logic
2023-12-13 19:20:22 +01:00
824612f1b4
feat: initial watchdog implementation ( #1341 )
...
* feat: initial watchdog implementation
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
* fixups
* Add more output
* wip: idletime checker
* wire idle watchdog checks
* enlarge watchdog time window
* small fixes
* Use stopmodel
* Always delete process
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-11-26 18:36:23 +01:00
548959b50f
feat: queue up requests if not running parallel requests ( #1296 )
...
Return a gRPC backend which handles a lock in case it is not meant to serve requests in parallel.
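Conceptually, the change amounts to something like this sketch (illustrative names, not the actual LocalAI wrapper):
```go
// Stand-in sketch: when parallel requests are disabled, put a mutex in front of
// the backend so concurrent calls queue up instead of overlapping.
package backend

import "sync"

type Client interface {
	Predict(prompt string) (string, error)
}

type lockedClient struct {
	mu     sync.Mutex
	client Client
}

// WrapIfNotParallel returns the client as-is when parallel requests are allowed,
// and a serializing wrapper otherwise.
func WrapIfNotParallel(c Client, parallelRequests bool) Client {
	if parallelRequests {
		return c
	}
	return &lockedClient{client: c}
}

func (l *lockedClient) Predict(prompt string) (string, error) {
	l.mu.Lock() // later callers block here, effectively queueing the requests
	defer l.mu.Unlock()
	return l.client.Predict(prompt)
}
```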
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-11-16 22:20:16 +01:00
ad0e30bca5
refactor: move backends into the backends directory ( #1279 )
...
* refactor: move backends into the backends directory
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* refactor: move main close to implementation for every backend
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-11-13 22:40:16 +01:00
803a0ac02a
feat(llama.cpp): support lora with scale and yarn ( #1277 )
...
* feat(llama.cpp): support lora with scale
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat(llama.cpp): support yarn
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-11-11 18:40:48 +01:00
0eae727366
🔥 add LLaVA support and GPT vision API, multiple requests for llama.cpp, return JSON types ( #1254 )
...
* wip
* wip
* Make it functional
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* wip
* Small fixups
* do not inject space on role encoding, encode img at beginning of messages
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Add examples/config defaults
* Add include dir of current source dir
* cleanup
* fixes
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fixups
* Revert "fixups"
This reverts commit f1a4731cca.
* fixes
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-11-11 13:14:59 +01:00
a28ab18987
feat(vllm): Allow to set quantization ( #1094 )
...
This is particularly useful to set AWQ quantization.
Follow up of #1015
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-09-22 15:52:38 +02:00
8ccf5b2044
feat(speculative-sampling): allow to specify a draft model in the model config ( #1052 )
...
**Description**
This PR fixes #1013.
It adds `draft_model` and `n_draft` to the model YAML config in order to
load models with speculative sampling. This should be compatible as well
with grammars.
example:
```yaml
backend: llama
context_size: 1024
name: my-model-name
parameters:
  model: foo-bar
n_draft: 16
draft_model: model-name
```
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-09-14 17:44:16 +02:00
dc307a1cc0
feat: add vall-e-x ( #1007 )
...
**Description**
This PR fixes #985
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-09-04 19:25:23 +02:00
44bc7aa3d0
feat: Allow to load lora adapters for llama.cpp ( #955 )
...
**Description**
This PR fixes #
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-08-25 21:58:46 +02:00
901f0709c5
Feat: rwkv improvements ( #937 )
2023-08-22 18:48:06 +02:00
cc060a283d
fix: drop racy code, refactor and group API schema ( #931 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-08-20 14:04:45 +02:00
afdc0ebfd7
feat: add --single-active-backend to allow only one backend active at the time ( #925 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-08-19 01:49:33 +02:00
1079b18ff7
feat(diffusers): be consistent with pipelines, support also depthimg2img ( #926 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-08-18 22:06:24 +02:00
8cb1061c11
Usage Features ( #863 )
2023-08-18 21:23:14 +02:00
2bacd0180d
feat(diffusers): add img2img and clip_skip, support more kernels schedulers ( #906 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-08-17 23:38:59 +02:00
37700f2d98
feat(diffusers): add DPMSolverMultistepScheduler++, DPMSolverMultistepSchedulerSDE++, guidance_scale ( #903 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-08-16 01:11:42 +02:00
a96c3bc885
feat(diffusers): various enhancements ( #895 )
2023-08-14 23:12:00 +02:00
8c781a6a44
feat: Add Diffusers ( #874 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-08-09 08:38:51 +02:00
3c8fc37c56
feat: Add UseFastTokenizer
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2023-08-08 01:10:05 +02:00