Commit Graph

111 Commits

Author SHA1 Message Date
f272605b95 more robust approach
Some checks failed
Security Scan / tests (push) Has been cancelled
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
9a0982066f WIP - improve start and end of speech detection
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
30e3c47598 Improve audio detection
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
90206830c1 WIP - to drop
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
7592984b64 Use template evaluator for preparing LLM prompt in wrapped mode
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
c526f05de5 Small adaptations
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
06e438d68b WIP
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
3dd1b300e9 wip
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
ebfe8dd119 gRPC client stubs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
136fbd25f5 wip(vad)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
59531562a6 Fix lock handling
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
9273395e38 Move to debug calls
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
0318434b17 Attach context for VAD
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
9614422713 chore(vad): try to hook vad to received data from the API (WIP)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
a3fd8caaa6 feat(vad): hook vad detection
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
1796a1713d chore: extract realtime models into two categories
One is anyToAny models that requires a VAD model, and one is
wrappedModel that requires as well VAD models along others in the
pipeline.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
4f69170273 feat: correctly detect when starting the vad server
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
60c99ddc50 refactor
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
b4fea58076 Load wrapper clients
Testing with:

```yaml
name: gpt-4o
pipeline:
 tts: voice-it-riccardo_fasol-x-low
 transcription: whisper-base-q5_1
 llm: llama-3.2-1b-instruct:q4_k_m
```

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
9e965033bb chore: simplify passing options to ModelOptions
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
f45d11c734 Add model interface to sessions
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
4ca7689f31 debug
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
dcb13a7e6f WIP
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
8f507c39c0 WIP
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-14 17:13:58 +01:00
8cc2d01caa feat(ui): path prefix support via HTTP header (#4497)
Makes the web app honour the `X-Forwarded-Prefix` HTTP request header that may be sent by a reverse-proxy in order to inform the app that its public routes contain a path prefix.
For instance this allows to serve the webapp via a reverse-proxy/ingress controller under a path prefix/sub path such as e.g. `/localai/` while still being able to use the regular LocalAI routes/paths without prefix when directly connecting to the LocalAI server.

Changes:
* Add new `StripPathPrefix` middleware to strip the path prefix (provided with the `X-Forwarded-Prefix` HTTP request header) from the request path prior to matching the HTTP route.
* Add a `BaseURL` utility function to build the base URL, honouring the `X-Forwarded-Prefix` HTTP request header.
* Generate the derived base URL into the HTML (`head.html` template) as `<base/>` tag.
* Make all webapp-internal URLs (within HTML+JS) relative in order to make the browser resolve them against the `<base/>` URL specified within each HTML page's header.
* Make font URLs within the CSS files relative to the CSS file.
* Generate redirect location URLs using the new `BaseURL` function.
* Use the new `BaseURL` function to generate absolute URLs within gallery JSON responses.

Closes #3095

TL;DR:
The header-based approach allows to move the path prefix configuration concern completely to the reverse-proxy/ingress as opposed to having to align the path prefix configuration between LocalAI, the reverse-proxy and potentially other internal LocalAI clients.
The gofiber swagger handler already supports path prefixes this way, see e2d9e9916d/swagger.go (L79)

Signed-off-by: Max Goltzsche <max.goltzsche@gmail.com>
2025-01-07 17:18:21 +01:00
f943c4b803 Revert "feat: include tokens usage for streamed output" (#4336)
Revert "feat: include tokens usage for streamed output (#4282)"

This reverts commit 0d6c3a7d57.
2024-12-08 17:53:36 +01:00
cea5a0ea42 feat(template): read jinja templates from gguf files (#4332)
* Read jinja templates as fallback

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Move templating out of model loader

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Test TemplateMessages

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Set role and content from transformers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Tests: be more flexible

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* More jinja

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small refactoring and adaptations

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-12-08 13:50:33 +01:00
0d6c3a7d57 feat: include tokens usage for streamed output (#4282)
Use pb.Reply instead of []byte with Reply.GetMessage() in llama grpc to get the proper usage data in reply streaming mode at the last [DONE] frame

Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-11-28 14:47:56 +01:00
4f1ab2366d chore(refactor): imply modelpath (#4208)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-11-20 18:06:35 +01:00
b1ea9318e6 feat(silero): add Silero-vad backend (#4204)
* feat(vad): add silero-vad backend (WIP)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(vad): add API endpoint

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(vad): correctly place the onnxruntime libs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(vad): hook silero-vad to binary and container builds

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(gRPC): register VAD Server

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(Makefile): consume ONNX_OS consistently

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(Makefile): handle macOS

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-11-20 14:48:40 +01:00
1770b92fb6 chore(api): return values from schema (#4153)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-11-14 14:12:29 +01:00
b425a870b0 fix(diffusers): correctly parse height and width request without parametrization (#4082)
* fix(diffusers): allow to specify width and height without enable-parameters

Let's simplify usage by not gating width and height by parameters

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: use sane defaults

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-11-06 08:53:02 +01:00
65c3df392c feat(tts): Implement naive response_format for tts endpoint (#4035)
Signed-off-by: n-Arno <arnaud.alcabas@gmail.com>
2024-11-02 19:13:35 +00:00
ccc7cb0287 feat(templates): use a single template for multimodals messages (#3892)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-10-22 09:34:05 +02:00
a1634b219a fix: roll out bluemonday Sanitize more widely (#3794)
* initial pass: roll out bluemonday sanitization more widely

Signed-off-by: Dave Lee <dave@gray101.com>

* add one additional sanitize - the overall modelslist used by the docs site

Signed-off-by: Dave Lee <dave@gray101.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
2024-10-12 09:45:47 +02:00
65ca754166 Fix: listmodelservice / welcome endpoint use LOOSE_ONLY (#3791)
* fix list model service and welcome

Signed-off-by: Dave Lee <dave@gray101.com>

* comment

Signed-off-by: Dave Lee <dave@gray101.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
2024-10-11 23:49:00 +02:00
a0f0505f0d fix(welcome): do not list model twice if we have a config (#3790)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-10-11 17:30:14 +02:00
648ffdf449 feat(multimodal): allow to template placeholders (#3728)
feat(multimodal): allow to template image placeholders

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-10-04 18:32:29 +02:00
5488fc3bc1 feat: tokenization endpoint (#3710)
endpoint to access the tokenizer

Signed-off-by: shraddhazpy <shraddha@shraddhafive.in>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Dave <dave@gray101.com>
2024-10-02 08:56:18 +02:00
0965c6cd68 feat: track internally started models by ID (#3693)
* chore(refactor): track internally started models by ID

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Just extend options, no need to copy

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Improve debugging for rerankers failures

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Simplify model loading with rerankers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Be more consistent when generating model options

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Uncommitted code

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Make deleteProcess more idiomatic

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Adapt CLI for sound generation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixup threads definition

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Handle corner case where c.Seed is nil

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Consistently use ModelOptions

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Adapt new code to refactoring

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Dave <dave@gray101.com>
2024-10-02 08:55:58 +02:00
307a835199 groundwork: ListModels Filtering Upgrade (#2773)
* seperate the filtering from the middleware changes

---------

Signed-off-by: Dave Lee <dave@gray101.com>
2024-10-01 18:55:46 +00:00
f84b55d1ef feat: Add Get Token Metrics to GRPC server (#3687)
* Add Get Token Metrics to GRPC server

Signed-off-by: Siddharth More <siddimore@gmail.com>

* Expose LocalAI endpoint

Signed-off-by: Siddharth More <siddimore@gmail.com>

---------

Signed-off-by: Siddharth More <siddimore@gmail.com>
2024-10-01 14:41:20 +02:00
50a3b54e34 feat(api): add correlationID to Track Chat requests (#3668)
* Add CorrelationID to chat request

Signed-off-by: Siddharth More <siddimore@gmail.com>

* remove get_token_metrics

Signed-off-by: Siddharth More <siddimore@gmail.com>

* Add CorrelationID to proto

Signed-off-by: Siddharth More <siddimore@gmail.com>

* fix correlation method name

Signed-off-by: Siddharth More <siddimore@gmail.com>

* Update core/http/endpoints/openai/chat.go

Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Siddharth More <siddimore@gmail.com>

* Update core/http/endpoints/openai/chat.go

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Siddharth More <siddimore@gmail.com>

---------

Signed-off-by: Siddharth More <siddimore@gmail.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-09-28 17:23:56 +02:00
a3d69872e3 feat(api): list loaded models in /system (#3661)
feat(api): list loaded models in /system

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-25 18:00:23 +02:00
191bc2e50a feat(api): allow to pass audios to backends (#3603)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-19 12:26:53 +02:00
fbb9facda4 feat(api): allow to pass videos to backends (#3601)
This prepares the API to receive videos as well for video understanding.

It works similarly to images, where the request should be in the form:

{
 "type": "video_url",
 "video_url": { "url": "url or base64 data" }
}

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-19 11:21:59 +02:00
cf747bcdec feat: extract output with regexes from LLMs (#3491)
* feat: extract output with regexes from LLMs

This changset adds `extract_regex` to the LLM config. It is a list of
regexes that can match output and will be used to re extract text from
the LLM output. This is particularly useful for LLMs which outputs final
results into tags.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add tests, enhance output in case of configuration error

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-13 13:27:36 +02:00
791c3ace72 feat: add endpoint to list system informations (#3449)
* feat: add endpoint to list system informations

For now, it lists the available backends, but can be expanded later on
to include more system informations (such as GPU devices detected, RAM,
threads configured, and so on so forth).

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* show also external backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* add test

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-05 20:44:30 +02:00
81ae92f017 feat: elevenlabs sound-generation api (#3355)
* initial version of elevenlabs compatible soundgeneration api and cli command

Signed-off-by: Dave Lee <dave@gray101.com>

* minor cleanup

Signed-off-by: Dave Lee <dave@gray101.com>

* restore TTS, add test

Signed-off-by: Dave Lee <dave@gray101.com>

* remove stray s

Signed-off-by: Dave Lee <dave@gray101.com>

* fix

Signed-off-by: Dave Lee <dave@gray101.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-08-24 00:20:28 +00:00
fbaae8528d fix(chat): re-generated uuid, created, and text on each request (#3359)
This was noticed by models returning content besides function calls.
Sadly we can't test that easily in the CI so it got unnoticed.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-22 10:56:05 +02:00