Commit Graph

147 Commits

Author SHA1 Message Date
58ff47de26 feat(bark-cpp): add new bark.cpp backend (#4287)
* feat(bark-cpp): add new bark.cpp backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* build on linux only for now

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* track bark.cpp in CI bumps

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop old entries from bumper

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* No need to test rwkv specifically, now part of llama.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-11-28 22:16:44 +01:00
0d6c3a7d57 feat: include tokens usage for streamed output (#4282)
Use pb.Reply instead of []byte with Reply.GetMessage() in llama grpc to get the proper usage data in reply streaming mode at the last [DONE] frame

Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-11-28 14:47:56 +01:00
3c3050f68e feat(backends): Drop bert.cpp (#4272)
* feat(backends): Drop bert.cpp

use llama.cpp 3.2 as a drop-in replacement for bert.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(tests): make test more robust

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-11-27 16:34:28 +01:00
4f1ab2366d chore(refactor): imply modelpath (#4208)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-11-20 18:06:35 +01:00
b1ea9318e6 feat(silero): add Silero-vad backend (#4204)
* feat(vad): add silero-vad backend (WIP)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(vad): add API endpoint

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(vad): correctly place the onnxruntime libs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(vad): hook silero-vad to binary and container builds

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(gRPC): register VAD Server

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(Makefile): consume ONNX_OS consistently

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(Makefile): handle macOS

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-11-20 14:48:40 +01:00
de148cb2ad feat: add WebUI API token authorization (#4197)
* return 401 instead of 403, provide www-authenticate header, redirect to the login page, add cookie token support

* set cookies completely through js in auth page
2024-11-19 18:43:02 +01:00
1770b92fb6 chore(api): return values from schema (#4153)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-11-14 14:12:29 +01:00
2c041a2077 feat(ui): move model detailed info to a modal (#4086)
* feat(ui): move model detailed info to a modal

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: add static asset

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-11-06 18:25:59 +01:00
b425a870b0 fix(diffusers): correctly parse height and width request without parametrization (#4082)
* fix(diffusers): allow to specify width and height without enable-parameters

Let's simplify usage by not gating width and height by parameters

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: use sane defaults

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-11-06 08:53:02 +01:00
65c3df392c feat(tts): Implement naive response_format for tts endpoint (#4035)
Signed-off-by: n-Arno <arnaud.alcabas@gmail.com>
2024-11-02 19:13:35 +00:00
8f7045cfa6 chore(tests): bump timeouts (#4024)
To avoid flaky runs

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-10-31 15:40:43 +01:00
88edb1e2af chore(tests): expand timeout (#4019)
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-10-30 15:34:44 +01:00
546dce68a6 chore: change url to github repository (#3972)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-10-26 14:50:18 +02:00
8737a65760 feat: allow to disable '/metrics' endpoints for local stats (#3945)
Seem the "/metrics" endpoint that is source of confusion as people tends
to believe we collect telemetry data just because we import
"opentelemetry", however it is still a good idea to allow to disable
even local metrics if not really required.

See also: https://github.com/mudler/LocalAI/issues/3942

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-10-23 15:34:32 +02:00
ccc7cb0287 feat(templates): use a single template for multimodals messages (#3892)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-10-22 09:34:05 +02:00
a1634b219a fix: roll out bluemonday Sanitize more widely (#3794)
* initial pass: roll out bluemonday sanitization more widely

Signed-off-by: Dave Lee <dave@gray101.com>

* add one additional sanitize - the overall modelslist used by the docs site

Signed-off-by: Dave Lee <dave@gray101.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
2024-10-12 09:45:47 +02:00
65ca754166 Fix: listmodelservice / welcome endpoint use LOOSE_ONLY (#3791)
* fix list model service and welcome

Signed-off-by: Dave Lee <dave@gray101.com>

* comment

Signed-off-by: Dave Lee <dave@gray101.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
2024-10-11 23:49:00 +02:00
a0f0505f0d fix(welcome): do not list model twice if we have a config (#3790)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-10-11 17:30:14 +02:00
d9b63fae7c chore(tests): improve rwkv tests and consume TEST_FLAKES (#3765)
chores(tests): improve rwkv tests and consume TEST_FLAKES

consistently use TEST_FLAKES and reduce flakiness of rwkv tests by being
case insensitive

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-10-08 09:24:19 +02:00
648ffdf449 feat(multimodal): allow to template placeholders (#3728)
feat(multimodal): allow to template image placeholders

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-10-04 18:32:29 +02:00
e5586e8781 chore: get model also from query (#3716)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-10-02 20:20:50 +02:00
5488fc3bc1 feat: tokenization endpoint (#3710)
endpoint to access the tokenizer

Signed-off-by: shraddhazpy <shraddha@shraddhafive.in>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Dave <dave@gray101.com>
2024-10-02 08:56:18 +02:00
0965c6cd68 feat: track internally started models by ID (#3693)
* chore(refactor): track internally started models by ID

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Just extend options, no need to copy

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Improve debugging for rerankers failures

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Simplify model loading with rerankers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Be more consistent when generating model options

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Uncommitted code

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Make deleteProcess more idiomatic

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Adapt CLI for sound generation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixup threads definition

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Handle corner case where c.Seed is nil

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Consistently use ModelOptions

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Adapt new code to refactoring

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Dave <dave@gray101.com>
2024-10-02 08:55:58 +02:00
307a835199 groundwork: ListModels Filtering Upgrade (#2773)
* seperate the filtering from the middleware changes

---------

Signed-off-by: Dave Lee <dave@gray101.com>
2024-10-01 18:55:46 +00:00
f84b55d1ef feat: Add Get Token Metrics to GRPC server (#3687)
* Add Get Token Metrics to GRPC server

Signed-off-by: Siddharth More <siddimore@gmail.com>

* Expose LocalAI endpoint

Signed-off-by: Siddharth More <siddimore@gmail.com>

---------

Signed-off-by: Siddharth More <siddimore@gmail.com>
2024-10-01 14:41:20 +02:00
50a3b54e34 feat(api): add correlationID to Track Chat requests (#3668)
* Add CorrelationID to chat request

Signed-off-by: Siddharth More <siddimore@gmail.com>

* remove get_token_metrics

Signed-off-by: Siddharth More <siddimore@gmail.com>

* Add CorrelationID to proto

Signed-off-by: Siddharth More <siddimore@gmail.com>

* fix correlation method name

Signed-off-by: Siddharth More <siddimore@gmail.com>

* Update core/http/endpoints/openai/chat.go

Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Siddharth More <siddimore@gmail.com>

* Update core/http/endpoints/openai/chat.go

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Siddharth More <siddimore@gmail.com>

---------

Signed-off-by: Siddharth More <siddimore@gmail.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-09-28 17:23:56 +02:00
a3d69872e3 feat(api): list loaded models in /system (#3661)
feat(api): list loaded models in /system

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-25 18:00:23 +02:00
0893d3cbbe fix(health): do not require auth for /healthz and /readyz (#3656)
* fix(health): do not require auth for /healthz and /readyz

Fixes: #3655

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Comment so I don’t forget

Adding a reminder here...

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Dave <dave@gray101.com>
2024-09-24 18:25:59 +00:00
90cacb9692 test: preliminary tests and merge fix for authv2 (#3584)
* add api key to existing app tests, add preliminary auth test

Signed-off-by: Dave Lee <dave@gray101.com>

* small fix, run test

Signed-off-by: Dave Lee <dave@gray101.com>

* status on non-opaque

Signed-off-by: Dave Lee <dave@gray101.com>

* tweak auth error

Signed-off-by: Dave Lee <dave@gray101.com>

* exp

Signed-off-by: Dave Lee <dave@gray101.com>

* quick fix on real laptop

Signed-off-by: Dave Lee <dave@gray101.com>

* add downloader version that allows providing an auth header

Signed-off-by: Dave Lee <dave@gray101.com>

* stash some devcontainer fixes during testing

Signed-off-by: Dave Lee <dave@gray101.com>

* s2

Signed-off-by: Dave Lee <dave@gray101.com>

* s

Signed-off-by: Dave Lee <dave@gray101.com>

* done with experiment

Signed-off-by: Dave Lee <dave@gray101.com>

* done with experiment

Signed-off-by: Dave Lee <dave@gray101.com>

* after merge fix

Signed-off-by: Dave Lee <dave@gray101.com>

* rename and fix

Signed-off-by: Dave Lee <dave@gray101.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-09-24 09:32:48 +02:00
191bc2e50a feat(api): allow to pass audios to backends (#3603)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-19 12:26:53 +02:00
fbb9facda4 feat(api): allow to pass videos to backends (#3601)
This prepares the API to receive videos as well for video understanding.

It works similarly to images, where the request should be in the form:

{
 "type": "video_url",
 "video_url": { "url": "url or base64 data" }
}

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-19 11:21:59 +02:00
a53392f919 chore(refactor): drop duplicated shutdown logics (#3589)
* chore(refactor): drop duplicated shutdown logics

- Handle locking in Shutdown and CheckModelIsLoaded in a more go-idiomatic way
- Drop duplicated code and re-organize shutdown code

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: drop leftover

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: improve logging and add missing locks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-17 16:51:40 +02:00
db1159b651 feat: auth v2 - supersedes #2894 (#3476)
feat: auth v2 - supercedes #2894, metrics to follow later

Signed-off-by: Dave Lee <dave@gray101.com>
2024-09-16 23:29:07 -04:00
cf747bcdec feat: extract output with regexes from LLMs (#3491)
* feat: extract output with regexes from LLMs

This changset adds `extract_regex` to the LLM config. It is a list of
regexes that can match output and will be used to re extract text from
the LLM output. This is particularly useful for LLMs which outputs final
results into tags.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add tests, enhance output in case of configuration error

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-13 13:27:36 +02:00
791c3ace72 feat: add endpoint to list system informations (#3449)
* feat: add endpoint to list system informations

For now, it lists the available backends, but can be expanded later on
to include more system informations (such as GPU devices detected, RAM,
threads configured, and so on so forth).

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* show also external backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* add test

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-05 20:44:30 +02:00
81ae92f017 feat: elevenlabs sound-generation api (#3355)
* initial version of elevenlabs compatible soundgeneration api and cli command

Signed-off-by: Dave Lee <dave@gray101.com>

* minor cleanup

Signed-off-by: Dave Lee <dave@gray101.com>

* restore TTS, add test

Signed-off-by: Dave Lee <dave@gray101.com>

* remove stray s

Signed-off-by: Dave Lee <dave@gray101.com>

* fix

Signed-off-by: Dave Lee <dave@gray101.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-08-24 00:20:28 +00:00
fbaae8528d fix(chat): re-generated uuid, created, and text on each request (#3359)
This was noticed by models returning content besides function calls.
Sadly we can't test that easily in the CI so it got unnoticed.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-22 10:56:05 +02:00
b510352393 chore(anime.js): drop unused (#3351)
* fix(anime.js): correctly set the static path

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop anime.js (unused)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-21 13:10:09 +02:00
0c84c7b1cc chore(ux): allow to create and drag dots in the animation (#3287)
Make the animation more interactive!

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-19 20:40:55 +02:00
73c9b3598d chore(p2p): make commands easier to copy-paste (#3273)
chore(p2p): make box easier to copy-paste

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-19 19:58:17 +02:00
13cb7960bd chore(ux): add animated header with anime.js in p2p sections (#3271)
feat(p2p): add animated header with anime.js

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-19 18:05:02 +02:00
1dbb3b8abc fix(gallery): be consistent and disable UI routes as well (#3262)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-18 09:26:29 +02:00
7278bf3de8 chore: allow to disable gallery endpoints, improve p2p connection handling (#3256)
* Add more debug messages

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: allow to disable gallery endpoints

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* improve p2p messaging

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* improve error handling

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Make sure to close the listening socket when context is exhausted

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-17 08:28:52 +02:00
3457acc48b chore(explorer): add join instructions (#3255)
* feat(explorer): give CLI instructions to join federated clusters

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* debug message

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-16 19:34:36 +02:00
c50e0edcb8 feat(gallery): lazy load images (#3246)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-14 12:53:42 +02:00
d6c4e751f2 feat(explorer): visual improvements (#3247)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-14 12:53:29 +02:00
9729d2ae37 feat(explorer): make possible to run sync in a separate process (#3224)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-12 19:25:44 +02:00
9e3e892ac7 feat(p2p): add network explorer and community pools (#3125)
* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Wire up a simple explorer DB

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* wip

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactor: group services id so can be identified easily in the ledger table

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(discovery): discovery service now gather worker informations correctly

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(explorer): display network token

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(explorer): display form to add new networks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(explorer): stop from overwriting networks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(explorer): display only networks with active workers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(explorer): list only clusters in a network if it has online workers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* remove invalid and inactive networks

if networks have no workers delete them from the database, similarly,
if invalid.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: add workflow to deploy new explorer versions automatically

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* build-api: build with p2p tag

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Allow to specify a connection timeout

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* logging

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Better p2p defaults

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Set loglevel

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix dht enable

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Default to info for loglevel

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add navbar

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Slightly improve rendering

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Allow to copy the token easily

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-09 20:12:01 +02:00
8814b31805 chore: drop gpt4all.cpp (#3106)
chore: drop gpt4all

gpt4all is already supported in llama.cpp - the backend was kept for
keeping compatibility with old gpt4all models (prior to gguf format).

It is good time now to clean up and remove it to slim the compilation
process.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-07 23:35:55 +02:00
36e185ba63 feat(p2p): allow to run multiple clusters in the same p2p network (#3128)
feat(p2p): allow to run multiple clusters in the same network

Allow to specify a network ID via CLI which allows to run multiple
clusters, logically separated within the same network (by using the same
shared token).

Note: This segregation is not "secure" by any means, anyone having the
network token can see the services available in all the network,
however, this provides a way to separate the inference endpoints.

This allows for instance to have a node which is both federated and
having attached a set of llama.cpp workers.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-07 23:35:44 +02:00