* Read jinja templates as fallback
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Move templating out of model loader
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Test TemplateMessages
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Set role and content from transformers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Tests: be more flexible
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* More jinja
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Small refactoring and adaptations
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Seem the "/metrics" endpoint that is source of confusion as people tends
to believe we collect telemetry data just because we import
"opentelemetry", however it is still a good idea to allow to disable
even local metrics if not really required.
See also: https://github.com/mudler/LocalAI/issues/3942
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
endpoint to access the tokenizer
Signed-off-by: shraddhazpy <shraddha@shraddhafive.in>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Dave <dave@gray101.com>
* fix(health): do not require auth for /healthz and /readyz
Fixes: #3655
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Comment so I don’t forget
Adding a reminder here...
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Dave <dave@gray101.com>
* chore(refactor): drop duplicated shutdown logics
- Handle locking in Shutdown and CheckModelIsLoaded in a more go-idiomatic way
- Drop duplicated code and re-organize shutdown code
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: drop leftover
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: improve logging and add missing locks
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: add endpoint to list system informations
For now, it lists the available backends, but can be expanded later on
to include more system informations (such as GPU devices detected, RAM,
threads configured, and so on so forth).
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* show also external backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* add test
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add more debug messages
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: allow to disable gallery endpoints
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* improve p2p messaging
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* improve error handling
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Make sure to close the listening socket when context is exhausted
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
feat(p2p): allow to run multiple clusters in the same network
Allow to specify a network ID via CLI which allows to run multiple
clusters, logically separated within the same network (by using the same
shared token).
Note: This segregation is not "secure" by any means, anyone having the
network token can see the services available in all the network,
however, this provides a way to separate the inference endpoints.
This allows for instance to have a node which is both federated and
having attached a set of llama.cpp workers.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Wip p2p enhancements
* get online state
* Pass-by token to show in the dashboard
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Style
* Minor fixups
* parametrize SearchID
* Refactoring
* Allow to expose/bind more services
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add federation
* Display federated mode in the WebUI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Small fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* make federated nodes visible from the WebUI
* Fix version display
* improve web page
* live page update
* visual enhancements
* enhancements
* visual enhancements
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* start breaking up the giant channel refactor now that it's better understood - easier to merge bites
Signed-off-by: Dave Lee <dave@gray101.com>
* add concurrency and base64 back in, along with new base64 tests.
Signed-off-by: Dave Lee <dave@gray101.com>
* Automatic rename of whisper.go's Result to TranscriptResult
Signed-off-by: Dave Lee <dave@gray101.com>
* remove pkg/concurrency - significant changes coming in split 2
Signed-off-by: Dave Lee <dave@gray101.com>
* fix comments
Signed-off-by: Dave Lee <dave@gray101.com>
* add list_model service as another low-risk service to get it out of the way
Signed-off-by: Dave Lee <dave@gray101.com>
* split backend config loader into seperate file from the actual config struct. No changes yet, just reduce cognative load with smaller files of logical blocks
Signed-off-by: Dave Lee <dave@gray101.com>
* rename state.go ==> application.go
Signed-off-by: Dave Lee <dave@gray101.com>
* fix lost import?
Signed-off-by: Dave Lee <dave@gray101.com>
---------
Signed-off-by: Dave Lee <dave@gray101.com>
* feat(gallery): op now supports deletion of models
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Wire things with WebUI(WIP)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* minor improvements
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>