* chore(refactor): drop duplicated shutdown logics
- Handle locking in Shutdown and CheckModelIsLoaded in a more go-idiomatic way
- Drop duplicated code and re-organize shutdown code
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: drop leftover
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: improve logging and add missing locks
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(shutdown): do not shutdown immediately busy backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(refactor): avoid duplicate functions
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: multiplicative backoff for shutdown (#3547)
* multiplicative backoff for shutdown
Rather than always retry every two seconds, back off the shutdown attempt rate?
Signed-off-by: Dave <dave@gray101.com>
* Update loader.go
Signed-off-by: Dave <dave@gray101.com>
* add clamp of 2 minutes
Signed-off-by: Dave Lee <dave@gray101.com>
---------
Signed-off-by: Dave <dave@gray101.com>
Signed-off-by: Dave Lee <dave@gray101.com>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Dave <dave@gray101.com>
Signed-off-by: Dave Lee <dave@gray101.com>
Co-authored-by: Dave <dave@gray101.com>
Due to a previous refactor we moved the client constructor tight to the
model address, however that was just a string which we would use to
build the client each time.
With this change we make the loader to return a *Model which carries a
constructor for the client and stores the client on the first
connection.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* specify workdir when launching external backend for safety / relative paths, bump version, logs
Signed-off-by: Dave Lee <dave@gray101.com>
* sneak in a devcontainer fix
Signed-off-by: Dave Lee <dave@gray101.com>
---------
Signed-off-by: Dave Lee <dave@gray101.com>
contains simple fixes to warnings and errors, removes a broken / outdated test, runs go mod tidy, and as the actual change, centralizes base64 image handling
Signed-off-by: Dave Lee <dave@gray101.com>
* feat: allow to run parallel requests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixup
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>