Commit Graph

14 Commits

Author SHA1 Message Date
Ettore Di Giacinto
ddd21f1644
feat: Use ubuntu as base for container images, drop deprecated ggml-transformers backends (#1689)
* cleanup backends

* switch image to ubuntu 22.04

* adapt commands for ubuntu

* transformers cleanup

* no contrib on ubuntu

* Change test model to gguf

* ci: disable bark tests (too cpu-intensive)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* cleanup

* refinements

* use intel base image

* Makefile: Add docker targets

* Change test model

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-02-08 20:12:51 +01:00
Ettore Di Giacinto
db926896bd
Revert "[Refactor]: Core/API Split" (#1550)
Revert "[Refactor]: Core/API Split (#1506)"

This reverts commit ab7b4d5ee9.
2024-01-05 18:04:46 +01:00
Dave
ab7b4d5ee9
[Refactor]: Core/API Split (#1506)
Refactors the api folder into core, creating a firm split between backend code and the API frontend.
2024-01-05 15:34:56 +01:00
Ettore Di Giacinto
432513c3ba
ci: add GPU tests (#1095)
* ci: test GPU

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: show logs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Debug

* debug

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* split extra/core images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* split extra/core images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* consider runner host dir

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-10-19 13:50:40 +02:00
Dave
c6bf67f446
feat(llama2): add template for chat messages (#782)
Co-authored-by: Aman Karmani <aman@tmm1.net>

Lays some of the groundwork for LLAMA2 compatibility as well as other future models with complex prompting schemes.

* Started small refactoring in pkg/model/loader.go regarding template loading. Currently still a part of ModelLoader, but should be easy to add template loading for situations other than overall prompt templates and the new chat-specific per-message templates.
* Adds support for new chat-endpoint-specific, per-message templates as an alternative to the existing Role: XYZ sprintf method.
* Includes a temporary prompt template as an example, since I have a few questions before we merge in the model-gallery side changes (see )
* Minor debug logging changes.
2023-07-22 11:31:39 -04:00
Ettore Di Giacinto
94916749c5 feat: add external grpc and model autoloading 2023-07-20 22:10:12 +02:00
Ettore Di Giacinto
9decd0813c
feat: update go-gpt2 (#359)
Signed-off-by: mudler <mudler@mocaccino.org>
2023-05-23 21:47:47 +02:00
Ettore Di Giacinto
cc9aa9eb3f
feat: add /models/apply endpoint to prepare models (#286) 2023-05-18 15:59:03 +02:00
Ettore Di Giacinto
a035de2fdd
tests: add rwkv (#261) 2023-05-15 08:15:01 +02:00
Ettore Di Giacinto
2488c445b6
feat: bert.cpp token embeddings (#241) 2023-05-12 17:16:49 +02:00
Ettore Di Giacinto
b4241d0a0d
tests: enable whisper (#239) 2023-05-12 14:10:18 +02:00
mudler
009ee47fe2 Don't allow 0 as thread count 2023-05-05 22:51:20 +02:00
mudler
d094381e5d ci: lower fixtures spec 2023-05-05 21:28:38 +02:00
Ettore Di Giacinto
c806eae0de
feat: config files and SSE (#83)
Signed-off-by: mudler <mudler@mocaccino.org>
Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>
Co-authored-by: Tyler Gillson <tyler.gillson@gmail.com>
2023-04-26 21:18:18 -07:00