Commit Graph

277 Commits

Author SHA1 Message Date
Ettore Di Giacinto
dcbdc12cc9
Update bump_deps.yaml
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-07-11 22:59:02 +02:00
Ettore Di Giacinto
03efa26ff5
ci: Do not test the full matrix on PRs (#2771)
* ci: Do not test the full matrix on PR

Hipblas and sycl currently take a long time to build from scratch. Until
we find a way to speed up image building, we are going to test these only
on master, and not for every open PR.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: do not run release workflow twice

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-11 19:08:39 +02:00
Dave
fd0bc21c3e
fix abseil test issue [attempt 3] (#2769)
* use a sed hack to jam a missing line in place for grpc's abseil version.

Signed-off-by: Dave Lee <dave@gray101.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
2024-07-11 01:40:54 +00:00
Ettore Di Giacinto
28c6daf916
ci(deps): add libgmock-dev (#2761)
* Revert "ci(grpc): disable ABSEIL tests (#2759)"

This reverts commit cbb93bd8ec.

* Revert "fix: arm builds via disabling abseil tests (#2758)"

This reverts commit 8d046de287.

* Revert "ci(arm64): fix gRPC build by adding googletest to CMakefile (#2754)"

This reverts commit 401ee553f4.

* ci(gmock): install libgmock-dev

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-10 15:23:23 +02:00
Dave
133987b1fb
feat: HF /scan endpoint (#2566)
* start by checking /scan during the checksum update

Signed-off-by: Dave Lee <dave@gray101.com>

* add back in golang side features: downloader/uri gets struct and scan function, gallery uses it, and secscan/models calls it.

Signed-off-by: Dave Lee <dave@gray101.com>

* add a param to scan specific urls - useful for debugging

Signed-off-by: Dave Lee <dave@gray101.com>

* helpful printouts

Signed-off-by: Dave Lee <dave@gray101.com>

* fix offsets

Signed-off-by: Dave Lee <dave@gray101.com>

* fix error and naming

Signed-off-by: Dave Lee <dave@gray101.com>

* expose error

Signed-off-by: Dave Lee <dave@gray101.com>

* fix json tags

Signed-off-by: Dave Lee <dave@gray101.com>

* slight wording change

Signed-off-by: Dave Lee <dave@gray101.com>

* go mod tidy - getting warnings

Signed-off-by: Dave Lee <dave@gray101.com>

* split out python to make editing easier, add some simple code to delete contaminated entries from gallery

Signed-off-by: Dave Lee <dave@gray101.com>

* o7 to my favorite part of our old name, go-skynet

Signed-off-by: Dave Lee <dave@gray101.com>

* merge fix

Signed-off-by: Dave Lee <dave@gray101.com>

* merge fix

Signed-off-by: Dave Lee <dave@gray101.com>

* merge fix

Signed-off-by: Dave Lee <dave@gray101.com>

* address review comments

Signed-off-by: Dave Lee <dave@gray101.com>

* forgot secscan could accept multiple URLs at once

Signed-off-by: Dave Lee <dave@gray101.com>

* invert naming and actually use it

Signed-off-by: Dave Lee <dave@gray101.com>

* missed cli/models.go

Signed-off-by: Dave Lee <dave@gray101.com>

* Update .github/check_and_update.py

Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Dave <dave@gray101.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
Signed-off-by: Dave <dave@gray101.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-07-10 13:18:32 +02:00
Ettore Di Giacinto
cbb93bd8ec
ci(grpc): disable ABSEIL tests (#2759)
* ci(grpc): disable ABSEIL tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* debug

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-10 13:15:33 +02:00
Dave
8d046de287
fix: arm builds via disabling abseil tests (#2758)
fix: disable abseil tests

Signed-off-by: Dave Lee <dave@gray101.com>
2024-07-10 08:43:27 +02:00
Ettore Di Giacinto
2845baecd5
fix(cuda): downgrade default version from 12.5 to 12.4 (#2707)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-09 23:13:29 +02:00
Ettore Di Giacinto
401ee553f4
ci(arm64): fix gRPC build by adding googletest to CMakefile (#2754)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-09 19:47:14 +02:00
Ettore Di Giacinto
cca881ec49
feat(p2p): Federation and AI swarms (#2723)
* Wip p2p enhancements

* get online state

* Pass-by token to show in the dashboard

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Style

* Minor fixups

* parametrize SearchID

* Refactoring

* Allow to expose/bind more services

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add federation

* Display federated mode in the WebUI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* make federated nodes visible from the WebUI

* Fix version display

* improve web page

* live page update

* visual enhancements

* enhancements

* visual enhancements

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-08 22:04:06 +02:00
Ettore Di Giacinto
c184f23621
models(gallery): add llama-3_8b_unaligned_alpha_rp_soup-i1 (#2734)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-06 15:31:00 +02:00
Ettore Di Giacinto
683c306f90
ci(Makefile): adds tts in binary releases (#2695)
* ci(Makefile): run tts and stablediffusion in dist

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* re-add macos-13

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* rely on detection

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* move logic to a script

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* missing some libs still

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-05 23:19:24 +02:00
Ettore Di Giacinto
5c135d0dec
ci: change action to send twitter notification
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-04 18:50:51 +02:00
Ettore Di Giacinto
ff19b22d72
ci: change action to send twitter notification
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-04 18:28:46 +02:00
Ettore Di Giacinto
83576d7f57
ci: change action to send twitter notification
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-04 18:04:56 +02:00
Ettore Di Giacinto
9aec1b3a61
ci: try to add twitter notifications for new models
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-04 17:51:00 +02:00
Ettore Di Giacinto
6f5b6711ea
ci(notify-models): Specify the bot identity
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-07-04 12:02:04 +02:00
Ettore Di Giacinto
a637ee2278
ci: use different channel for release notifications, enhance prompt
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-04 09:22:31 +02:00
Ettore Di Giacinto
b10441a41c
ci: add pipelines for discord notifications (#2703)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-04 09:15:29 +02:00
Ettore Di Giacinto
466eb82845
ci: add latest tag for vulkan images
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-06-28 09:04:33 +02:00
Ettore Di Giacinto
7b1e792732
deps(llama.cpp): bump to latest, update build variables (#2669)
* arrow_up: Update ggerganov/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* deps(llama.cpp): update build variables to follow upstream

Update build recipes with https://github.com/ggerganov/llama.cpp/pull/8006

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Disable shared libs by default in llama.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Disable shared libs in llama.cpp Makefile

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Disable metal embedding for now, until it is tested

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(mac): explicitly enable metal

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* debug

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix typo

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-06-27 23:10:04 +02:00
Ettore Di Giacinto
f93fe30350
ci: vulkan not ready for arm64 yet
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-06-25 18:11:57 +02:00
Ettore Di Giacinto
784ccf97ba
ci: adjust max-parallel
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-06-25 15:14:43 +02:00
Ettore Di Giacinto
e84b31935c
feat(vulkan): add vulkan support to the llama.cpp backend (#2648)
feat(vulkan): add vulkan support to llama.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-24 20:04:58 +02:00
Ettore Di Giacinto
04b01cd62c
ci: put a cap on parallel runs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-24 18:08:09 +02:00
Sertaç Özercan
5866fc8ded
chore: fix go.mod module (#2635)
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
2024-06-23 08:24:36 +00:00
Ettore Di Giacinto
eb4cd78ca6
ci: run master jobs on self-hosted
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-23 10:15:53 +02:00
Ettore Di Giacinto
40ce71855a
ci: disable max-parallelism on master
2024-06-22 23:28:09 +02:00
Ettore Di Giacinto
9c0d0afd09
ci: bump parallel jobs (#2633)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-22 23:24:46 +02:00
Rene Leonhardt
43f0688a95
feat: Upgrade to CUDA 12.5 (#2601)
Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com>
2024-06-19 17:50:49 +02:00
Ettore Di Giacinto
89a11e15e7
fix(single-binary): bundle ld.so (#2602)
* debug

* fix copy command/silly muscle memory

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* remove tmate

* Debugging

* Start binary with ld.so if present in libdir

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* small refactor

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-18 22:43:43 +02:00
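The "Start binary with ld.so if present in libdir" step in 89a11e15e7 can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the project's code: the helper name `build_command` and the loader file name `ld.so` directly inside the library directory are hypothetical.

```python
import os


def build_command(binary, args, libdir):
    """Return the argv used to start `binary`.

    Assumption (not confirmed by the commit message): the bundled dynamic
    loader is a file named "ld.so" directly inside `libdir`. If it exists,
    the binary is started through it; otherwise it is started directly.
    """
    loader = os.path.join(libdir, "ld.so")
    if os.path.isfile(loader):
        # Run the binary through the bundled loader instead of the host's.
        return [loader, binary, *args]
    return [binary, *args]
```

Invoking a program through the dynamic loader is a standard Linux mechanism; how the bundled libraries are then located (for example via a library search path) is left out of this sketch.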
Ettore Di Giacinto
7f13e3a783
docs(models): fixup top message
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-06-18 08:42:30 +02:00
Ettore Di Giacinto
4897eb0ba2
ci: pack less libs inside the binary (#2579)
The binary quickly grew to 1.8GB - rocm alone adds at least 800MB - so
we might just want to manage the GPU libs separately.

Adds a comment listing all the libraries found so far that we depend
on; we will likely follow up with a way to bundle these
separately.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-16 22:10:28 +02:00
Ettore Di Giacinto
ac4a94dd44
feat(build): bundle libs for arm64 and x86 linux binaries (#2572)
This PR bundles further libs into the arm64 and x86_64 binaries

This can be improved a lot - it's far from perfect. However, in this PR I wanted to collect the required libs and provide a simple baseline to improve upon later. It is quite challenging to do this exercise with CI only - but it's the fastest way I see right now.

I hope that after the list is initially built we can improve this further down the line and remove some of the technical debt left here, to speed things up and not get stuck in the middle of CI cycles.

In this PR:

- The x86_64 binary now bundles hipblas, nvidia and intel libraries too, so that no dependencies need to be installed on the host
- Similarly, for the arm64 we now bundle all the required assets

## What's left

We should also be able to cross-compile Nvidia for arm64 - however, I haven't succeeded so far, so I've left that open. Similarly, I might have missed some libraries, but we will find out through bug reports and testing with the new binaries. I've tested on my arm64 board and I could finally start things up.

Another open point is shipping libraries for e.g. tts and stablediffusion. This is not done yet; however, with the same methodology we should be able to extend support to these two backends in the binary as well.
2024-06-16 09:10:44 +02:00
Ettore Di Giacinto
112d0ffa45
feat(darwin): embed grpc libs (#2567)
* debug

* feat(makefile): allow to bundle libs into binary

* ci: bundle protobuf into single-binary

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(assets): correctly reference extract folder

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* bundle also abseil

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* bundle more libs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-06-14 08:51:25 +02:00
Ettore Di Giacinto
91f48b2143
docs(gallery): lazy-load images (#2557)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-13 01:05:24 +02:00
Ettore Di Giacinto
882556d4db
feat(gallery): show available models in website, allow local-ai models install to install from galleries (#2555)
* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* gen a static page instead (we force DNS redirects to it)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(gallery): install models from CLI, unify install

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Uniform graphic of model page

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Makefile: update targets

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Slightly enhance gallery view

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-13 00:47:16 +02:00
Ettore Di Giacinto
6c087ae743
feat(arm64): enable single-binary builds (#2490)
* ci: try to build for arm64

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Allow to skip hipblas on make dist

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* use arm64 cross compiler

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* correctly target go arm64

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* create a separate target

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* cross-compile grpc

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add Protobuf include dirs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* temp disable CUDA build

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* aarch64 builds: Reduce backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Even less backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Even less backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(startup): allow to load libs from extracted assets

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* makefile: set arch

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-09 15:11:37 +02:00
Dave
219078a5e0
test: e2e /reranker endpoint (#2211)
Create a simple e2e test for the /reranker api \\ go mod tidy

Signed-off-by: Dave Lee <dave@gray101.com>
2024-06-07 18:45:52 +00:00
Dave
d38e9090df
experiment: -j4 for build-linux: (#2514)
experiment: set -j4 to see if things go faster, while we wait for a proper fix from mudler

Signed-off-by: Dave Lee <dave@gray101.com>
2024-06-07 11:22:28 +02:00
Ettore Di Giacinto
b049805c9b
ci: run release build on self-hosted runners (#2505)
2024-06-06 22:16:34 -04:00
Ettore Di Giacinto
596cf76135
build(intel): bundle intel variants in single-binary (#2494)
* wip: try to build also intel variants

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add dependencies

* Select automatically intel backend

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-06 08:40:51 +02:00
Ettore Di Giacinto
17cf6c4a4d
feat(amdgpu): try to build in single binary (#2485)
* feat(amdgpu): try to build in single binary

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Release space from worker

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-05 08:44:15 +02:00
Ettore Di Giacinto
c603b95ac7
ci: pin build-time protoc (#2461)
ci: pin protoc

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-01 18:59:15 +02:00
Ettore Di Giacinto
10c64dbb55
models(gallery): add mopeymule (#2449)
* models(gallery): add mopeymule

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: try to fix workflow

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-31 18:08:39 +02:00
Ettore Di Giacinto
2bbc52fcc8
feat(build): add arm64 core containers (#2421)
ci: add arm64 container images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-28 10:34:59 +02:00
Ettore Di Giacinto
d075dc44dd
ci: push test images when building PRs (#2424)
ci: try to push image

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-27 22:07:35 +02:00
Ettore Di Giacinto
be8ffbdfcf
ci(grpc-cache): also arm64 (#2423)
grpc-cache: also arm64

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-27 17:23:34 +02:00
Sertaç Özercan
29615576fb
ci: fix sd release (#2400)
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
2024-05-25 09:33:50 +02:00
Ettore Di Giacinto
e0187c2a1a
ci: do not tag latest on AIO automatically
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-24 09:41:13 +02:00
Sertaç Özercan
7efa8e75d4
fix: stablediffusion binary (#2385)
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
2024-05-23 08:34:37 +02:00
Ettore Di Giacinto
7551369abe
Update checksum_checker.sh
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-23 08:33:58 +02:00
Ettore Di Giacinto
21a12c2cdd
ci(checksum_checker): do get sha from hf API when available (#2380)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-22 23:51:02 +02:00
Ettore Di Giacinto
371d0cc1f7
ci: generate specific image for intel builds (#2374)
ci: fix intel images until they are fixed upstream

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-22 23:35:39 +02:00
Ettore Di Giacinto
1a3dedece0
dependencies(grpcio): bump to fix CI issues (#2362)
feat(grpcio): bump to fix CI issues

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-21 14:33:47 +02:00
Ettore Di Giacinto
fdb45153fe
feat(llama.cpp): Totally decentralized, private, distributed, p2p inference (#2343)
* feat(llama.cpp): Enable decentralized, distributed inference

As https://github.com/mudler/LocalAI/pull/2324 introduced distributed inferencing thanks to
@rgerganov's implementation in https://github.com/ggerganov/llama.cpp/pull/6829 in upstream llama.cpp,
it is now possible to distribute the workload to a remote llama.cpp gRPC server.

This changeset uses mudler/edgevpn to establish a secure, distributed network between the nodes using a shared token.
The token is generated automatically when starting the server with the `--p2p` flag, and workers can use it when started
with `local-ai worker p2p-llama-cpp-rpc`, passing the token via environment variable (TOKEN) or with args (--token).

As per how mudler/edgevpn works, a network is established between the server and the workers with the dht and mdns discovery protocols;
the llama.cpp rpc server is automatically started and exposed to the underlying p2p network so that the API server can connect to it.

When the HTTP server is started, it discovers the workers in the network and automatically creates port-forwards to the services locally.
llama.cpp is then configured to use those services.

This feature is behind the "p2p" GO_FLAGS

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* go mod tidy

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: add p2p tag

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* better message

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-20 19:17:59 +02:00
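The token hand-off described in fdb45153fe (environment variable `TOKEN` or the `--token` argument) could look roughly like the following Python sketch. It is an illustration only, not LocalAI's implementation: the function name `resolve_token` is hypothetical, and the rule that the flag takes precedence over the environment variable is an assumption, since the commit message does not state a precedence.

```python
import argparse
import os


def resolve_token(argv, environ=None):
    """Return the shared p2p token from --token or the TOKEN env variable.

    Hypothetical helper. The precedence (flag first, then environment)
    is an assumption, not something the commit message specifies.
    """
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(prog="worker-sketch")
    parser.add_argument("--token", default=None)
    # Ignore any other arguments; only the token matters for this sketch.
    args, _unknown = parser.parse_known_args(argv)
    token = args.token or environ.get("TOKEN")
    if not token:
        raise ValueError("no token: pass --token or set TOKEN")
    return token
```

Failing early when neither source provides a token keeps a worker from starting without a network to join; whether the real worker behaves this way is not stated in the log.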
Ettore Di Giacinto
8ad669339e
add openvoice backend (#2334)
Wip openvoice
2024-05-19 16:27:08 +02:00
Ettore Di Giacinto
c89271b2e4
feat(llama.cpp): add distributed llama.cpp inferencing (#2324)
* feat(llama.cpp): support distributed llama.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: allow tweaking how chat messages are merged together

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactor

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Makefile: register to ALL_GRPC_BACKENDS

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring, allow disable auto-detection of backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* minor fixups

Signed-off-by: mudler <mudler@localai.io>

* feat: add cmd to start rpc-server from llama.cpp

Signed-off-by: mudler <mudler@localai.io>

* ci: add ccache

Signed-off-by: mudler <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: mudler <mudler@localai.io>
2024-05-15 01:17:02 +02:00
Sertaç Özercan
a670318a9f
feat: auto select llama-cpp cuda runtime (#2306)
* auto select cpu variant

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>

* remove cuda target for now

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>

* fix metal

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>

* fix path

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>

* cuda

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>

* auto select cuda

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>

* update test

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>

* select CUDA backend only if present

Signed-off-by: mudler <mudler@localai.io>

* ci: keep cuda bin in path

Signed-off-by: mudler <mudler@localai.io>

* Makefile: make dist now builds also cuda

Signed-off-by: mudler <mudler@localai.io>

* Keep pushing fallback in case auto-flagset/nvidia fails

There could be other reasons why the default binary fails. For example, we might have detected an Nvidia GPU,
but the user might not have the drivers/cuda libraries installed on the system, so the binary would fail to start.

We keep the llama.cpp fallback at the end of the list of llama.cpp backends so that loading can fall back to it in case things go wrong

Signed-off-by: mudler <mudler@localai.io>

* Do not build cuda on MacOS

Signed-off-by: mudler <mudler@localai.io>

* cleanup

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>

* Apply suggestions from code review

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: mudler <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: mudler <mudler@localai.io>
2024-05-14 19:40:18 +02:00
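The fallback behaviour described in a670318a9f (try the auto-selected variant first, keep the generic llama.cpp fallback last) can be sketched in Python as below. All backend variant names and both function names are hypothetical placeholders chosen for illustration; the only property taken from the commit message is that the fallback stays at the end of the list so it is still tried when a detected GPU variant fails to load.

```python
def order_llama_backends(has_nvidia_gpu, cpu_flags):
    """Return llama.cpp backend variants in the order they should be tried.

    Variant names are hypothetical. The generic fallback is always last.
    """
    candidates = []
    if has_nvidia_gpu:
        # A GPU was detected, but loading can still fail if the
        # drivers/CUDA libraries are missing on the host.
        candidates.append("llama-cpp-cuda")
    if "avx2" in cpu_flags:
        candidates.append("llama-cpp-avx2")
    elif "avx" in cpu_flags:
        candidates.append("llama-cpp-avx")
    candidates.append("llama-cpp-fallback")
    return candidates


def load_first_working(candidates, try_load):
    """Try each candidate in order and return the first one that loads."""
    errors = []
    for name in candidates:
        try:
            try_load(name)
            return name
        except RuntimeError as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("no backend could be loaded: " + "; ".join(errors))
```

Here `try_load` stands in for whatever actually starts a backend; it is passed in so the ordering logic can be exercised without real binaries.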
cryptk
28a421cb1d
feat: migrate python backends from conda to uv (#2215)
* feat: migrate diffusers backend from conda to uv

  - replace conda with UV for diffusers install (prototype for all
    extras backends)
  - add ability to build docker with one/some/all extras backends
    instead of all or nothing

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate autogptq bark coqui from conda to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: convert exllama over to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate exllama2 to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate mamba to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate parler to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate petals to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: fix tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate rerankers to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate sentencetransformers to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: install uv for tests-linux

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: make sure file exists before installing on intel images

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate transformers backend to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate transformers-musicgen to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate vall-e-x to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate vllm to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add uv install to the rest of test-extra.yml

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: adjust file perms on all install/run/test scripts

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add missing accelerate dependencies

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add some more missing dependencies to python backends

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: parler tests venv py dir fix

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: correct filename for transformers-musicgen tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: adjust the pwd for valle tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: cleanup and optimization work for uv migration

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add setuptools to requirements-install for mamba

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: more size optimization work

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: make installs and tests more consistent, cleanup some deps

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: cleanup

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: mamba backend is cublas only

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: uncomment lines in makefile

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-05-10 15:08:08 +02:00
Ettore Di Giacinto
650ae620c5
ci: get latest git version
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-09 11:33:16 +02:00
Ettore Di Giacinto
6a209cbef6
ci: get file name correctly in checksum_checker.sh
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-09 10:57:23 +02:00
Ettore Di Giacinto
9786bb826d
ci: try to fix checksum_checker.sh
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-09 09:34:07 +02:00
Ettore Di Giacinto
9b4c6f348a
Update checksum_checker.yaml
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-09 00:57:22 +02:00
Ettore Di Giacinto
cb6ddb21ec
Update checksum_checker.yaml
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-09 00:55:48 +02:00
Ettore Di Giacinto
0baacca605
Update checksum_checker.yaml
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-09 00:54:35 +02:00
Ettore Di Giacinto
222d714ec7
Update checksum_checker.yaml
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-09 00:51:57 +02:00
Ettore Di Giacinto
fd2d89d37b
Update checksum_checker.sh
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-09 00:43:16 +02:00
Ettore Di Giacinto
6440b608dc
Update checksum_checker.yaml
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-09 00:42:48 +02:00
Ettore Di Giacinto
1937118eab
Update checksum_checker.yaml
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-09 00:34:56 +02:00
Ettore Di Giacinto
bc272d1e4b
ci: add checksum checker pipeline (#2274)
Signed-off-by: mudler <mudler@localai.io>
2024-05-09 00:31:27 +02:00
Ettore Di Giacinto
c5798500cb
feat(single-build): generate single binaries for releases (#2246)
* feat(single-build): generate single binaries for releases

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* drop old targets

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-05 17:20:51 +02:00
cryptk
a0aa5d01a1
feat: update ROCM and use smaller image (#2196)
* feat: update ROCM and use smaller image

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add call to ldconfig to fix AMD's broken library packages

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-05-03 18:46:49 +02:00
cryptk
f7aabf1b50
fix: bring everything onto the same GRPC version to fix tests (#2199)
fix: more places where we are installing grpc that need a version specified
fix: attempt to fix metal tests
fix: metal/brew is forcing an update, they don't have 1.58 available anymore

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-30 19:12:15 +00:00
dependabot[bot]
53c3842bc2
build(deps): bump dependabot/fetch-metadata from 2.0.0 to 2.1.0 (#2186)
Bumps [dependabot/fetch-metadata](https://github.com/dependabot/fetch-metadata) from 2.0.0 to 2.1.0.
- [Release notes](https://github.com/dependabot/fetch-metadata/releases)
- [Commits](https://github.com/dependabot/fetch-metadata/compare/v2.0.0...v2.1.0)

---
updated-dependencies:
- dependency-name: dependabot/fetch-metadata
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-29 21:12:37 +00:00
Dave
982dc6a2bd
fix: github bump_docs.sh regex to drop emoji and other text (#2180)
fix: bump_docs regex

Signed-off-by: Dave Lee <dave@gray101.com>
2024-04-29 03:55:29 +00:00
cryptk
987b7ad42d
feat: only keep the build artifacts from the grpc build (#2172)
* feat: only keep the build artifacts from the grpc build

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: remove separate Cache GRPC build step

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: remove docker inspect step, it is leftover from previous debugging

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-28 19:24:16 +00:00
Ettore Di Giacinto
7e6bf6e7a1
ci: add auto-label rule for gallery in labeler.yml
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-04-27 19:52:26 +02:00
cryptk
9fc0135991
feat: cleanup Dockerfile and make final image a little smaller (#2146)
* feat: cleanup Dockerfile and make final image a little smaller

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add build-essential to final stage

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: more GRPC cache misses

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: correct for another cause of GRPC cache misses

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: generate new GRPC cache automatically if needed

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: use new GRPC_MAKEFLAGS build arg in GRPC cache generation

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-27 19:48:20 +02:00
fakezeta
c9451cb604
Bump oneapi-basekit, optimum and openvino (#2139)
* Bump oneapi-basekit, optimum and openvino

* Changed PERFORMANCE HINT to CUMULATIVE_THROUGHPUT

Minor latency change for first token but about 10-15% speedup on token generation.
2024-04-26 16:20:43 +02:00
Ettore Di Giacinto
5d170e9264
Update yaml-check.yml
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-04-25 16:05:02 +02:00
Ettore Di Giacinto
1b0a64aa46
Update yaml-check.yml
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-04-25 15:57:06 +02:00
Ettore Di Giacinto
aa8e1c63d5
Create yaml-check.yml
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-04-25 15:52:52 +02:00
Ettore Di Giacinto
60690c9fc4
ci: add swagger pipeline
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-25 15:11:01 +02:00
Ettore Di Giacinto
b664edde29
feat(rerankers): Add new backend, support jina rerankers API (#2121)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-25 00:19:02 +02:00
Dave
228bc4903f
fix: action-tmate detached (#2092)
connect-timeout-seconds works best with `detached: true`

Signed-off-by: Dave <dave@gray101.com>
2024-04-21 22:39:17 +02:00
Dave
1038f7469c
fix: action-tmate: use connect-timeout-seconds and limit-access-to-actor (#2083)
fix for action-tmate: connect-timeout-seconds and limit-access-to-actor

Signed-off-by: Dave Lee <dave@gray101.com>
2024-04-20 08:42:02 +00:00
cryptk
852316c5a6
fix: move the GRPC cache generation workflow into its own concurrency group (#2071)
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-18 20:52:34 -04:00
cryptk
13012cfa70
feat: better control of GRPC docker cache (#2070)
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-18 16:19:36 -04:00
Ettore Di Giacinto
af9e5a2d05
Revert #1963 (#2056)
* Revert "fix(fncall): fix regression introduced in #1963 (#2048)"

This reverts commit 6b06d4e0af.

* Revert "fix: action-tmate back to upstream, dead code removal (#2038)"

This reverts commit fdec8a9d00.

* Revert "feat(grpc): return consumed token count and update response accordingly (#2035)"

This reverts commit e843d7df0e.

* Revert "refactor: backend/service split, channel-based llm flow (#1963)"

This reverts commit eed5706994.

* feat(grpc): return consumed token count and update response accordingly

Fixes: #1920

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-17 23:33:49 +02:00
Dave
fdec8a9d00
fix: action-tmate back to upstream, dead code removal (#2038)
cleanup: upstream action-tmate has taken my PR, drop master reference. Also remove dead code from api.go

Signed-off-by: Dave Lee <dave@gray101.com>
2024-04-16 01:46:36 +00:00
dependabot[bot]
320d8a48d9
build(deps): bump github/codeql-action from 2 to 3 (#2041)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2 to 3.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/v2...v3)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-15 22:02:44 +00:00
dependabot[bot]
46609e936e
build(deps): bump dependabot/fetch-metadata from 1.3.4 to 2.0.0 (#2040)
Bumps [dependabot/fetch-metadata](https://github.com/dependabot/fetch-metadata) from 1.3.4 to 2.0.0.
- [Release notes](https://github.com/dependabot/fetch-metadata/releases)
- [Commits](https://github.com/dependabot/fetch-metadata/compare/v1.3.4...v2.0.0)

---
updated-dependencies:
- dependency-name: dependabot/fetch-metadata
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-15 21:37:06 +00:00
dependabot[bot]
b72c6cc9fc
build(deps): bump softprops/action-gh-release from 1 to 2 (#2039)
Bumps [softprops/action-gh-release](https://github.com/softprops/action-gh-release) from 1 to 2.
- [Release notes](https://github.com/softprops/action-gh-release/releases)
- [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md)
- [Commits](https://github.com/softprops/action-gh-release/compare/v1...v2)

---
updated-dependencies:
- dependency-name: softprops/action-gh-release
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-15 20:52:39 +00:00
Dave
d5699dbf4f
fix - correct checkout versions (#2029)
minor fix - bump some checkout@v3 to checkout@v4 to match and clean up warnings

Signed-off-by: Dave Lee <dave@gray101.com>
2024-04-13 19:01:17 +02:00
Ettore Di Giacinto
0fdff26924
feat(parler-tts): Add new backend (#2027)
* feat(parler-tts): Add new backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(parler-tts): try downgrade protobuf

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(parler-tts): add parler conda env

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Revert "feat(parler-tts): try downgrade protobuf"

This reverts commit bd5941d5cfc00676b45a99f71debf3c34249cf3c.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* deps: add grpc

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: try to gen proto with same environment

* workaround

* Revert "fix: try to gen proto with same environment"

This reverts commit 998c745e2f.

* Workaround fixup

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Dave <dave@gray101.com>
2024-04-13 18:59:21 +02:00
Ettore Di Giacinto
b91820b7f8
Update localaibot_automerge.yml
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-04-13 13:46:07 +02:00
Ettore Di Giacinto
4e74560649
ci: fix release pipeline missing dependencies (#2025)
2024-04-13 13:30:40 +02:00
Ettore Di Giacinto
95244ed6e7
Update localaibot_automerge.yml
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-04-13 10:03:15 +02:00
Ettore Di Giacinto
f1f39eea3f
Create localaibot_automerge.yml
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-04-13 09:47:33 +02:00