* minor cleanup to go.mod - importing ourself?
Signed-off-by: Dave Lee <dave@gray101.com>
* figured out why we were importing ourself and fixed it
Signed-off-by: Dave Lee <dave@gray101.com>
* set pull_request_target
Signed-off-by: Dave Lee <dave@gray101.com>
---------
Signed-off-by: Dave Lee <dave@gray101.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* ci: add workflow to comment new Opened PRs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Update comment-pr.yaml
Slightly rewrite the prompt to eliminate a stray ' character that was terminating the shell script.
Signed-off-by: Dave <dave@gray101.com>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Dave <dave@gray101.com>
Co-authored-by: Dave <dave@gray101.com>
* ci: Do not test the full matrix on PR
Hipblas and sycl currently take a long time to build from scratch. Until
we find a way to speed up image building, we are going to test these only
on master, and not on every open PR.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* ci: do not run release workflow twice
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* use a sed hack to jam a missing line in place for grpc's abseil version.
Signed-off-by: Dave Lee <dave@gray101.com>
---------
Signed-off-by: Dave Lee <dave@gray101.com>
* ci(Makefile): run tts and stablediffusion in dist
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* re-add macos-13
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* rely on detection
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* move logic to a script
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* missing some libs still
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
The binary quickly grew to 1.8GB - rocm alone adds at least 800MB - so
we might just want to manage the GPU libs separately.
Adds a comment listing all the libraries found so far that we depend
on; a follow-up will likely find a way to bundle these separately.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
This PR bundles further libs into the arm64 and x86_64 binaries.
There is a lot of room for improvement here, but in this PR I wanted to collect the required libs and provide a simple baseline to improve on later. It is quite challenging to do this exercise with CI only, but it is the fastest way I see for now.
I hope that once the initial list is built we can improve this further down the line and remove some of the technical debt left here, to speed things up and avoid getting stuck in the middle of CI cycles.
In this PR:
- The x86_64 binary now also bundles the hipblas, nvidia and intel libraries, so that no dependency has to be installed on the host
- Similarly, the arm64 binary now bundles all the required assets
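For illustration only, here is a minimal Go sketch of how a binary could make bundled libraries visible to the backend processes it spawns: it puts an extracted `lib` directory first in `LD_LIBRARY_PATH`. The directory `/tmp/localai/backend_data` and the helper `prependLibPath` are hypothetical names and are not taken from this PR. The actual startup code may work differently.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// prependLibPath returns a library search path with libDir placed first,
// so bundled libraries take precedence over the ones installed on the host.
// Hypothetical helper, not part of the project.
func prependLibPath(current, libDir string) string {
	if current == "" {
		return libDir
	}
	// LD_LIBRARY_PATH entries are separated by ':' on Linux.
	return libDir + ":" + current
}

func main() {
	// Assumed location where the embedded assets were extracted.
	assetDir := "/tmp/localai/backend_data"
	libDir := filepath.Join(assetDir, "lib")

	newPath := prependLibPath(os.Getenv("LD_LIBRARY_PATH"), libDir)

	// The dynamic loader reads this variable when a process starts, so the
	// change only matters for child processes started afterwards.
	if err := os.Setenv("LD_LIBRARY_PATH", newPath); err != nil {
		fmt.Fprintln(os.Stderr, "cannot set LD_LIBRARY_PATH:", err)
		os.Exit(1)
	}
	fmt.Println("LD_LIBRARY_PATH=" + newPath)
}
```

Putting the bundled directory first is a design choice: it favors the shipped libraries over host ones, which keeps behavior reproducible but also means a newer host library is ignored.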
## What's left
We should also be able to cross-compile Nvidia for arm64, but I haven't succeeded so far, so I've left that open. I might also have missed some libraries; bug reports and testing with the new binaries will tell. I've tested on my arm64 board and could finally start things up.
Shipping libraries for e.g. tts and stablediffusion is still an open point. This is not done yet, but with the same methodology we should be able to extend support to these two backends in the binary as well.
* ci: try to build for arm64
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Allow to skip hipblas on make dist
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* use arm64 cross compiler
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* correctly target go arm64
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* create a separate target
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* cross-compile grpc
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add Protobuf include dirs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* temp disable CUDA build
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* aarch64 builds: Reduce backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Even less backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Even less backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(startup): allow to load libs from extracted assets
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* makefile: set arch
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(amdgpu): try to build in single binary
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Release space from worker
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>