fix(hipblas): do not push all variants to hipblas builds (#3630)

As with CUDA builds, we don't need every llama-cpp CPU variant when compiling against an accelerated backend. Skipping them saves space and avoids exceeding the Go embedFS size limits.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-05-31 22:40:45 +00:00 · 2024-09-23 11:49:07 +02:00 · 2024-09-23 11:49:07 +02:00 · 51cba89682
commit 51cba89682
parent 3e8e71f8b6
1 changed file with 4 additions and 4 deletions
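The selection logic this change extends can be sketched as a standalone shell function. This is only an illustration: `skip_backends` is a hypothetical helper that does not exist in the repository, and `openblas` stands in for any build type other than `cublas` or `hipblas`. The two backend paths are the ones passed to `SKIP_GRPC_BACKEND` in the diff.

```shell
#!/bin/sh
# Sketch only: skip_backends is a hypothetical helper, not part of the repository.
# It prints the value SKIP_GRPC_BACKEND would take for a given BUILD_TYPE.
skip_backends() {
    case "$1" in
        cublas|hipblas)
            # GPU-accelerated builds: drop the AVX and AVX2 CPU variants
            echo "backend-assets/grpc/llama-cpp-avx backend-assets/grpc/llama-cpp-avx2"
            ;;
        *)
            # Any other build type keeps every variant (nothing skipped)
            echo ""
            ;;
    esac
}

# "openblas" is used here only as an example of a non-GPU build type.
for bt in cublas hipblas openblas; do
    printf '%s: SKIP_GRPC_BACKEND="%s"\n' "$bt" "$(skip_backends "$bt")"
done
```

A `case` pattern such as `cublas|hipblas` is one way to express the same condition as the chained `[ ... ] || [ ... ]` test in the diff; the commit itself keeps the `if` form.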
--- a/8
+++ b/8
@@ -297,10 +297,10 @@ COPY .git .
 RUN make prepare

 ## Build the binary
-## If it's CUDA, we want to skip some of the llama-compat backends to save space
-## We only leave the most CPU-optimized variant and the fallback for the cublas build
-## (both will use CUDA for the actual computation)
-RUN if [ "${BUILD_TYPE}" = "cublas" ]; then \
+## If it's CUDA or hipblas, we want to skip some of the llama-compat backends to save space
+## We only leave the most CPU-optimized variant and the fallback for the cublas/hipblas build
+## (both will use CUDA or hipblas for the actual computation)
+RUN if [ "${BUILD_TYPE}" = "cublas" ] || [ "${BUILD_TYPE}" = "hipblas" ]; then \
        SKIP_GRPC_BACKEND="backend-assets/grpc/llama-cpp-avx backend-assets/grpc/llama-cpp-avx2" make build; \
    else \
        make build; \