From 51cba89682b40ae92737fa47ce6bdbce9ba8cac6 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Mon, 23 Sep 2024 11:49:07 +0200
Subject: [PATCH] fix(hipblas): do not push all variants to hipblas builds (#3630)

Like with CUDA builds, we don't need all the variants when we are
compiling against the accelerated variants - in this way we save space
and we avoid to exceed embedFS golang size limits.

Signed-off-by: Ettore Di Giacinto
---
 Dockerfile | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Dockerfile b/Dockerfile
index f08cb9a0..323c3d9a 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -297,10 +297,10 @@ COPY .git .
 RUN make prepare
 
 ## Build the binary
-## If it's CUDA, we want to skip some of the llama-compat backends to save space
-## We only leave the most CPU-optimized variant and the fallback for the cublas build
-## (both will use CUDA for the actual computation)
-RUN if [ "${BUILD_TYPE}" = "cublas" ]; then \
+## If it's CUDA or hipblas, we want to skip some of the llama-compat backends to save space
+## We only leave the most CPU-optimized variant and the fallback for the cublas/hipblas build
+## (both will use CUDA or hipblas for the actual computation)
+RUN if [ "${BUILD_TYPE}" = "cublas" ] || [ "${BUILD_TYPE}" = "hipblas" ]; then \
         SKIP_GRPC_BACKEND="backend-assets/grpc/llama-cpp-avx backend-assets/grpc/llama-cpp-avx2" make build; \
     else \
         make build; \