+++
disableToc = false
title = "Build"
weight = 5
url = '/basics/build/'
+++
### Build
#### Container image
Requirements:
- Docker, Podman, or another container engine
To build the `LocalAI` container image locally, you can use `docker`, for example:
```
# build the image
docker build -t localai .
# run the image
docker run localai
```
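When running the locally built image, you will usually want to expose the API port and mount a models directory; a minimal sketch (the port, variables, and paths mirror the container example later on this page):
```bash
# expose the API on port 8080 and mount a local models directory
docker run --rm -ti -p 8080:8080 -e MODELS_PATH=/models -v $PWD/models:/models localai
```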
#### Locally
To build LocalAI locally, you need:
- Golang >= 1.21
- CMake/make
- GCC
- gRPC
To install the dependencies follow the instructions below:
{{< tabs >}}
{{% tab name="Apple" %}}
```bash
brew install abseil cmake go grpc protobuf wget
```
{{% /tab %}}
{{% tab name="Debian" %}}
```bash
apt install protobuf-compiler-grpc libgrpc-dev make cmake
```
{{% /tab %}}
{{% tab name="From source" %}}
Specify `BUILD_GRPC_FOR_BACKEND_LLAMA=true` to automatically build the gRPC dependencies:
```bash
make ... BUILD_GRPC_FOR_BACKEND_LLAMA=true build
```
{{% /tab %}}
{{< /tabs >}}
To build LocalAI with `make`:
```
git clone https://github.com/go-skynet/LocalAI
cd LocalAI
make build
```
This should produce the binary `local-ai`.
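As a quick sanity check, you can ask the binary to print its available flags (assuming standard CLI help behavior):
```bash
./local-ai --help
```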
{{% notice note %}}
#### CPU flagset compatibility
LocalAI uses different backends based on ggml and llama.cpp to run models. If your CPU doesn't support common instruction sets, you can disable them during build:
```
CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_AVX=OFF -DLLAMA_FMA=OFF" make build
```
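To find out which of these instruction sets your CPU actually supports, you can inspect the CPU flags on Linux (a quick check, unrelated to the build itself):
```bash
# list the relevant CPU flags, if present
grep -m1 'flags' /proc/cpuinfo | tr ' ' '\n' | grep -E '^(avx|avx2|avx512f|fma|f16c)$'
```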
For this to take effect in the container image, you need to set `REBUILD=true`:
```
docker run --rm -ti -p 8080:8080 -e DEBUG=true -e MODELS_PATH=/models -e THREADS=1 -e REBUILD=true -e CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_AVX=OFF -DLLAMA_FMA=OFF" -v $PWD/models:/models quay.io/go-skynet/local-ai:latest
```
{{% /notice %}}
### Example: Build on Mac
Building on a Mac (M1 or M2) works, but you may need to install some prerequisites using `brew`.
The steps below have been tested on a Mac and found to work. Note that this doesn't use Docker to run the server:
```
# install build dependencies
brew install abseil cmake go grpc protobuf wget
# clone the repo
git clone https://github.com/go-skynet/LocalAI.git
cd LocalAI
# build the binary
make build
# Download gpt4all-j to models/
wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j
# Use a template from the examples
cp -rf prompt-templates/ggml-gpt4all-j.tmpl models/
# Run LocalAI
./local-ai --models-path=./models/ --debug=true
# Now API is accessible at localhost:8080
curl http://localhost:8080/v1/models
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "ggml-gpt4all-j",
"messages": [{"role": "user", "content": "How are you?"}],
"temperature": 0.9
}'
```
### Build with Image generation support
**Requirements**: OpenCV, Gomp
Image generation is experimental and requires `GO_TAGS=stablediffusion` to be set during build:
```
make GO_TAGS=stablediffusion build
```
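Once the server is running with a Stable Diffusion model configured, images can be requested through the OpenAI-compatible endpoint; a sketch, with an illustrative prompt and size:
```bash
curl http://localhost:8080/v1/images/generations -H "Content-Type: application/json" -d '{
  "prompt": "A cute baby sea otter",
  "size": "256x256"
}'
```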
### Build with Text to audio support
**Requirements**: piper-phonemize
Text to audio support is experimental and requires `GO_TAGS=tts` to be set during build:
```
make GO_TAGS=tts build
```
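Once the server is running with a piper voice model in place, audio can be generated through the `/tts` endpoint; a sketch, with an illustrative model name:
```bash
curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{
  "model": "en-us-kathleen-low.onnx",
  "input": "Hello, world!"
}'
```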
### Acceleration
The following variables are available to customize the build:
| Variable | Default | Description |
| ---------------------| ------- | ----------- |
| `BUILD_TYPE` | None | Build type. Available: `cublas`, `openblas`, `clblas`, `metal`, `hipblas` |
| `GO_TAGS` | `tts stablediffusion` | Go tags. Available: `stablediffusion`, `tts` |
| `CLBLAST_DIR` | | Specify a CLBlast directory |
| `CUDA_LIBPATH` | | Specify a CUDA library path |
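These variables can be combined in a single `make` invocation. For example, an illustrative CUDA build that enables only the `tts` tag:
```bash
make BUILD_TYPE=cublas GO_TAGS=tts build
```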
#### OpenBLAS
Software acceleration.
Requirements: OpenBLAS
```
make BUILD_TYPE=openblas build
```
#### CuBLAS
Nvidia acceleration.
Requirement: Nvidia CUDA toolkit
Note: CuBLAS support is experimental and has not been tested on real hardware. Please report any issues you find!
```
make BUILD_TYPE=cublas build
```
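If the CUDA toolkit is installed in a non-standard location, you can point the build at its libraries with `CUDA_LIBPATH` (the path below is illustrative):
```bash
make BUILD_TYPE=cublas CUDA_LIBPATH=/usr/local/cuda/lib64 build
```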
More information is available in the upstream PR: https://github.com/ggerganov/llama.cpp/pull/1412
#### Hipblas (AMD GPU with ROCm on Arch Linux)
Packages:
```
pacman -S base-devel git rocm-hip-sdk rocm-opencl-sdk opencv clblast grpc
```
Library links:
```
export CGO_CFLAGS="-I/usr/include/opencv4"
export CGO_CXXFLAGS="-I/usr/include/opencv4"
export CGO_LDFLAGS="-L/opt/rocm/hip/lib -lamdhip64 -L/opt/rocm/lib -lOpenCL -L/usr/lib -lclblast -lrocblas -lhipblas -lrocrand -lomp -O3 --rtlib=compiler-rt -unwindlib=libgcc -lhipblas -lrocblas --hip-link"
```
Build:
```
make BUILD_TYPE=hipblas GPU_TARGETS=gfx1030
```
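If you are unsure which `GPU_TARGETS` value matches your card, ROCm can usually report it (a quick check, assuming `rocminfo` from the SDK above is installed):
```bash
rocminfo | grep -o 'gfx[0-9a-f]*' | sort -u
```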
#### ClBLAS
AMD/Intel GPU acceleration.
Requirements: OpenCL, CLBlast
```
make BUILD_TYPE=clblas build
```
To specify a CLBlast directory, set `CLBLAST_DIR`.
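For example (the directory is illustrative):
```bash
make BUILD_TYPE=clblas CLBLAST_DIR=/some/path build
```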
### Metal (Apple Silicon)
```
make BUILD_TYPE=metal build
# Set `gpu_layers: 1` and `f16: true` in your YAML model config file
# Note: only models quantized with q4_0 are supported!
```
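A minimal sketch of the config change described in the comments above (the file name is illustrative):
```bash
# append the Metal-related settings to a model config file
cat <<EOF >> models/my-model.yaml
f16: true
gpu_layers: 1
EOF
```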
### Build only a single backend
You can control the backends that are built by setting the `GRPC_BACKENDS` environment variable. For instance, to build only the `llama-cpp` backend:
```bash
make GRPC_BACKENDS=backend-assets/grpc/llama-cpp build
```
By default, all the backends are built.
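To build a subset with more than one backend, pass a space-separated list; the second path below is an assumed example following the same `backend-assets/grpc/` layout:
```bash
make GRPC_BACKENDS="backend-assets/grpc/llama-cpp backend-assets/grpc/whisper" build
```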
### Windows compatibility
Make sure to give enough resources to the running container. See https://github.com/go-skynet/LocalAI/issues/2