💡 Get help - ❓ FAQ 💭 Discussions 💬 Discord 📖 Documentation website
💻 Quickstart 🖼️ Models 🚀 Roadmap 🥽 Demo 🌍 Explorer 🛫 Examples
LocalAI is the free, Open Source OpenAI alternative. LocalAI acts as a drop-in replacement REST API compatible with OpenAI (as well as ElevenLabs, Anthropic, ...) API specifications for local AI inferencing. It allows you to run LLMs, generate images and audio (and more) locally or on-prem with consumer-grade hardware, supporting multiple model families. It does not require a GPU. It is created and maintained by Ettore Di Giacinto.
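Because the API is OpenAI-compatible, any OpenAI client can talk to a running instance. A minimal sketch, assuming LocalAI is listening on localhost:8080 and the model name matches one you have installed:

```bash
# Query the OpenAI-compatible chat completions endpoint of a local
# LocalAI instance. The model name below is illustrative; replace it
# with a model you have installed.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.2-1b-instruct",
    "messages": [{"role": "user", "content": "Hello, how are you?"}]
  }'
```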
📚🆕 Local Stack Family
🆕 LocalAI is now part of a comprehensive suite of AI tools designed to work together:
- **LocalAGI**: A powerful Local AI agent management platform that serves as a drop-in replacement for OpenAI's Responses API, enhanced with advanced agentic capabilities.
- **LocalRecall**: A RESTful API and knowledge base management system that provides persistent memory and storage capabilities for AI agents.
Screenshots
WebUI views: Talk Interface, Generate Audio, Models Overview, Generate Images, Chat Interface, Home, Login, and Swarm.
💻 Quickstart
Run the installer script:
```bash
# Basic installation
curl https://localai.io/install.sh | sh
```
For more installation options, see Installer Options.
Or run with docker:
CPU-only image:
```bash
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest
```
NVIDIA GPU Images:
```bash
# CUDA 12.0
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12
# CUDA 11.7
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-11
# NVIDIA Jetson (L4T) ARM64
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-nvidia-l4t-arm64
```
AMD GPU Images (ROCm):
```bash
docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-gpu-hipblas
```
Intel GPU Images (oneAPI):
```bash
# Intel GPU with FP16 support
docker run -ti --name local-ai -p 8080:8080 --device=/dev/dri/card1 --device=/dev/dri/renderD128 localai/localai:latest-gpu-intel-f16
# Intel GPU with FP32 support
docker run -ti --name local-ai -p 8080:8080 --device=/dev/dri/card1 --device=/dev/dri/renderD128 localai/localai:latest-gpu-intel-f32
```
Vulkan GPU Images:
```bash
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-vulkan
```
AIO Images (pre-downloaded models):
```bash
# CPU version
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu
# NVIDIA CUDA 12 version
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-12
# NVIDIA CUDA 11 version
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-11
# Intel GPU version
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-gpu-intel-f16
# AMD GPU version
docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-aio-gpu-hipblas
```
For more information about the AIO images and pre-downloaded models, see Container Documentation.
To load models:
```bash
# From the model gallery (see available models with `local-ai models list`, in the WebUI from the model tab, or visiting https://models.localai.io)
local-ai run llama-3.2-1b-instruct:q4_k_m
# Start LocalAI with the phi-2 model directly from huggingface
local-ai run huggingface://TheBloke/phi-2-GGUF/phi-2.Q8_0.gguf
# Install and run a model from the Ollama OCI registry
local-ai run ollama://gemma:2b
# Run a model from a configuration file
local-ai run https://gist.githubusercontent.com/.../phi-2.yaml
# Install and run a model from a standard OCI registry (e.g., Docker Hub)
local-ai run oci://localai/phi-2:latest
```
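After installing models, you can confirm they are exposed through the OpenAI-compatible API; a minimal sketch, assuming the server runs on the default port 8080:

```bash
# List models currently served by the running LocalAI instance via the
# OpenAI-compatible /v1/models endpoint.
curl http://localhost:8080/v1/models
```

The returned list should include any model installed with the commands above.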
For more information, see 💻 Getting started
📰 Latest project news
- June 2025: Backend management has been added. Attention: extras images will be deprecated in the next release! Read the backend management PR.
- May 2025: Audio input and Reranking in llama.cpp backend, Realtime API, support for Gemma, SmolVLM, and more multimodal models (available in the gallery).
- May 2025: Important: image name changes See release
- Apr 2025: Rebrand, WebUI enhancements
- Apr 2025: LocalAGI and LocalRecall join the LocalAI family stack.
- Apr 2025: WebUI overhaul, AIO images updates
- Feb 2025: Backend cleanup, breaking changes, new backends (kokoro, OuteTTS, faster-whisper), Nvidia L4T images
- Jan 2025: LocalAI model release: https://huggingface.co/mudler/LocalAI-functioncall-phi-4-v0.3, SANA support in diffusers: https://github.com/mudler/LocalAI/pull/4603
- Dec 2024: stablediffusion.cpp backend (ggml) added ( https://github.com/mudler/LocalAI/pull/4289 )
- Nov 2024: Bark.cpp backend added ( https://github.com/mudler/LocalAI/pull/4287 )
- Nov 2024: Voice activity detection models (VAD) added to the API: https://github.com/mudler/LocalAI/pull/4204
- Oct 2024: examples moved to LocalAI-examples
- Aug 2024: 🆕 FLUX-1, P2P Explorer
- July 2024: 🔥🔥 🆕 P2P Dashboard, LocalAI Federated mode and AI Swarms: https://github.com/mudler/LocalAI/pull/2723. P2P Global community pools: https://github.com/mudler/LocalAI/issues/3113
- May 2024: 🔥🔥 Decentralized P2P llama.cpp: https://github.com/mudler/LocalAI/pull/2343 (peer2peer llama.cpp!) 👉 Docs https://localai.io/features/distribute/
- May 2024: 🔥🔥 Distributed inferencing: https://github.com/mudler/LocalAI/pull/2324
- April 2024: Reranker API: https://github.com/mudler/LocalAI/pull/2121
Roadmap items: List of issues
🚀 Features
- 🧩 Backend Gallery: Install/remove backends on the fly, powered by OCI images — fully customizable and API-driven.
- 📖 Text generation with GPTs (`llama.cpp`, `transformers`, `vllm` ... and more)
- 🗣 Text to Audio
- 🔈 Audio to Text (audio transcription with `whisper.cpp`; see the API sketch after this list)
- 🎨 Image generation
- 🔥 OpenAI-like tools API
- 🧠 Embeddings generation for vector databases
- ✍️ Constrained grammars
- 🖼️ Download models directly from Hugging Face
- 🥽 Vision API
- 📈 Reranker API
- 🆕🖧 P2P Inferencing
- Agentic capabilities
- 🔊 Voice activity detection (Silero-VAD support)
- 🌍 Integrated WebUI!
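These features are exposed through OpenAI-compatible endpoints as well. A hedged sketch of the transcription and embeddings calls (the file and model names below are illustrative placeholders, not defaults shipped with LocalAI):

```bash
# Audio transcription via the OpenAI-compatible endpoint, backed by
# whisper.cpp. "whisper-1" and sample.wav are illustrative placeholders;
# use an audio model you have installed and your own audio file.
curl http://localhost:8080/v1/audio/transcriptions \
  -F file="@sample.wav" \
  -F model="whisper-1"

# Embeddings generation for vector databases; the model name is an
# illustrative placeholder for an installed embeddings model.
curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "text-embedding-ada-002", "input": "LocalAI runs models locally"}'
```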
🔗 Community and integrations
Build and deploy custom containers:
WebUIs:
- https://github.com/Jirubizu/localai-admin
- https://github.com/go-skynet/LocalAI-frontend
- QA-Pilot (an interactive chat project that leverages LocalAI LLMs for rapid understanding and navigation of GitHub code repositories): https://github.com/reid41/QA-Pilot
Model galleries
Other:
- Helm chart https://github.com/go-skynet/helm-charts
- VSCode extension https://github.com/badgooooor/localai-vscode-plugin
- Langchain: https://python.langchain.com/docs/integrations/providers/localai/
- Terminal utility https://github.com/djcopley/ShellOracle
- Local Smart assistant https://github.com/mudler/LocalAGI
- Home Assistant https://github.com/sammcj/homeassistant-localai / https://github.com/drndos/hass-openai-custom-conversation / https://github.com/valentinfrlch/ha-gpt4vision
- Discord bot https://github.com/mudler/LocalAGI/tree/main/examples/discord
- Slack bot https://github.com/mudler/LocalAGI/tree/main/examples/slack
- Shell-Pilot (interact with LLMs using LocalAI models via pure shell scripts on your Linux or macOS system): https://github.com/reid41/shell-pilot
- Telegram bot https://github.com/mudler/LocalAI/tree/master/examples/telegram-bot
- Another Telegram Bot https://github.com/JackBekket/Hellper
- Auto-documentation https://github.com/JackBekket/Reflexia
- GitHub bot that answers issues, with code and documentation as context: https://github.com/JackBekket/GitHelper
- GitHub Actions: https://github.com/marketplace/actions/start-localai
- Examples: https://github.com/mudler/LocalAI/tree/master/examples/
🔗 Resources
- LLM finetuning guide
- How to build locally
- How to install in Kubernetes
- Projects integrating LocalAI
- How tos section (curated by our community)
📖 🎥 Media, Blogs, Social
- Run Visual studio code with LocalAI (SUSE)
- 🆕 Run LocalAI on Jetson Nano Devkit
- Run LocalAI on AWS EKS with Pulumi
- Run LocalAI on AWS
- Create a Slack bot for teams and OSS projects that answers documentation questions
- LocalAI meets k8sgpt
- Question Answering on Documents locally with LangChain, LocalAI, Chroma, and GPT4All
- Tutorial to use k8sgpt with LocalAI
Citation
If you utilize this repository or its data in a downstream project, please consider citing it with:
```bibtex
@misc{localai,
  author = {Ettore Di Giacinto},
  title = {LocalAI: The free, Open source OpenAI alternative},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/go-skynet/LocalAI}},
}
```
❤️ Sponsors
Do you find LocalAI useful?
Support the project by becoming a backer or sponsor. Your logo will show up here with a link to your website.
A huge thank you to our generous sponsors who support this project by covering CI expenses, and to everyone on our Sponsor list:
🌟 Star history
📖 License
LocalAI is a community-driven project created by Ettore Di Giacinto.
MIT - Author Ettore Di Giacinto mudler@localai.io
🙇 Acknowledgements
LocalAI couldn't have been built without the help of great software already available from the community. Thank you!
- llama.cpp
- https://github.com/tatsu-lab/stanford_alpaca
- https://github.com/cornelk/llama-go for the initial ideas
- https://github.com/antimatter15/alpaca.cpp
- https://github.com/EdVince/Stable-Diffusion-NCNN
- https://github.com/ggerganov/whisper.cpp
- https://github.com/rhasspy/piper
🤗 Contributors
This is a community project, a special thanks to our contributors! 🤗