{{template "views/partials/head" .}}
{{template "views/partials/navbar" .}}

Distributed inference with P2P

LocalAI uses P2P technologies to distribute work between peers. You can share an instance with Federation and/or split a model's weights across peers (available only with llama.cpp models). Share computational resources between your devices or with your friends!
{{ if and .IsP2PEnabled (eq .P2PToken "") }}

Warning: P2P mode is disabled or no token was specified

Enable P2P mode by starting LocalAI with --p2p. Restarting the server with --p2p automatically generates a token that nodes can use to discover each other. If you already have a token, specify it with export TOKEN="..". Check out the documentation for more information.
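For example, a minimal sketch of the two cases above (the --p2p flag and TOKEN variable come from this page; the `local-ai run` invocation is assumed from the LocalAI CLI, and the token value is a placeholder):

```shell
# Start LocalAI with P2P mode enabled; a new token is
# generated automatically and printed at startup.
local-ai run --p2p

# Or reuse an existing token so this instance joins the
# same P2P network as the node that generated it.
export TOKEN="<your-p2p-token>"   # placeholder value
local-ai run --p2p
```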

{{ else }}

Federated Nodes:

You can start LocalAI in federated mode to share your instance, or start a federated server to balance requests across the nodes of the federation.


Start a federated instance
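The two federation modes above can be sketched as follows (an assumption-laden sketch: the `--federated` flag and the `federated` subcommand are taken from the LocalAI documentation, not from this page):

```shell
# Share this instance with the federation
# (P2P mode must be enabled, see above).
local-ai run --p2p --federated

# Or run a federated server that balances incoming
# requests across the nodes of the federation.
local-ai federated
```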

Workers (llama.cpp):

You can start llama.cpp workers to split a model's weights between them and offload part of the computation. To start a new worker, use the CLI or Docker.


Start a new llama.cpp P2P worker
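Both ways of starting a worker can be sketched as follows (hedged: the `worker p2p-llama-cpp-rpc` subcommand is taken from the LocalAI documentation, and the Docker image tag and token value are illustrative placeholders):

```shell
# CLI: start a llama.cpp RPC worker that joins the P2P
# network identified by the shared token.
export TOKEN="<your-p2p-token>"   # placeholder value
local-ai worker p2p-llama-cpp-rpc

# Docker: the same worker running in a container,
# passing the token through the environment.
docker run -d --network host -e TOKEN="<your-p2p-token>" \
  localai/localai:latest worker p2p-llama-cpp-rpc
```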

{{ end }}
{{template "views/partials/footer" .}}