Update distributed_inferencing.md

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Ettore Di Giacinto 2024-07-22 17:35:10 +02:00 committed by GitHub
parent 7d61de63ae
commit 153e977155
GPG Key ID: B5690EEEBB952194

@@ -11,7 +11,7 @@ This functionality enables LocalAI to distribute inference requests across multi
 LocalAI supports two modes of distributed inferencing via p2p:
 - **Federated Mode**: Requests are shared between the cluster and routed to a single worker node in the network based on the load balancer's decision.
-- **Worker Mode**: Requests are processed by all the workers which contributes to the final inference result (by sharing the model weights).
+- **Worker Mode** (aka "model sharding" or "splitting weights"): Requests are processed by all the workers which contributes to the final inference result (by sharing the model weights).
 ## Usage