mirror of https://github.com/mudler/LocalAI.git
synced 2024-12-23 14:32:25 +00:00
Update distributed_inferencing.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
parent 7d61de63ae
commit 153e977155
@@ -11,7 +11,7 @@ This functionality enables LocalAI to distribute inference requests across multi
 LocalAI supports two modes of distributed inferencing via p2p:
 
 - **Federated Mode**: Requests are shared between the cluster and routed to a single worker node in the network based on the load balancer's decision.
-- **Worker Mode**: Requests are processed by all the workers which contributes to the final inference result (by sharing the model weights).
+- **Worker Mode** (aka "model sharding" or "splitting weights"): Requests are processed by all the workers which contributes to the final inference result (by sharing the model weights).
 
 ## Usage
 