## Set number of threads.
## Note: prefer the number of physical cores. Overbooking the CPU degrades performance notably.
# LOCALAI_THREADS=14
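## Illustrative example (hypothetical 8-core machine): match the thread count to the physical cores.
# LOCALAI_THREADS=8
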
## Specify a different bind address (defaults to ":8080")
# LOCALAI_ADDRESS=127.0.0.1:8080
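## Illustrative example: listen on all interfaces on a non-default port (values are placeholders).
# LOCALAI_ADDRESS=0.0.0.0:9090
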
## Default models context size
# LOCALAI_CONTEXT_SIZE=512
#
## Define galleries.
## Models available to install will be visible in `/models/available`
# LOCALAI_GALLERIES=[{"name":"model-gallery", "url":"github:go-skynet/model-gallery/index.yaml"}]
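## Illustrative examples: a larger context window (the model must support it) and an
## additional gallery entry (the second name/URL is a hypothetical placeholder).
# LOCALAI_CONTEXT_SIZE=4096
# LOCALAI_GALLERIES=[{"name":"model-gallery", "url":"github:go-skynet/model-gallery/index.yaml"}, {"name":"my-gallery", "url":"https://example.com/index.yaml"}]
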
## CORS settings
# LOCALAI_CORS=true
# LOCALAI_CORS_ALLOW_ORIGINS=*
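## Illustrative example: restrict CORS to specific origins (hypothetical domains; comma-separated list assumed).
# LOCALAI_CORS_ALLOW_ORIGINS=https://example.com,https://app.example.com
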
## Default path for models
#
# LOCALAI_MODELS_PATH=/models

## Enable debug mode
# LOCALAI_LOG_LEVEL=debug
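## Illustrative example: a less verbose log level (level name assumed).
# LOCALAI_LOG_LEVEL=info
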
## Disables COMPEL (Diffusers)
# COMPEL=0

## Enable/Disable single backend (useful if only one GPU is available)
# LOCALAI_SINGLE_ACTIVE_BACKEND=true

## Specify a build type. Available: cublas, openblas, clblas.
## cuBLAS: This is a GPU-accelerated version of the complete standard BLAS (Basic Linear Algebra Subprograms) library. It's provided by Nvidia and is part of their CUDA toolkit.
## OpenBLAS: This is an open-source implementation of the BLAS library that aims to provide highly optimized code for various platforms. It includes support for multi-threading and can be compiled to use hardware-specific features for additional performance. OpenBLAS can run on many kinds of hardware, including CPUs from Intel, AMD, and ARM.
## clBLAS: This is an open-source implementation of the BLAS library that uses OpenCL, a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors. clBLAS is designed to take advantage of the parallel computing power of GPUs but can also run on any hardware that supports OpenCL. This includes hardware from different vendors like Nvidia, AMD, and Intel.
# BUILD_TYPE=openblas
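## Illustrative example: GPU-accelerated build on an Nvidia card (requires the CUDA toolkit; typically combined with REBUILD=true below).
# BUILD_TYPE=cublas
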
## Uncomment and set to true to enable rebuilding from source
# REBUILD=true

## Enable go tags, available: stablediffusion, tts
## stablediffusion: image generation with stablediffusion
## tts: enables text-to-speech with go-piper
## (requires REBUILD=true)
#
# GO_TAGS=stablediffusion
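## Illustrative example: enable both optional backends (space-separated tags assumed; requires REBUILD=true).
# GO_TAGS=stablediffusion tts
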
## Path where generated images are stored
# LOCALAI_IMAGE_PATH=/tmp/generated/images

## Specify a default upload limit in MB (whisper)
# LOCALAI_UPLOAD_LIMIT=15

## List of external GRPC backends (note: on the container image this variable is already set to use the extra backends available in extra/)
# LOCALAI_EXTERNAL_GRPC_BACKENDS=my-backend:127.0.0.1:9000,my-backend2:/usr/bin/backend.py
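## Format: comma-separated NAME:TARGET pairs, where TARGET is either the host:port of a running gRPC backend or the path to a local backend executable (as in the example above).
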
### Advanced settings ###
### These are not used by LocalAI itself, but by other components in the stack ###
##
### Preload libraries
# LD_PRELOAD=

### Huggingface cache for models
# HUGGINGFACE_HUB_CACHE=/usr/local/huggingface

### Python backends GRPC max workers
### Default number of workers for GRPC Python backends.
### This effectively controls whether a backend can process multiple requests concurrently.
# PYTHON_GRPC_MAX_WORKERS=1
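## Illustrative example: allow each Python backend to serve two requests concurrently (worker count assumed).
# PYTHON_GRPC_MAX_WORKERS=2
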
### Define the number of parallel LLAMA.cpp workers (defaults to 1)
# LLAMACPP_PARALLEL=1
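## Illustrative example: four llama.cpp workers, typically paired with LOCALAI_PARALLEL_REQUESTS=true below.
# LLAMACPP_PARALLEL=4
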
### Enable to run parallel requests
# LOCALAI_PARALLEL_REQUESTS=true

### Watchdog settings
###
# Enables the watchdog to kill backends that have been inactive for too long
# LOCALAI_WATCHDOG_IDLE=true
#
# Time in duration format (e.g. 1h30m) after which a backend is considered idle
# LOCALAI_WATCHDOG_IDLE_TIMEOUT=5m
#
# Enables the watchdog to kill backends that have been busy for too long
# LOCALAI_WATCHDOG_BUSY=true
#
# Time in duration format (e.g. 1h30m) after which a backend is considered busy
# LOCALAI_WATCHDOG_BUSY_TIMEOUT=5m
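## Illustrative example: reap backends that have sat idle for more than 15 minutes (duration value assumed).
# LOCALAI_WATCHDOG_IDLE=true
# LOCALAI_WATCHDOG_IDLE_TIMEOUT=15m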