+++
disableToc = false
title = "🦙 AutoGPTQ"
weight = 3
+++
AutoGPTQ is an easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Prerequisites
This is an extra backend: it is already available in the container images, so no setup is needed there.
If you are building LocalAI locally, you need to install AutoGPTQ manually.
Model setup
Models are downloaded automatically from Hugging Face the first time they are used, if they are not already present. You can define a model in a YAML config file, or simply query the endpoint with the Hugging Face repository name as the model. For example, create a YAML config file in models/:
name: orca
backend: autogptq
model_base_name: "orca_mini_v2_13b-GPTQ-4bit-128g.no-act.order"
parameters:
  model: "TheBloke/orca_mini_v2_13b-GPTQ"
# ...
Test with:
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "orca",
  "messages": [{"role": "user", "content": "How are you?"}],
  "temperature": 0.1
}'
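The same request can also be sent from a script. The following is a minimal sketch using only the Python standard library. It assumes LocalAI is listening on localhost:8080 as in the curl example, and that the reply follows the OpenAI-style layout (`choices[0].message.content`). The helper names `build_chat_request` and `chat` are hypothetical and are not part of LocalAI.

```python
import json
import urllib.request

# Assumption: LocalAI listens on localhost:8080, as in the curl example.
BASE_URL = "http://localhost:8080"


def build_chat_request(model, prompt, temperature=0.1, base_url=BASE_URL):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model, prompt, **kwargs):
    """Send the request and return the text of the first reply."""
    request = build_chat_request(model, prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: OpenAI-style response layout.
    return body["choices"][0]["message"]["content"]


# Example call (needs a running LocalAI instance with the "orca" model):
# print(chat("orca", "How are you?"))
```

Since models can reportedly be addressed by their Hugging Face repository name, passing "TheBloke/orca_mini_v2_13b-GPTQ" as `model` instead of "orca" should exercise that path without a YAML file. Treat this as an assumption to verify against your LocalAI version.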