docs(transformers): add docs section about transformers

2024-12-24 23:06:42 +00:00 · 2024-03-15 18:02:15 +01:00 · 2024-03-15 18:02:15 +01:00 · 5b8d6a31e2
commit 5b8d6a31e2
parent f0752be4aa
1 changed files with 53 additions and 0 deletions
--- a/docs/content/docs/features/text-generation.md
+++ b/docs/content/docs/features/text-generation.md
@@ -272,3 +272,56 @@ curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d
   "temperature": 0.1, "top_p": 0.1
 }'
 ```
 ### Transformers
 [Transformers](https://huggingface.co/docs/transformers/index) is a state-of-the-art machine learning library for PyTorch, TensorFlow, and JAX.
 LocalAI has a built-in integration with Transformers and can use it to run models.
 This is an extra backend: the `extra` container images already include the Python dependencies for Transformers, so no additional setup is needed when using them.
 #### Setup
 Create a YAML config file for the model you want to run with the `transformers` backend.
 To set up a model, you only need to specify the model name in that file:
 ```yaml
 name: transformers
 backend: transformers
 parameters:
    model: "facebook/opt-125m"
 type: AutoModelForCausalLM
 quantization: bnb_4bit # One of: bnb_8bit, bnb_4bit, xpu_4bit (optional)
 ```
 The backend will automatically download the required files in order to run the model.
 #### Parameters
 ##### Type
 | Type | Description |
 | --- | --- |
 | `AutoModelForCausalLM` | A causal language model that can be used to generate sequences. |
 | (not set) | Defaults to `AutoModel`. |
 ##### Quantization
 | Quantization | Description |
 | --- | --- |
 | `bnb_8bit` | 8-bit quantization |
 | `bnb_4bit` | 4-bit quantization |
 | `xpu_4bit` | 4-bit quantization for Intel XPUs |
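
The fields from the Setup example and the values in the tables above can be checked before a config file is written. The following is a minimal sketch, assuming a hypothetical `render_model_config` helper that is not part of LocalAI. It emits only the fields shown in this section and rejects quantization values that are not listed in the table.

```python
# Hypothetical helper, not part of LocalAI: renders a model config using
# only the fields shown in the Setup example of this section.
ALLOWED_QUANTIZATION = {"bnb_8bit", "bnb_4bit", "xpu_4bit"}


def render_model_config(name, model, model_type=None, quantization=None):
    """Return YAML text for a model served by the `transformers` backend."""
    if quantization is not None and quantization not in ALLOWED_QUANTIZATION:
        raise ValueError(f"unsupported quantization: {quantization!r}")
    lines = [
        f"name: {name}",
        "backend: transformers",
        "parameters:",
        f'    model: "{model}"',
    ]
    # `type` and `quantization` are only written when given; per the Type
    # table, an unset type falls back to AutoModel on the backend side.
    if model_type is not None:
        lines.append(f"type: {model_type}")
    if quantization is not None:
        lines.append(f"quantization: {quantization}")
    return "\n".join(lines) + "\n"
```

Calling `render_model_config("transformers", "facebook/opt-125m", model_type="AutoModelForCausalLM", quantization="bnb_4bit")` produces the same fields as the Setup example, without the inline comment.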
 #### Usage
 Call the `completions` endpoint with `model` set to the `name` from the YAML config (`transformers` in the example above):
 ```
 curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
   "model": "transformers",
   "prompt": "Hello, my name is",
   "temperature": 0.1, "top_p": 0.1
 }'
 ```
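
The same request can be issued from a script. The following is a minimal sketch using only the Python standard library. The function names `build_completion_request` and `send` are illustrative and not part of LocalAI, and the defaults mirror the URL and payload of the curl call above. The response is returned as parsed JSON without assuming its exact fields.

```python
import json
import urllib.request


def build_completion_request(prompt, model="transformers",
                             base_url="http://localhost:8080",
                             temperature=0.1, top_p=0.1):
    """Build the same POST request as the curl example, without sending it."""
    payload = {
        "model": model,  # must match the `name` field of the YAML config
        "prompt": prompt,
        "temperature": temperature,
        "top_p": top_p,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a LocalAI instance listening on `localhost:8080`, `send(build_completion_request("Hello, my name is"))` performs the call. Keeping request construction separate from sending makes the payload easy to inspect without a running server.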