docs(transformers): add docs section about transformers (#1841)

@@ -272,3 +272,56 @@ curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d
### Transformers
[Transformers](https://huggingface.co/docs/transformers/index) is a state-of-the-art machine learning library for PyTorch, TensorFlow, and JAX.

LocalAI has a built-in integration with Transformers that can be used to run models.

This is an extra backend: it is already available in the container images (the `extra` images already include the Python dependencies for Transformers), so no additional setup is required.
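
For example, with Docker you can start LocalAI from an image that bundles the Python dependencies and mount a local `models` directory for your YAML configs. This is a minimal sketch: the `localai/localai` image, its `latest-aio-cpu` tag, and the `/build/models` path are assumptions here, so check the container images page for the exact tag (CPU, CUDA, extras) that matches your release:

```
# Start LocalAI on port 8080, using ./models as the models directory
docker run -p 8080:8080 -v $PWD/models:/build/models localai/localai:latest-aio-cpu
```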
#### Setup
Create a YAML file for the model you want to use with `transformers`.

To set up a model, you just need to specify the model name in the YAML config file:

```yaml
name: transformers
backend: transformers
parameters:
  model: "facebook/opt-125m"
type: AutoModelForCausalLM
quantization: bnb_4bit # One of: bnb_8bit, bnb_4bit, xpu_4bit (optional)
```
The backend will automatically download the required files in order to run the model.
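
Once the YAML file is in your models directory, you can check that LocalAI has registered the model through the OpenAI-compatible models listing endpoint (the weights themselves are typically fetched the first time the model is used). Assuming the default port and the `transformers` name from the config above:

```
curl http://localhost:8080/v1/models
```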
#### Parameters
##### Type
| Type | Description |
| --- | --- |
| `AutoModelForCausalLM` | A model class for causal (autoregressive) language modeling, used to generate text sequences. |
| N/A | If `type` is not specified, it defaults to `AutoModel`. |
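
As a sketch of the default case, a config that omits `type` altogether falls back to `AutoModel` (the `name` below is only an illustrative placeholder):

```yaml
name: transformers-automodel
backend: transformers
parameters:
  model: "facebook/opt-125m"
# no "type" set: defaults to AutoModel
```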
##### Quantization
| Quantization | Description |
| --- | --- |
| `bnb_8bit` | 8-bit quantization (bitsandbytes) |
| `bnb_4bit` | 4-bit quantization (bitsandbytes) |
| `xpu_4bit` | 4-bit quantization for Intel XPUs |
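
Note that the `bnb_*` options rely on the bitsandbytes library, which generally requires a CUDA-capable GPU. As an example, enabling 8-bit quantization only changes the `quantization` field of the config shown earlier:

```yaml
name: transformers
backend: transformers
parameters:
  model: "facebook/opt-125m"
type: AutoModelForCausalLM
quantization: bnb_8bit
```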
#### Usage
Use the `completions` endpoint by specifying the `transformers` model:
```
curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
   "model": "transformers",
   "prompt": "Hello, my name is",
   "temperature": 0.1, "top_p": 0.1
}'
```
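
The same model definition can usually also be driven through the chat endpoint; whether the raw `facebook/opt-125m` base model produces sensible chat replies is another matter, so treat this as a sketch of the request shape rather than a recommended setup:

```
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
   "model": "transformers",
   "messages": [{"role": "user", "content": "Hello, how are you?"}],
   "temperature": 0.1
}'
```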
|