mirror of
https://github.com/mudler/LocalAI.git
synced 2025-01-18 10:46:46 +00:00
Site Clean up - How to Clean up (#1342)
* Create easy-request.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update easy-request.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update easy-request.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update easy-request.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update easy-request.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update easy-request.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update easy-request-curl.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update easy-request-openai-v0.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update easy-request-openai-v1.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update easy-request.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Delete docs/content/howtos/easy-request-openai-v1.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Delete docs/content/howtos/easy-request-openai-v0.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Delete docs/content/howtos/easy-request-curl.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update and rename easy-model-import-downloaded.md to easy-model.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update _index.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update easy-setup-docker-cpu.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update easy-setup-docker-gpu.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update easy-setup-docker-gpu.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update easy-setup-docker-cpu.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Delete docs/content/howtos/autogen-setup.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update _index.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Delete docs/content/howtos/easy-request-autogen.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update easy-model.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update _index.en.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update _index.en.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update _index.en.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update _index.en.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
* Update _index.md Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

---------

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
parent 2b2007ae9e
commit 6b312a8522
@@ -14,7 +14,7 @@ Here are answers to some of the most common questions.

<details>

Most ggml-based models should work, but newer models may require additions to the API. If a model doesn't work, please feel free to open an issue. However, be cautious about downloading models from the internet directly onto your machine, as there may be security vulnerabilities in llama.cpp or ggml that could be maliciously exploited. Some models can be found on Hugging Face: https://huggingface.co/models?search=ggml, or models from gpt4all are compatible too: https://github.com/nomic-ai/gpt4all.
Most gguf-based models should work, but newer models may require additions to the API. If a model doesn't work, please feel free to open an issue. However, be cautious about downloading models from the internet directly onto your machine, as there may be security vulnerabilities in llama.cpp or ggml that could be maliciously exploited. Some models can be found on Hugging Face: https://huggingface.co/models?search=gguf, or models from gpt4all are compatible too: https://github.com/nomic-ai/gpt4all.

</details>

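Because the answer above warns against loading arbitrary downloads, one cheap sanity check before copying a file into `models/` is to confirm that it at least starts with the GGUF magic bytes (`GGUF`). The sketch below is illustrative only: the helper name `looks_like_gguf` is hypothetical, and passing the check says nothing about whether a file is safe, only that it is not obviously the wrong format.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def looks_like_gguf(path):
    """Return True if the file at `path` starts with the GGUF magic bytes."""
    with Path(path).open("rb") as handle:
        return handle.read(len(GGUF_MAGIC)) == GGUF_MAGIC
```

A call such as `looks_like_gguf("models/your-model.gguf")` could run before the model is copied into place; older `.bin` ggml files would be expected to fail this check.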
@@ -26,7 +26,7 @@ To run with GPU Acceleration, see [GPU acceleration]({{%relref "features/gpu-ac
mkdir models

# copy your models to it
cp your-model.bin models/
cp your-model.gguf models/

# run the LocalAI container
docker run -p 8080:8080 -v $PWD/models:/models -ti --rm quay.io/go-skynet/local-ai:latest --models-path /models --context-size 700 --threads 4
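The `docker run` invocation above can also be assembled from a script, which helps when the context size or thread count should vary per host. This is a minimal sketch using only the flags shown above; the function name `localai_docker_args` and its defaults are assumptions for illustration, and nothing is executed until the list is handed to `subprocess.run`.

```python
import os


def localai_docker_args(models_dir, context_size=700, threads=4, port=8080):
    """Build the argv list for the `docker run` command shown above."""
    models_dir = os.path.abspath(models_dir)
    return [
        "docker", "run",
        "-p", f"{port}:8080",            # host port -> container port
        "-v", f"{models_dir}:/models",   # mount the local models directory
        "-ti", "--rm",
        "quay.io/go-skynet/local-ai:latest",
        "--models-path", "/models",
        "--context-size", str(context_size),
        "--threads", str(threads),
    ]
```

For example, `subprocess.run(localai_docker_args("models", threads=8), check=True)` would start the container with eight threads.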
@@ -43,7 +43,7 @@ docker run -p 8080:8080 -v $PWD/models:/models -ti --rm quay.io/go-skynet/local-

# Try the endpoint with curl
curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
     "model": "your-model.bin",
     "model": "your-model.gguf",
     "prompt": "A long time ago in a galaxy far, far away",
     "temperature": 0.7
   }'
@@ -67,7 +67,7 @@ cd LocalAI
# git checkout -b build <TAG>

# copy your models to models/
cp your-model.bin models/
cp your-model.gguf models/

# (optional) Edit the .env file to set things like context size and threads
# vim .env
@@ -79,10 +79,10 @@ docker compose up -d --pull always

# Now API is accessible at localhost:8080
curl http://localhost:8080/v1/models
# {"object":"list","data":[{"id":"your-model.bin","object":"model"}]}
# {"object":"list","data":[{"id":"your-model.gguf","object":"model"}]}

curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
     "model": "your-model.bin",
     "model": "your-model.gguf",
     "prompt": "A long time ago in a galaxy far, far away",
     "temperature": 0.7
   }'

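The `/v1/models` response shown above is plain JSON, so a client can read the model ids out of it before sending a completion request. This is a minimal sketch; the helper name `model_ids` is hypothetical, and the sample body is the one from the comment above.

```python
import json


def model_ids(models_response):
    """Extract model ids from a /v1/models response body (JSON text or dict)."""
    if isinstance(models_response, (str, bytes)):
        models_response = json.loads(models_response)
    return [entry["id"] for entry in models_response.get("data", [])]


sample = '{"object":"list","data":[{"id":"your-model.gguf","object":"model"}]}'
print(model_ids(sample))  # ['your-model.gguf']
```

The id returned here is the value to pass as `"model"` in the completion request.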
@@ -10,14 +10,10 @@ This section includes LocalAI end-to-end examples, tutorial and how-tos curated

- [Setup LocalAI with Docker on CPU]({{%relref "howtos/easy-setup-docker-cpu" %}})
- [Setup LocalAI with Docker With CUDA]({{%relref "howtos/easy-setup-docker-gpu" %}})
- [Setting up a Model]({{%relref "howtos/easy-model-import-downloaded" %}})
- [Making requests via Autogen]({{%relref "howtos/easy-request-autogen" %}})
- [Making requests via OpenAI API V0]({{%relref "howtos/easy-request-openai-v0" %}})
- [Making requests via OpenAI API V1]({{%relref "howtos/easy-request-openai-v1" %}})
- [Making requests via Curl]({{%relref "howtos/easy-request-curl" %}})
- [Setting up a Model]({{%relref "howtos/easy-model" %}})
- [Making requests to LocalAI]({{%relref "howtos/easy-request" %}})

## Programs and Demos

This section includes other programs and explains how to set up, install, and use them with LocalAI.
- [Python LocalAI Demo]({{%relref "howtos/easy-setup-full" %}}) - [lunamidori5](https://github.com/lunamidori5)
- [Autogen]({{%relref "howtos/autogen-setup" %}}) - [lunamidori5](https://github.com/lunamidori5)

@@ -1,91 +0,0 @@

+++
disableToc = false
title = "Easy Demo - AutoGen"
weight = 2
+++

This is a short demo of setting up ``LocalAI`` with Autogen. It assumes you already have a model set up.

```python
import os
import openai
import autogen

openai.api_key = "sx-xxx"
OPENAI_API_KEY = "sx-xxx"
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY

config_list_json = [
    {
        "model": "gpt-3.5-turbo",
        "api_base": "http://[YOURLOCALAIIPHERE]:8080/v1",
        "api_type": "open_ai",
        "api_key": "NULL",
    }
]

print("models to use: ", [config_list_json[i]["model"] for i in range(len(config_list_json))])

llm_config = {"config_list": config_list_json, "seed": 42}
user_proxy = autogen.UserProxyAgent(
    name="Admin",
    system_message="A human admin. Interact with the planner to discuss the plan. Plan execution needs to be approved by this admin.",
    code_execution_config={
        "work_dir": "coding",
        "last_n_messages": 8,
        "use_docker": "python:3",
    },
    human_input_mode="ALWAYS",
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
)
engineer = autogen.AssistantAgent(
    name="Coder",
    llm_config=llm_config,
)
scientist = autogen.AssistantAgent(
    name="Scientist",
    llm_config=llm_config,
    system_message="""Scientist. You follow an approved plan. You are able to categorize papers after seeing their abstracts printed. You don't write code."""
)
planner = autogen.AssistantAgent(
    name="Planner",
    system_message='''Planner. Suggest a plan. Revise the plan based on feedback from admin and critic, until admin approval.
The plan may involve an engineer who can write code and a scientist who doesn't write code.
Explain the plan first. Be clear which step is performed by an engineer, and which step is performed by a scientist.
''',
    llm_config=llm_config,
)
executor = autogen.UserProxyAgent(
    name="Executor",
    system_message="Executor. Execute the code written by the engineer and report the result.",
    human_input_mode="NEVER",
    code_execution_config={
        "work_dir": "coding",
        "last_n_messages": 8,
        "use_docker": "python:3",
    }
)
critic = autogen.AssistantAgent(
    name="Critic",
    system_message="Critic. Double check plan, claims, code from other agents and provide feedback. Check whether the plan includes adding verifiable info such as source URL.",
    llm_config=llm_config,
)
groupchat = autogen.GroupChat(agents=[user_proxy, engineer, scientist, planner, executor, critic], messages=[], max_round=999)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)


#autogen.ChatCompletion.start_logging()

#text_input = input("Please enter request: ")
text_input = ("Change this to a task you would like the group chat to do or comment this out and uncomment the other line!")

#Uncomment one of these two chats based on what you would like to do

#user_proxy.initiate_chat(engineer, message=str(text_input))
#For a one on one chat use this one ^

#user_proxy.initiate_chat(manager, message=str(text_input))
#To setup a group chat use this one ^
```

@@ -59,9 +59,6 @@ What this does is tell ``LocalAI`` how to load the model. Then we are going to *
name: lunademo
parameters:
  model: luna-ai-llama2-uncensored.Q4_K_M.gguf
  temperature: 0.2
  top_k: 40
  top_p: 0.65
```

Now that we have the model set up, there are a few things we should add to the YAML file to make it run better. This model uses the following roles.
@@ -100,9 +97,6 @@ context_size: 2000
name: lunademo
parameters:
  model: luna-ai-llama2-uncensored.Q4_K_M.gguf
  temperature: 0.2
  top_k: 40
  top_p: 0.65
roles:
  assistant: 'ASSISTANT:'
  system: 'SYSTEM:'
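The `roles` mapping pairs each chat role with the prefix the model expects to see in its prompt. As a rough illustration of that idea (not LocalAI's actual templating code), the sketch below prefixes each message with its role marker. The `user` prefix and the function name `render_prompt` are assumptions, since the hunk above only shows the `assistant` and `system` entries.

```python
ROLES = {
    "assistant": "ASSISTANT:",
    "system": "SYSTEM:",
    "user": "USER:",  # assumed value; not visible in the hunk above
}


def render_prompt(messages, roles=ROLES):
    """Join chat messages into one prompt, prefixing each with its role marker."""
    lines = [f"{roles[m['role']]} {m['content']}" for m in messages]
    # End with the assistant marker so the model continues as the assistant.
    lines.append(roles["assistant"])
    return "\n".join(lines)


print(render_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How are you?"},
]))
```

If the markers do not match what the model was trained on, replies tend to degrade, which is why the config spells them out per model.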
@@ -112,7 +106,7 @@ template:
  completion: lunademo-completion
```

Now that we have that set up, let's test it out by sending a request using [Curl]({{%relref "easy-request-curl" %}}) or the [OpenAI Python API]({{%relref "easy-request-openai-v1" %}})!
Now that we have that set up, let's test it out by sending a [request]({{%relref "easy-request" %}}) to LocalAI!

## Adv Stuff
Alright, now that we have learned how to set up our own models, here is how to use the gallery to do a lot of this for us. This command will download and set up the model (mostly; we will **always** need to edit our YAML file to fit our computer / hardware)

@@ -1 +0,0 @@

@@ -1,35 +0,0 @@

+++
disableToc = false
title = "Easy Request - Curl"
weight = 2
+++

Now we can make a curl request!

Curl Chat API -

```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
     "model": "lunademo",
     "messages": [{"role": "user", "content": "How are you?"}],
     "temperature": 0.9
   }'
```

Curl Completion API -

```bash
curl --request POST \
  --url http://localhost:8080/v1/completions \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "lunademo",
    "prompt": "function downloadFile(string url, string outputPath) {",
    "max_tokens": 256,
    "temperature": 0.5
}'
```

See [OpenAI API](https://platform.openai.com/docs/api-reference) for more info!
Have fun using LocalAI!

@@ -1,50 +0,0 @@

+++
disableToc = false
title = "Easy Request - Openai V0"
weight = 2
+++

This is for Python, ``OpenAI``=``0.28.1``. If you are on ``OpenAI``>=``V1``, please use this [How to]({{%relref "howtos/easy-request-openai-v1" %}})

OpenAI Chat API Python -

```python
import os
import openai
openai.api_base = "http://localhost:8080/v1"
openai.api_key = "sx-xxx"
OPENAI_API_KEY = "sx-xxx"
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY

completion = openai.ChatCompletion.create(
    model="lunademo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "How are you?"}
    ]
)

print(completion.choices[0].message.content)
```

OpenAI Completion API Python -

```python
import os
import openai
openai.api_base = "http://localhost:8080/v1"
openai.api_key = "sx-xxx"
OPENAI_API_KEY = "sx-xxx"
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY

completion = openai.Completion.create(
    model="lunademo",
    prompt="function downloadFile(string url, string outputPath) ",
    max_tokens=256,
    temperature=0.5)

print(completion.choices[0].text)
```
See [OpenAI API](https://platform.openai.com/docs/api-reference) for more info!
Have fun using LocalAI!

@@ -1,28 +0,0 @@

+++
disableToc = false
title = "Easy Request - Openai V1"
weight = 2
+++

This is for Python, ``OpenAI``>=``V1``. If you are on ``OpenAI``<``V1``, please use this [How to]({{%relref "howtos/easy-request-openai-v0" %}})

OpenAI Chat API Python -
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-xxx")

messages = [
    {"role": "system", "content": "You are LocalAI, a helpful, but really confused ai, you will only reply with confused emotes"},
    {"role": "user", "content": "Hello How are you today LocalAI"}
]
completion = client.chat.completions.create(
    model="lunademo",
    messages=messages,
)

print(completion.choices[0].message)
```
See [OpenAI API](https://platform.openai.com/docs/api-reference) for more info!
Have fun using LocalAI!

85
docs/content/howtos/easy-request.md
Normal file
@@ -0,0 +1,85 @@

+++
disableToc = false
title = "Easy Request - All"
weight = 2
+++

## Curl Request

Curl Chat API -

```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
     "model": "lunademo",
     "messages": [{"role": "user", "content": "How are you?"}],
     "temperature": 0.9
   }'
```

## OpenAI V1 - Recommended

This is for Python, ``OpenAI``>=``V1``

OpenAI Chat API Python -
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-xxx")

messages = [
    {"role": "system", "content": "You are LocalAI, a helpful, but really confused ai, you will only reply with confused emotes"},
    {"role": "user", "content": "Hello How are you today LocalAI"}
]
completion = client.chat.completions.create(
    model="lunademo",
    messages=messages,
)

print(completion.choices[0].message)
```
See [OpenAI API](https://platform.openai.com/docs/api-reference) for more info!

## OpenAI V0 - Not Recommended

This is for Python, ``OpenAI``=``0.28.1``

OpenAI Chat API Python -

```python
import os
import openai
openai.api_base = "http://localhost:8080/v1"
openai.api_key = "sx-xxx"
OPENAI_API_KEY = "sx-xxx"
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY

completion = openai.ChatCompletion.create(
    model="lunademo",
    messages=[
        {"role": "system", "content": "You are LocalAI, a helpful, but really confused ai, you will only reply with confused emotes"},
        {"role": "user", "content": "How are you?"}
    ]
)

print(completion.choices[0].message.content)
```

OpenAI Completion API Python -

```python
import os
import openai
openai.api_base = "http://localhost:8080/v1"
openai.api_key = "sx-xxx"
OPENAI_API_KEY = "sx-xxx"
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY

completion = openai.Completion.create(
    model="lunademo",
    prompt="function downloadFile(string url, string outputPath) ",
    max_tokens=256,
    temperature=0.5)

print(completion.choices[0].text)
```

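If neither `curl` nor the `openai` package is available, the same chat request can be made with the Python standard library. This is a minimal sketch, assuming LocalAI is listening on `localhost:8080` and the `lunademo` model is configured. The helper names are hypothetical, and `first_reply` assumes the OpenAI-style response shape (`choices[0].message.content`).

```python
import json
import urllib.request


def build_chat_request(model, messages, temperature=0.9,
                       base_url="http://localhost:8080/v1"):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {"model": model, "messages": messages, "temperature": temperature}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def first_reply(response_body):
    """Return the text of the first choice in a chat completion response dict."""
    return response_body["choices"][0]["message"]["content"]


def ask(model, question):
    """Send one user message and return the reply (needs a running server)."""
    request = build_chat_request(model, [{"role": "user", "content": question}])
    with urllib.request.urlopen(request) as response:
        return first_reply(json.load(response))
```

With the server running, `print(ask("lunademo", "How are you?"))` mirrors the curl chat request above.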
@@ -102,7 +102,8 @@ services:
Make sure to save that in the root of the `LocalAI` folder. Then let's spin up Docker; run this in `CMD` or `Bash`:

```bash
docker-compose up -d --pull always
docker-compose up -d --pull always ##Windows
docker compose up -d --pull always ##Linux
```


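Which spelling works depends on how Compose is installed rather than strictly on the operating system: the standalone binary is `docker-compose`, while the plugin is invoked as `docker compose`. A script can pick whichever is present. This is a minimal sketch; the function name `compose_up_command` is hypothetical, and it only builds the command without running it.

```python
import shutil


def compose_up_command(which=shutil.which):
    """Return an argv list for `up -d --pull always` with whichever Compose is found."""
    tail = ["up", "-d", "--pull", "always"]
    if which("docker-compose"):
        # The standalone binary is on PATH.
        return ["docker-compose", *tail]
    # Otherwise assume the `docker compose` plugin form.
    return ["docker", "compose", *tail]
```

Passing the result to `subprocess.run(compose_up_command(), check=True)` from the `LocalAI` folder would start the stack.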
@@ -128,4 +129,4 @@ Output will look like this:

![](https://cdn.discordapp.com/attachments/1116933141895053322/1134037542845566976/image.png)

Now that we have that set up, let's go set up a [model]({{%relref "easy-model-import-downloaded" %}})
Now that we have that set up, let's go set up a [model]({{%relref "easy-model" %}})

@@ -117,7 +117,8 @@ services:
Make sure to save that in the root of the `LocalAI` folder. Then let's spin up Docker; run this in `CMD` or `Bash`:

```bash
docker-compose up -d --pull always
docker-compose up -d --pull always ##Windows
docker compose up -d --pull always ##Linux
```


@@ -143,4 +144,4 @@ Output will look like this:

![](https://cdn.discordapp.com/attachments/1116933141895053322/1134037542845566976/image.png)

Now that we have that set up, let's go set up a [model]({{%relref "easy-model-import-downloaded" %}})
Now that we have that set up, let's go set up a [model]({{%relref "easy-model" %}})

@@ -15,7 +15,7 @@ LocalAI will attempt to automatically load models which are not explicitly confi

### Hardware requirements

Depending on the model you are attempting to run, you might need more RAM or CPU resources. See also [here](https://github.com/ggerganov/llama.cpp#memorydisk-requirements) for `ggml`-based backends. `rwkv` is less expensive on resources.
Depending on the model you are attempting to run, you might need more RAM or CPU resources. See also [here](https://github.com/ggerganov/llama.cpp#memorydisk-requirements) for `gguf`-based backends. `rwkv` is less expensive on resources.

### Model compatibility table

@@ -25,7 +25,7 @@ GPT and text generation models might have a license which is not permissive for

## Useful Links and resources

- [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) - here you can find a list of the best-performing models on the Open LLM benchmark. Keep in mind that models compatible with LocalAI must be quantized in the `ggml` format.
- [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) - here you can find a list of the best-performing models on the Open LLM benchmark. Keep in mind that models compatible with LocalAI must be quantized in the `gguf` format.


## Model repositories
