lollms/README.md
2023-11-29 15:36:39 +01:00

304 lines
11 KiB
Markdown

# Lord of Large Language Models (LoLLMs)
<div align="center">
<img src="https://github.com/ParisNeo/lollms/blob/main/lollms/assets/logo.png" alt="Logo" width="200" height="200">
</div>
![GitHub license](https://img.shields.io/github/license/ParisNeo/lollms)
![GitHub issues](https://img.shields.io/github/issues/ParisNeo/lollms)
![GitHub stars](https://img.shields.io/github/stars/ParisNeo/lollms)
![GitHub forks](https://img.shields.io/github/forks/ParisNeo/lollms)
[![Discord](https://img.shields.io/discord/1092918764925882418?color=7289da&label=Discord&logo=discord&logoColor=ffffff)](https://discord.gg/4rR282WJb6)
[![Follow me on Twitter](https://img.shields.io/twitter/follow/SpaceNerduino?style=social)](https://twitter.com/SpaceNerduino)
[![Follow Me on YouTube](https://img.shields.io/badge/Follow%20Me%20on-YouTube-red?style=flat&logo=youtube)](https://www.youtube.com/user/Parisneo)
[![Downloads](https://static.pepy.tech/badge/lollms)](https://pepy.tech/project/lollms)
[![Downloads](https://static.pepy.tech/badge/lollms/month)](https://pepy.tech/project/lollms)
[![Downloads](https://static.pepy.tech/badge/lollms/week)](https://pepy.tech/project/lollms)
Lord of Large Language Models (LoLLMs) Server is a text generation server based on large language models. It provides a Flask-based API for generating text using various pre-trained language models. This server is designed to be easy to install and use, allowing developers to integrate powerful text generation capabilities into their applications.
## Features
- Fully integrated library with access to bindings, personalities and helper tools.
- Generate text using large language models.
- Supports multiple personalities for generating text with different styles and tones.
- Real-time text generation with WebSocket-based communication.
- RESTful API for listing personalities and adding new personalities.
- Easy integration with various applications and frameworks.
- Possibility to send files to personalities
- Possibility to run on multiple nodes and provide a generation service to many outputs at once.
- Data stays local even in the remote version. Only generations are sent to the host node. The logs, data and discussion history are kept in your local disucssion folder.
## Installation
You can install LoLLMs using pip, the Python package manager. Open your terminal or command prompt and run the following command:
```bash
pip install --upgrade lollms
```
Or if you want to get the latest version from the git:
```bash
pip install --upgrade git+https://github.com/ParisNeo/lollms.git
```
## GPU support
If you want to use cuda. Either install it directly or use conda to install everything:
```bash
conda create --name lollms python=3.10
```
Activate the environment
```bash
conda activate lollms
```
Install cudatoolkit
```bash
conda install -c anaconda cudatoolkit
```
Install lollms
```bash
pip install --upgrade lollms
```
Now you are ready.
To simply configure your environment run the settings app:
```bash
lollms-settings
```
The tool is intuitive and will guide you through configuration process.
The first time you will be prompted to select a binding.
![image](https://github.com/ParisNeo/lollms/assets/827993/2d7f58fe-089d-4d3e-a21a-0609f8e27969)
Once the binding is selected, you have to install at least a model. You have two options:
1- install from internet. Just give the link to a model on hugging face. For example. if you select the default llamacpp python bindings (7), you can install this model:
```bash
https://huggingface.co/TheBloke/airoboros-7b-gpt4-GGML/resolve/main/airoboros-7b-gpt4.ggmlv3.q4_1.bin
```
2- install from local drive. Just give the path to a model on your pc. The model will not be copied. We only create a reference to the model. This is useful if you use multiple clients so that you can mutualize your models with other tools.
Now you are ready to use the server.
## Library example
Here is the smallest possible example that allows you to use the full potential of the tool with nearly no code
```python
from lollms.console import Conversation
cv = Conversation(None)
cv.start_conversation()
```
Now you can reimplement the start_conversation method to do the things you want:
```python
from lollms.console import Conversation
class MyConversation(Conversation):
def __init__(self, cfg=None):
super().__init__(cfg, show_welcome_message=False)
def start_conversation(self):
prompt = "Once apon a time"
def callback(text, type=None):
print(text, end="", flush=True)
return True
print(prompt, end="", flush=True)
output = self.safe_generate(prompt, callback=callback)
if __name__ == '__main__':
cv = MyConversation()
cv.start_conversation()
```
Or if you want here is a conversation tool written in few lines
```python
from lollms.console import Conversation
class MyConversation(Conversation):
def __init__(self, cfg=None):
super().__init__(cfg, show_welcome_message=False)
def start_conversation(self):
full_discussion=""
while True:
prompt = input("You: ")
if prompt=="exit":
return
if prompt=="menu":
self.menu.main_menu()
full_discussion += self.personality.user_message_prefix+prompt+self.personality.link_text
full_discussion += self.personality.ai_message_prefix
def callback(text, type=None):
print(text, end="", flush=True)
return True
print(self.personality.name+": ",end="",flush=True)
output = self.safe_generate(full_discussion, callback=callback)
full_discussion += output.strip()+self.personality.link_text
print()
if __name__ == '__main__':
cv = MyConversation()
cv.start_conversation()
```
Here we use the safe_generate method that does all the cropping for you ,so you can chat forever and will never run out of context.
## Socket IO Server Usage
Once installed, you can start the LoLLMs Server using the `lollms-server` command followed by the desired parameters.
```
lollms-server --host <host> --port <port> --config <config_file> --bindings_path <bindings_path> --personalities_path <personalities_path> --models_path <models_path> --binding_name <binding_name> --model_name <model_name> --personality_full_name <personality_full_name>
```
### Parameters
- `--host`: The hostname or IP address to bind the server (default: localhost).
- `--port`: The port number to run the server (default: 9600).
- `--config`: Path to the configuration file (default: None).
- `--bindings_path`: The path to the Bindings folder (default: "./bindings_zoo").
- `--personalities_path`: The path to the personalities folder (default: "./personalities_zoo").
- `--models_path`: The path to the models folder (default: "./models").
- `--binding_name`: The default binding to be used (default: "llama_cpp_official").
- `--model_name`: The default model name (default: "Manticore-13B.ggmlv3.q4_0.bin").
- `--personality_full_name`: The full name of the default personality (default: "personality").
### Examples
Start the server with default settings:
```
lollms-server
```
Start the server on a specific host and port:
```
lollms-server --host 0.0.0.0 --port 5000
```
## API Endpoints
### WebSocket Events
- `connect`: Triggered when a client connects to the server.
- `disconnect`: Triggered when a client disconnects from the server.
- `list_personalities`: List all available personalities.
- `add_personality`: Add a new personality to the server.
- `generate_text`: Generate text based on the provided prompt and selected personality.
### RESTful API
- `GET /personalities`: List all available personalities.
- `POST /personalities`: Add a new personality to the server.
Sure! Here are examples of how to communicate with the LoLLMs Server using JavaScript and Python.
### JavaScript Example
```javascript
// Establish a WebSocket connection with the server
const socket = io.connect('http://localhost:9600');
// Event: When connected to the server
socket.on('connect', () => {
console.log('Connected to the server');
// Request the list of available personalities
socket.emit('list_personalities');
});
// Event: Receive the list of personalities from the server
socket.on('personalities_list', (data) => {
const personalities = data.personalities;
console.log('Available Personalities:', personalities);
// Select a personality and send a text generation request
const selectedPersonality = personalities[0];
const prompt = 'Once upon a time...';
socket.emit('generate_text', { personality: selectedPersonality, prompt: prompt });
});
// Event: Receive the generated text from the server
socket.on('text_generated', (data) => {
const generatedText = data.text;
console.log('Generated Text:', generatedText);
// Do something with the generated text
});
// Event: When disconnected from the server
socket.on('disconnect', () => {
console.log('Disconnected from the server');
});
```
### Python Example
```python
import socketio
# Create a SocketIO client
sio = socketio.Client()
# Event: When connected to the server
@sio.on('connect')
def on_connect():
print('Connected to the server')
# Request the list of available personalities
sio.emit('list_personalities')
# Event: Receive the list of personalities from the server
@sio.on('personalities_list')
def on_personalities_list(data):
personalities = data['personalities']
print('Available Personalities:', personalities)
# Select a personality and send a text generation request
selected_personality = personalities[0]
prompt = 'Once upon a time...'
sio.emit('generate_text', {'personality': selected_personality, 'prompt': prompt})
# Event: Receive the generated text from the server
@sio.on('text_generated')
def on_text_generated(data):
generated_text = data['text']
print('Generated Text:', generated_text)
# Do something with the generated text
# Event: When disconnected from the server
@sio.on('disconnect')
def on_disconnect():
print('Disconnected from the server')
# Connect to the server
sio.connect('http://localhost:9600')
# Keep the client running
sio.wait()
```
Make sure to have the necessary dependencies installed for the JavaScript and Python examples. For JavaScript, you need the `socket.io-client` package, and for Python, you need the `python-socketio` package.
## Contributing
Contributions to the LoLLMs Server project are welcome and appreciated. If you would like to contribute, please follow the guidelines outlined in the [CONTRIBUTING.md](https://github.com/ParisNeo/lollms/blob/main/CONTRIBUTING.md) file.
## License
LoLLMs Server is licensed under the Apache 2.0 License. See the [LICENSE](https://github.com/ParisNeo/lollms/blob/main/LICENSE) file for more information.
## Repository
The source code for LoLLMs Server can be found on GitHub