# Selecting a Local Large Language Model

## Step 1: Determine Your Requirements
1. **Purpose**:
   - What will you use the model for? (e.g., general-purpose assistant, code completion, specific tasks)

     All of the above.
2. **Hardware Specifications**:
   - **GPU**: Does your laptop have a dedicated GPU (e.g., NVIDIA with CUDA support)?

     Yes, it does. It's an older-generation NVIDIA GPU.

   - **RAM**: How much system memory is available?

     32 GB of RAM.

   - **Disk Space**: How much storage can you allocate for the model and its dependencies?

     1 TB or more. (Quick commands to verify these specs are sketched after this list.)
3. **Ease of Use**:
   - Are you comfortable setting up frameworks like Docker or Python environments?

     Yes. Please keep everything in Docker, including Python. (A containerized Ollama setup is sketched after this list.)
4. **Language Model Size**:
   - Do you need a lightweight model (<10 GB) for responsiveness or a large model for complex tasks?

     How about a mixture, so I can switch between them based on the task? (The Docker sketch below also shows switching models with Ollama.)

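Before settling on models, it's worth confirming what the laptop can actually serve. Below is a minimal sketch of checks for the GPU, RAM, and disk answers above, assuming a Linux laptop with the NVIDIA drivers already installed; on an older card, the VRAM reported by `nvidia-smi` is usually the binding constraint, more than system RAM.

```bash
# Confirm the GPU is visible and check its VRAM; VRAM determines
# which quantized models can run fully on the GPU.
nvidia-smi

# Confirm the 32 GB of system RAM and how much is currently free.
free -h

# Confirm free disk space for model weights (each model below is 1-5 GB).
df -h
```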
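Since the preference is to keep everything in Docker, one setup is sketched below, following the pattern from Ollama's official Docker image. It assumes the NVIDIA Container Toolkit is installed on the host; the volume and container names are arbitrary. This also covers the "mixture of models" idea, since Ollama loads and unloads models on demand:

```bash
# Run the Ollama server in a container with GPU passthrough
# (requires the NVIDIA Container Toolkit on the host).
docker run -d --gpus=all \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

# Pull a lightweight model and a heavier one, then switch per task.
docker exec -it ollama ollama pull deepseek-coder:6.7b-base
docker exec -it ollama ollama pull codellama:7b-instruct

# Run whichever model suits the task at hand; Ollama swaps models
# in and out of memory automatically.
docker exec -it ollama ollama run codellama:7b-instruct
```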
## Step 2: Evaluate Available Models (Optional for Editing)

Here are some options to consider:

- **GPT-Based Models**: GPT-J, GPT-NeoX
- **LLaMA (Meta)**: LLaMA 2 (7B, 13B, 70B)
- **Falcon**: Falcon 7B / 40B
- **Mistral**: Lightweight and efficient 7B
- **Vicuna**: Fine-tuned for conversational tasks

I have Ollama installed and I've downloaded a few models. Here are the Ollama models I have so far:
```
NAME                              ID              SIZE      MODIFIED
codellama:7b-code                 8df0a30bb1e6    3.8 GB    44 hours ago
codellama:7b-instruct             8fdf8f752f6e    3.8 GB    44 hours ago
maryasov/llama3.1-cline:latest    2ecd12227dcf    4.7 GB    45 hours ago
hhao/qwen2.5-coder-tools:latest   1fbf62f22cd4    4.7 GB    45 hours ago
deepseek-coder:6.7b-base          585a5cb3b219    3.8 GB    3 weeks ago
qwen2.5-coder:7b-base             bd8755145f1c    4.7 GB    3 weeks ago
starcoder:7b                      53fdbc3a2006    4.3 GB    3 weeks ago
deepseek-coder:base               3b417b786925    776 MB    3 weeks ago
```
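Any of these can be smoke-tested without a UI by calling Ollama's REST API directly. A quick sketch against one of the models above (the prompt is just an example):

```bash
# Ask a downloaded model for a completion via Ollama's HTTP API.
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder:7b-base",
  "prompt": "Write a function that reverses a string.",
  "stream": false
}'
```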
## Step 3: Tools for Running Models Locally

1. **Text-Generation-WebUI**: A user-friendly web interface for loading models.
2. **Hugging Face Transformers**: A Python-based framework for managing models.
3. **GGML / GPTQ**: Quantized model formats that shrink models by reducing numeric precision.
OK. I have both my local laptop and a big, beefy server running Cloudron, where I have Open-WebUI deployed. I would like to use both the local laptop and the remote server.

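Open WebUI can front several Ollama backends at once, so one option is to register both the laptop and the server as backends of the Cloudron-hosted instance. The sketch below assumes Open WebUI's `OLLAMA_BASE_URLS` environment variable (a semicolon-separated list) and uses placeholder hostnames; on Cloudron you would set the variable through the app's configuration rather than `docker run`:

```bash
# The laptop's Ollama container from Step 1 (-p 11434:11434) is already
# reachable on the LAN, provided the firewall allows inbound 11434.

# Self-hosted Open WebUI pointed at both Ollama backends; on Cloudron,
# set OLLAMA_BASE_URLS in the app's environment instead of running this.
docker run -d -p 3000:8080 \
  -e OLLAMA_BASE_URLS="http://laptop.example.lan:11434;http://server.example.lan:11434" \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```

With both backends registered, Open WebUI lets you pick a model per chat, which matches the switch-by-task preference from Step 1.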
---

Edit this document with your answers or add any additional details you'd like me to consider. Upload the updated file back, and I'll guide you further!