chore(model gallery): add qwen3-30b-a3b (#5269)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Ettore Di Giacinto 2025-04-29 09:44:44 +02:00 committed by GitHub
parent e13dd5b09f
commit 7a011e60bd
GPG Key ID: B5690EEEBB952194
2 changed files with 84 additions and 0 deletions


@@ -1,4 +1,49 @@
---
- &qwen3
  url: "github:mudler/LocalAI/gallery/qwen3.yaml@master"
  name: "qwen3-30b-a3b"
  urls:
    - https://huggingface.co/Qwen/Qwen3-30B-A3B
    - https://huggingface.co/bartowski/Qwen_Qwen3-30B-A3B-GGUF
  icon: https://cdn-avatars.huggingface.co/v1/production/uploads/620760a26e3b7210c2ff1943/-s1gyJfvbE1RgO5iBeNOi.png
  license: apache-2.0
  description: |
    Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:
    Unique support for seamless switching between thinking mode (for complex logical reasoning, math, and coding) and non-thinking mode (for efficient, general-purpose dialogue) within a single model, ensuring optimal performance across various scenarios.
    Significant enhancement of its reasoning capabilities, surpassing the previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning.
    Superior human preference alignment, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience.
    Expertise in agent capabilities, enabling precise integration with external tools in both thinking and non-thinking modes and achieving leading performance among open-source models in complex agent-based tasks.
    Support for 100+ languages and dialects with strong capabilities for multilingual instruction following and translation.
    Qwen3-30B-A3B has the following features:
    Type: Causal Language Models
    Training Stage: Pretraining & Post-training
    Number of Parameters: 30.5B in total and 3.3B activated
    Number of Parameters (Non-Embedding): 29.9B
    Number of Layers: 48
    Number of Attention Heads (GQA): 32 for Q and 4 for KV
    Number of Experts: 128
    Number of Activated Experts: 8
    Context Length: 32,768 natively and 131,072 tokens with YaRN.
    For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our blog, GitHub, and Documentation.
  tags:
    - llm
    - gguf
    - gpu
    - cpu
    - qwen
    - qwen3
    - thinking
    - reasoning
  overrides:
    parameters:
      model: Qwen_Qwen3-30B-A3B-Q4_K_M.gguf
  files:
    - filename: Qwen_Qwen3-30B-A3B-Q4_K_M.gguf
      sha256: a015794bfb1d69cb03dbb86b185fb2b9b339f757df5f8f9dd9ebdab8f6ed5d32
      uri: huggingface://bartowski/Qwen_Qwen3-30B-A3B-GGUF/Qwen_Qwen3-30B-A3B-Q4_K_M.gguf
- &gemma3
  url: "github:mudler/LocalAI/gallery/gemma.yaml@master"
  name: "gemma-3-27b-it"
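
Note on the `files` entry above: the `sha256` field lets a downloaded GGUF file be checked against the gallery entry. The following is a minimal sketch of such a check in Go, not LocalAI's own download code. The local path and the helper names (`hashReader`, `hashString`, `verifyFile`) are assumptions made for illustration. Only the digest is taken from the entry.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"strings"
)

// hashReader streams r through SHA-256 and returns the lowercase hex digest.
// Streaming avoids loading a multi-gigabyte model file into memory.
func hashReader(r io.Reader) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// hashString is a small wrapper around hashReader, handy for quick checks.
func hashString(s string) (string, error) {
	return hashReader(strings.NewReader(s))
}

// verifyFile hashes the file at path and compares it with the expected digest.
func verifyFile(path, want string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	got, err := hashReader(f)
	if err != nil {
		return err
	}
	if !strings.EqualFold(got, want) {
		return fmt.Errorf("sha256 mismatch for %s: got %s, want %s", path, got, want)
	}
	return nil
}

func main() {
	// Hypothetical local path; the digest is the sha256 value from the entry above.
	const path = "Qwen_Qwen3-30B-A3B-Q4_K_M.gguf"
	const want = "a015794bfb1d69cb03dbb86b185fb2b9b339f757df5f8f9dd9ebdab8f6ed5d32"

	if err := verifyFile(path, want); err != nil {
		fmt.Println("verification failed:", err)
		return
	}
	fmt.Println("checksum OK")
}
```

The comparison is case-insensitive because hex digests are sometimes published in upper case.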

gallery/qwen3.yaml

@@ -0,0 +1,39 @@
---
name: "qwen3"
config_file: |
  mmap: true
  template:
    chat_message: |
      <|im_start|>{{ .RoleName }}
      {{ if .FunctionCall -}}
      {{ else if eq .RoleName "tool" -}}
      {{ end -}}
      {{ if .Content -}}
      {{.Content }}
      {{ end -}}
      {{ if .FunctionCall -}}
      {{toJson .FunctionCall}}
      {{ end -}}<|im_end|>
    function: |
      <|im_start|>system
      You are a function calling AI model. You are provided with functions to execute. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools:
      {{range .Functions}}
      {'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
      {{end}}
      For each function call return a json object with function name and arguments
      <|im_end|>
      {{.Input -}}
      <|im_start|>assistant
    chat: |
      {{.Input -}}
      <|im_start|>assistant
    completion: |
      {{.Input}}
  context_size: 8192
  f16: true
  stopwords:
    - '<|im_end|>'
    - '<dummy32000>'
    - '</s>'
    - '<|endoftext|>'
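
Note on the `chat_message` template above: it uses Go `text/template` syntax, where the `-}}` markers trim the whitespace that follows an action. The sketch below renders the template text exactly as it appears in this diff, using only the standard library, to show what one message expands to. It is an illustration under assumptions, not LocalAI's rendering code. The `chatMessageData` struct and the `renderChatMessage` helper are hypothetical, and `toJson` is supplied here through a local `FuncMap` backed by `encoding/json`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"text/template"
)

// chatMessageTmpl is the chat_message template as shown in the diff.
// The trailing newline matches what a YAML "|" block scalar produces.
const chatMessageTmpl = `<|im_start|>{{ .RoleName }}
{{ if .FunctionCall -}}
{{ else if eq .RoleName "tool" -}}
{{ end -}}
{{ if .Content -}}
{{.Content }}
{{ end -}}
{{ if .FunctionCall -}}
{{toJson .FunctionCall}}
{{ end -}}<|im_end|>
`

// chatMessageData is a hypothetical stand-in holding only the fields
// that the template references.
type chatMessageData struct {
	RoleName     string
	Content      string
	FunctionCall any
}

// renderChatMessage parses the template and executes it for one message.
func renderChatMessage(d chatMessageData) (string, error) {
	funcs := template.FuncMap{
		// Assumed helper: serialises any value as compact JSON.
		"toJson": func(v any) (string, error) {
			b, err := json.Marshal(v)
			return string(b), err
		},
	}
	t, err := template.New("chat_message").Funcs(funcs).Parse(chatMessageTmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, d); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	msgs := []chatMessageData{
		{RoleName: "user", Content: "Hello"},
		{RoleName: "assistant", FunctionCall: map[string]any{
			"name":      "get_weather", // hypothetical function name
			"arguments": map[string]any{"city": "Rome"},
		}},
	}
	for _, m := range msgs {
		out, err := renderChatMessage(m)
		if err != nil {
			panic(err)
		}
		fmt.Print(out)
	}
}
```

With this input, a plain user message renders as the role line, the content line, and the `<|im_end|>` line. A message that carries only a function call renders the JSON object in place of the content.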