b584dcf18a
⬆️ Update ggerganov/llama.cpp ( #2316 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2024-05-15 22:20:37 +00:00
4c845fb47d
Update README.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-15 23:56:52 +02:00
07c0559d06
Update README.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-15 23:56:22 +02:00
beb598e4f9
feat(functions): mixed JSON BNF grammars ( #2328 )
...
feat(functions): support mixed JSON BNF grammar
This PR adds new options to control how functions are extracted from
the LLM output, and also gives more control over how JSON grammars can be used
(also in combination).
New YAML settings introduced:
- `grammar_message`: when enabled, the generated grammar can also produce
plain strings and not only JSON objects. This lets the LLM choose between
responding freely and responding with JSON.
- `grammar_prefix`: allows prefixing a string to the JSON grammar
definition.
- `replace_results`: a map of string replacements to apply to the LLM
result.
As an example, consider the following settings for Hermes-2-Pro-Mistral,
which allow extracting both the JSON results coming from the model and the
ones coming from the grammar:
```yaml
function:
  # disable injecting the "answer" tool
  disable_no_action: true
  # This allows the grammar to also return messages
  grammar_message: true
  # Prefix to prepend to the grammar
  grammar_prefix: '<tool_call>\n'
  return_name_in_function_response: true
  # Without grammar, uncomment the lines below.
  # Warning: this relies only on the capability of the
  # LLM model to generate the correct function call.
  # no_grammar: true
  # json_regex_match: "(?s)<tool_call>(.*?)</tool_call>"
  replace_results:
    "<tool_call>": ""
    "\'": "\""
```
Note: to disable grammar usage entirely in the example above, uncomment the
`no_grammar` and `json_regex_match` lines (a sketch of that variant follows below).
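For reference, here is a minimal sketch of that no-grammar variant. It simply reuses the keys from the example above with the two lines uncommented and the grammar-specific options dropped; treat it as illustrative rather than a tested configuration.
```yaml
function:
  disable_no_action: true
  return_name_in_function_response: true
  # rely solely on the model emitting a well-formed <tool_call> block
  no_grammar: true
  json_regex_match: "(?s)<tool_call>(.*?)</tool_call>"
  replace_results:
    "<tool_call>": ""
    "\'": "\""
```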
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-15 20:03:18 +02:00
c89271b2e4
feat(llama.cpp): add distributed llama.cpp inferencing ( #2324 )
...
* feat(llama.cpp): support distributed llama.cpp
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat: allow tweaking how chat messages are merged together
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* refactor
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* Makefile: register to ALL_GRPC_BACKENDS
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* refactoring, allow disable auto-detection of backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* minor fixups
Signed-off-by: mudler <mudler@localai.io >
* feat: add cmd to start rpc-server from llama.cpp
Signed-off-by: mudler <mudler@localai.io >
* ci: add ccache
Signed-off-by: mudler <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
Signed-off-by: mudler <mudler@localai.io >
2024-05-15 01:17:02 +02:00
29909666c3
Update README.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-15 00:33:16 +02:00
566b5cf2ee
⬆️ Update ggerganov/whisper.cpp ( #2326 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2024-05-14 21:17:46 +00:00
a670318a9f
feat: auto select llama-cpp cuda runtime ( #2306 )
...
* auto select cpu variant
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
* remove cuda target for now
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
* fix metal
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
* fix path
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
* cuda
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
* auto select cuda
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
* update test
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
* select CUDA backend only if present
Signed-off-by: mudler <mudler@localai.io >
* ci: keep cuda bin in path
Signed-off-by: mudler <mudler@localai.io >
* Makefile: make dist now builds also cuda
Signed-off-by: mudler <mudler@localai.io >
* Keep pushing fallback in case auto-flagset/nvidia fails
There could be other reasons why the default binary may fail. For example, we might have detected an Nvidia GPU,
but the user might not have the drivers/CUDA libraries installed on the system, so it would fail to start.
We keep the fallback llama.cpp variant at the end of the llama.cpp backends to attempt loading it in case things go wrong.
Signed-off-by: mudler <mudler@localai.io >
* Do not build cuda on MacOS
Signed-off-by: mudler <mudler@localai.io >
* cleanup
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
* Apply suggestions from code review
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
---------
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
Signed-off-by: mudler <mudler@localai.io >
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
Co-authored-by: mudler <mudler@localai.io >
2024-05-14 19:40:18 +02:00
84e2407afa
feat(functions): allow to set JSON matcher ( #2319 )
...
Signed-off-by: mudler <mudler@localai.io >
2024-05-14 09:39:20 +02:00
c4186f13c3
feat(functions): support models with no grammar and no regex ( #2315 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-14 00:32:32 +02:00
4ac7956f68
⬆️ Update ggerganov/whisper.cpp ( #2317 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2024-05-13 22:25:14 +00:00
e49ea0123b
feat(llama.cpp): add flash_attention and no_kv_offloading ( #2310 )
...
feat(llama.cpp): add flash_attn and no_kv_offload
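A minimal model-configuration sketch of how these might be enabled, assuming the options are exposed as same-named fields in the model YAML (the model name, backend identifier, and file name below are placeholders, not taken from this changelog):
```yaml
# hypothetical model config; field names assumed to match the options above
name: my-llama3                    # placeholder model name
backend: llama-cpp                 # assumed backend identifier
parameters:
  model: my-llama3.Q4_K_M.gguf     # placeholder GGUF file
flash_attention: true              # enable llama.cpp flash attention
no_kv_offloading: true             # keep the KV cache in system RAM
```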
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-13 19:07:51 +02:00
7123d07456
models(gallery): add orthocopter ( #2313 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-13 18:45:58 +02:00
2db22087ae
models(gallery): add lumimaidv2 ( #2312 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-13 18:44:44 +02:00
fa7b2aee9c
models(gallery): add Bunny-llama ( #2311 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-13 18:44:25 +02:00
4d70b6fb2d
models(gallery): add aura-llama-Abliterated ( #2309 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-13 18:44:10 +02:00
e2c3ffb09b
feat: auto select llama-cpp cpu variant ( #2305 )
...
* auto select cpu variant
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
* remove cuda target for now
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
* fix metal
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
* fix path
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
---------
Signed-off-by: Sertac Ozercan <sozercan@gmail.com >
2024-05-13 11:37:52 +02:00
b4cb22f444
⬆️ Update ggerganov/llama.cpp ( #2303 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2024-05-12 21:18:59 +00:00
5534b13903
feat(swagger): update swagger ( #2302 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2024-05-12 21:00:18 +00:00
5b79bd04a7
add setuptools for openvino ( #2301 )
2024-05-12 19:31:43 +00:00
9d8c705fd9
feat(ui): display number of available models for installation ( #2298 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-12 14:24:36 +02:00
310b2171be
models(gallery): add llama-3-refueled ( #2297 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-12 09:39:58 +02:00
98af0b5d85
models(gallery): add jsl-medllama-3-8b-v2.0 ( #2296 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-12 09:38:05 +02:00
ca14f95d2c
models(gallery): add l3-chaoticsoliloquy-v1.5-4x8b ( #2295 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-12 09:37:55 +02:00
1b69b338c0
docs: Update semantic-todo/README.md ( #2294 )
...
seperate -> separate
Signed-off-by: Ikko Eltociear Ashimine <eltociear@gmail.com >
2024-05-12 09:02:11 +02:00
88942e4761
fix: add missing openvino/optimum/etc libraries for Intel, fixes #2289 ( #2292 )
...
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
2024-05-12 09:01:45 +02:00
efa32a2677
feat(grammar): support models with specific construct ( #2291 )
...
When enabling grammar with functions, it might be useful to
allow more flexibility to support models that are fine-tuned to return
function calls of the form { "name": "function_name", "arguments": {...} }
rather than { "function": "function_name", "arguments": {...} }.
This might call for a more generic approach later on, but for the time being we can easily support both,
as we just have to specify different types.
If needed we can expand on this later on.
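Side by side, the two call shapes in question (the argument contents below are placeholders):
```yaml
# newly supported shape, emitted by some fine-tuned models
- { "name": "function_name", "arguments": { "arg": "value" } }
# shape that was already supported
- { "function": "function_name", "arguments": { "arg": "value" } }
```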
Signed-off-by: mudler <mudler@localai.io >
2024-05-12 01:13:22 +02:00
dfc420706c
⬆️ Update ggerganov/llama.cpp ( #2290 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2024-05-11 21:16:34 +00:00
e2de8a88f7
feat: create bash library to handle install/run/test of python backends ( #2286 )
...
* feat: create bash library to handle install/run/test of python backends
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* chore: minor cleanup
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: remove incorrect LIMIT_TARGETS from parler-tts
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: update runUnittests to handle running tests from a custom test file
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* chore: document runUnittests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
---------
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
2024-05-11 18:32:46 +02:00
7f4febd6c2
models(gallery): add Llama-3-8B-Instruct-abliterated ( #2288 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-11 10:10:57 +02:00
93e581dfd0
⬆️ Update ggerganov/llama.cpp ( #2285 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2024-05-10 21:09:22 +00:00
cf513efa78
Update openai-functions.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-10 17:09:51 +02:00
9e8b34427a
Update openai-functions.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-10 17:05:16 +02:00
88d0aa1e40
docs: update function docs
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-10 17:03:56 +02:00
9b09eb005f
build: do not specify a BUILD_ID by default ( #2284 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-10 16:01:55 +02:00
4db41b71f3
models(gallery): add aloe ( #2283 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2024-05-10 16:01:47 +02:00
28a421cb1d
feat: migrate python backends from conda to uv ( #2215 )
...
* feat: migrate diffusers backend from conda to uv
- replace conda with UV for diffusers install (prototype for all
extras backends)
- add ability to build docker with one/some/all extras backends
instead of all or nothing
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate autogtpq bark coqui from conda to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: convert exllama over to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate exllama2 to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate mamba to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate parler to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate petals to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: fix tests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate rerankers to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate sentencetransformers to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: install uv for tests-linux
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: make sure file exists before installing on intel images
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate transformers backend to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate transformers-musicgen to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate vall-e-x to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: migrate vllm to uv
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: add uv install to the rest of test-extra.yml
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: adjust file perms on all install/run/test scripts
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: add missing accelerate dependencies
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: add some more missing dependencies to python backends
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: parler tests venv py dir fix
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: correct filename for transformers-musicgen tests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: adjust the pwd for valle tests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: cleanup and optimization work for uv migration
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: add setuptools to requirements-install for mamba
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: more size optimization work
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* feat: make installs and tests more consistent, cleanup some deps
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: cleanup
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: mamba backend is cublas only
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
* fix: uncomment lines in makefile
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
---------
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com >
2024-05-10 15:08:08 +02:00
e6768097f4
⬆️ Update docs version mudler/LocalAI ( #2280 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2024-05-10 09:10:00 +02:00
18a04246fa
⬆️ Update ggerganov/llama.cpp ( #2281 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2024-05-09 22:18:49 +00:00
f69de3be0d
models(gallery): ⬆️ update checksum ( #2278 )
...
⬆️ Checksum updates in gallery/index.yaml
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
v2.15.0
2024-05-09 12:21:24 +00:00
650ae620c5
ci: get latest git version
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-09 11:33:16 +02:00
6a209cbef6
ci: get file name correctly in checksum_checker.sh
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-09 10:57:23 +02:00
9786bb826d
ci: try to fix checksum_checker.sh
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-09 09:34:07 +02:00
9b4c6f348a
Update checksum_checker.yaml
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-09 00:57:22 +02:00
cb6ddb21ec
Update checksum_checker.yaml
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-09 00:55:48 +02:00
0baacca605
Update checksum_checker.yaml
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-09 00:54:35 +02:00
222d714ec7
Update checksum_checker.yaml
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-09 00:51:57 +02:00
fd2d89d37b
Update checksum_checker.sh
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-09 00:43:16 +02:00
6440b608dc
Update checksum_checker.yaml
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-09 00:42:48 +02:00
1937118eab
Update checksum_checker.yaml
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2024-05-09 00:34:56 +02:00