LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2024-12-21 05:33:09 +00:00

Author	SHA1	Message	Date
Ettore Di Giacinto	beb598e4f9	feat(functions): mixed JSON BNF grammars (#2328 ) feat(functions): support mixed JSON BNF grammar This PR provides new options to control how functions are extracted from the LLM, and also gives more control over how JSON grammars can be used (also in conjunction). New YAML settings introduced: - `grammar_message`: when enabled, the generated grammar can also decide to push strings and not only JSON objects. This allows the LLM to choose between responding freely and responding with JSON. - `grammar_prefix`: prepends a string to the JSON grammar definition. - `replace_results`: a map of strings to replace in the LLM result. As an example, consider the following settings for Hermes-2-Pro-Mistral, which allow extracting both JSON results coming from the model and the ones coming from the grammar: ```yaml function: # disable injecting the "answer" tool disable_no_action: true # This allows the grammar to also return messages grammar_message: true # Prefix to add to the grammar grammar_prefix: '<tool_call>\n' return_name_in_function_response: true # Without grammar uncomment the lines below # Warning: this is relying only on the capability of the # LLM model to generate the correct function call. # no_grammar: true # json_regex_match: "(?s)<tool_call>(.*?)</tool_call>" replace_results: "<tool_call>": "" "\'": "\"" ``` Note: To disable grammar usage entirely in the example above, uncomment `no_grammar` and `json_regex_match`. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-15 20:03:18 +02:00
Ettore Di Giacinto	c89271b2e4	feat(llama.cpp): add distributed llama.cpp inferencing (#2324 ) * feat(llama.cpp): support distributed llama.cpp Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat: let tweak how chat messages are merged together Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Makefile: register to ALL_GRPC_BACKENDS Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring, allow disable auto-detection of backends Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * minor fixups Signed-off-by: mudler <mudler@localai.io> * feat: add cmd to start rpc-server from llama.cpp Signed-off-by: mudler <mudler@localai.io> * ci: add ccache Signed-off-by: mudler <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Signed-off-by: mudler <mudler@localai.io>	2024-05-15 01:17:02 +02:00
Ettore Di Giacinto	29909666c3	Update README.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-15 00:33:16 +02:00
LocalAI [bot]	566b5cf2ee	⬆️ Update ggerganov/whisper.cpp (#2326 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-14 21:17:46 +00:00
Sertaç Özercan	a670318a9f	feat: auto select llama-cpp cuda runtime (#2306 ) * auto select cpu variant Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * remove cuda target for now Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * fix metal Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * fix path Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * cuda Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * auto select cuda Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * update test Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * select CUDA backend only if present Signed-off-by: mudler <mudler@localai.io> * ci: keep cuda bin in path Signed-off-by: mudler <mudler@localai.io> * Makefile: make dist now builds also cuda Signed-off-by: mudler <mudler@localai.io> * Keep pushing fallback in case auto-flagset/nvidia fails There could be other reasons for which the default binary may fail. For example we might have detected an Nvidia GPU, however the user might not have the drivers/cuda libraries installed in the system, and so it would fail to start. We keep the fallback of llama.cpp at the end of the llama.cpp backends to try to fallback loading in case things go wrong Signed-off-by: mudler <mudler@localai.io> * Do not build cuda on MacOS Signed-off-by: mudler <mudler@localai.io> * cleanup Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * Apply suggestions from code review Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> --------- Signed-off-by: Sertac Ozercan <sozercan@gmail.com> Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Signed-off-by: mudler <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Co-authored-by: mudler <mudler@localai.io>	2024-05-14 19:40:18 +02:00
Ettore Di Giacinto	84e2407afa	feat(functions): allow to set JSON matcher (#2319 ) Signed-off-by: mudler <mudler@localai.io>	2024-05-14 09:39:20 +02:00
Ettore Di Giacinto	c4186f13c3	feat(functions): support models with no grammar and no regex (#2315 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-14 00:32:32 +02:00
LocalAI [bot]	4ac7956f68	⬆️ Update ggerganov/whisper.cpp (#2317 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-13 22:25:14 +00:00
Ettore Di Giacinto	e49ea0123b	feat(llama.cpp): add `flash_attention` and `no_kv_offloading` (#2310 ) feat(llama.cpp): add flash_attn and no_kv_offload Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-13 19:07:51 +02:00
Ettore Di Giacinto	7123d07456	models(gallery): add orthocopter (#2313 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-13 18:45:58 +02:00
Ettore Di Giacinto	2db22087ae	models(gallery): add lumimaidv2 (#2312 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-13 18:44:44 +02:00
Ettore Di Giacinto	fa7b2aee9c	models(gallery): add Bunny-llama (#2311 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-13 18:44:25 +02:00
Ettore Di Giacinto	4d70b6fb2d	models(gallery): add aura-llama-Abliterated (#2309 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-13 18:44:10 +02:00
Sertaç Özercan	e2c3ffb09b	feat: auto select llama-cpp cpu variant (#2305 ) * auto select cpu variant Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * remove cuda target for now Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * fix metal Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * fix path Signed-off-by: Sertac Ozercan <sozercan@gmail.com> --------- Signed-off-by: Sertac Ozercan <sozercan@gmail.com>	2024-05-13 11:37:52 +02:00
LocalAI [bot]	b4cb22f444	⬆️ Update ggerganov/llama.cpp (#2303 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-12 21:18:59 +00:00
LocalAI [bot]	5534b13903	feat(swagger): update swagger (#2302 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-12 21:00:18 +00:00
fakezeta	5b79bd04a7	add setuptools for openvino (#2301 )	2024-05-12 19:31:43 +00:00
Ettore Di Giacinto	9d8c705fd9	feat(ui): display number of available models for installation (#2298 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-12 14:24:36 +02:00
Ettore Di Giacinto	310b2171be	models(gallery): add llama-3-refueled (#2297 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-12 09:39:58 +02:00
Ettore Di Giacinto	98af0b5d85	models(gallery): add jsl-medllama-3-8b-v2.0 (#2296 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-12 09:38:05 +02:00
Ettore Di Giacinto	ca14f95d2c	models(gallery): add l3-chaoticsoliloquy-v1.5-4x8b (#2295 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-12 09:37:55 +02:00
Ikko Eltociear Ashimine	1b69b338c0	docs: Update semantic-todo/README.md (#2294 ) seperate -> separate Signed-off-by: Ikko Eltociear Ashimine <eltociear@gmail.com>	2024-05-12 09:02:11 +02:00
cryptk	88942e4761	fix: add missing openvino/optimum/etc libraries for Intel, fixes #2289 (#2292 ) Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-05-12 09:01:45 +02:00
Ettore Di Giacinto	efa32a2677	feat(grammar): support models with specific construct (#2291 ) When enabling grammar with functions, it might be useful to allow more flexibility to support models that are fine-tuned to return function calls of the form { "name": "function_name", "arguments": {...} } rather than { "function": "function_name", "arguments": {...} }. This might call for a more generic approach later on, but for the time being we can easily support both, as we just have to specify different types. If needed, we can expand on this later. Signed-off-by: mudler <mudler@localai.io>	2024-05-12 01:13:22 +02:00
LocalAI [bot]	dfc420706c	⬆️ Update ggerganov/llama.cpp (#2290 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-11 21:16:34 +00:00
cryptk	e2de8a88f7	feat: create bash library to handle install/run/test of python backends (#2286 ) * feat: create bash library to handle install/run/test of python backends Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * chore: minor cleanup Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: remove incorrect LIMIT_TARGETS from parler-tts Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: update runUnitests to handle running tests from a custom test file Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * chore: document runUnittests Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> --------- Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-05-11 18:32:46 +02:00
Ettore Di Giacinto	7f4febd6c2	models(gallery): add Llama-3-8B-Instruct-abliterated (#2288 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-11 10:10:57 +02:00
LocalAI [bot]	93e581dfd0	⬆️ Update ggerganov/llama.cpp (#2285 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-10 21:09:22 +00:00
Ettore Di Giacinto	cf513efa78	Update openai-functions.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-10 17:09:51 +02:00
Ettore Di Giacinto	9e8b34427a	Update openai-functions.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-10 17:05:16 +02:00
Ettore Di Giacinto	88d0aa1e40	docs: update function docs Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-10 17:03:56 +02:00
Ettore Di Giacinto	9b09eb005f	build: do not specify a BUILD_ID by default (#2284 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-10 16:01:55 +02:00
Ettore Di Giacinto	4db41b71f3	models(gallery): add aloe (#2283 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-10 16:01:47 +02:00
cryptk	28a421cb1d	feat: migrate python backends from conda to uv (#2215 ) * feat: migrate diffusers backend from conda to uv - replace conda with UV for diffusers install (prototype for all extras backends) - add ability to build docker with one/some/all extras backends instead of all or nothing Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate autogptq bark coqui from conda to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: convert exllama over to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate exllama2 to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate mamba to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate parler to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate petals to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: fix tests Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate rerankers to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate sentencetransformers to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: install uv for tests-linux Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: make sure file exists before installing on intel images Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate transformers backend to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate transformers-musicgen to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate vall-e-x to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate vllm to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add uv install to the rest of test-extra.yml Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: adjust file perms on all install/run/test scripts Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add missing accelerate dependencies Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add some more missing dependencies to python backends Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: parler tests venv py dir fix Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: correct filename for transformers-musicgen tests Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: adjust the pwd for valle tests Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: cleanup and optimization work for uv migration Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add setuptools to requirements-install for mamba Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: more size optimization work Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: make installs and tests more consistent, cleanup some deps Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: cleanup Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: mamba backend is cublas only Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: uncomment lines in makefile Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> --------- Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-05-10 15:08:08 +02:00
LocalAI [bot]	e6768097f4	⬆️ Update docs version mudler/LocalAI (#2280 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-10 09:10:00 +02:00
LocalAI [bot]	18a04246fa	⬆️ Update ggerganov/llama.cpp (#2281 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-09 22:18:49 +00:00
LocalAI [bot]	f69de3be0d	models(gallery): ⬆️ update checksum (#2278 ) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-09 12:21:24 +00:00
Ettore Di Giacinto	650ae620c5	ci: get latest git version Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 11:33:16 +02:00
Ettore Di Giacinto	6a209cbef6	ci: get file name correctly in checksum_checker.sh Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 10:57:23 +02:00
Ettore Di Giacinto	9786bb826d	ci: try to fix checksum_checker.sh Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 09:34:07 +02:00
Ettore Di Giacinto	9b4c6f348a	Update checksum_checker.yaml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 00:57:22 +02:00
Ettore Di Giacinto	cb6ddb21ec	Update checksum_checker.yaml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 00:55:48 +02:00
Ettore Di Giacinto	0baacca605	Update checksum_checker.yaml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 00:54:35 +02:00
Ettore Di Giacinto	222d714ec7	Update checksum_checker.yaml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 00:51:57 +02:00
Ettore Di Giacinto	fd2d89d37b	Update checksum_checker.sh Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 00:43:16 +02:00
Ettore Di Giacinto	6440b608dc	Update checksum_checker.yaml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 00:42:48 +02:00
Ettore Di Giacinto	1937118eab	Update checksum_checker.yaml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 00:34:56 +02:00
Ettore Di Giacinto	bc272d1e4b	ci: add checksum checker pipeline (#2274 ) Signed-off-by: mudler <mudler@localai.io>	2024-05-09 00:31:27 +02:00
LocalAI [bot]	d651f390cd	⬆️ Update ggerganov/whisper.cpp (#2273 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-08 22:11:10 +00:00
Ettore Di Giacinto	ea777f8716	models(gallery): update SHA for einstein Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-08 23:40:58 +02:00
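
The function-extraction settings described in commits beb598e4f9 and efa32a2677 can be illustrated with a small Python sketch. This is not LocalAI's implementation (which is written in Go); the helper names `apply_replacements`, `normalize_call`, and `extract_calls` are hypothetical, and the order in which the regex match and the replacements are applied is an assumption. Only the regex and the replacement map are taken from the YAML example in the commit message.

```python
import json
import re

# Values copied from the YAML example in commit beb598e4f9.
JSON_REGEX_MATCH = r"(?s)<tool_call>(.*?)</tool_call>"
REPLACE_RESULTS = {"<tool_call>": "", "'": '"'}


def apply_replacements(text, replacements):
    """Apply each string replacement in order (mirrors `replace_results`)."""
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text


def normalize_call(obj):
    """Accept both {"name": ...} and {"function": ...} call shapes (commit efa32a2677)."""
    name = obj.get("name") or obj.get("function")
    return {"name": name, "arguments": obj.get("arguments", {})}


def extract_calls(llm_output, json_regex=None, replacements=None):
    """Return the function calls found in the raw model output.

    Assumption: when a regex is given, candidates are the captured groups;
    otherwise the whole output is the single candidate.
    """
    candidates = re.findall(json_regex, llm_output) if json_regex else [llm_output]
    calls = []
    for candidate in candidates:
        cleaned = apply_replacements(candidate, replacements or {}).strip()
        try:
            parsed = json.loads(cleaned)
        except json.JSONDecodeError:
            continue  # plain-text reply (as allowed by `grammar_message`), not a call
        if isinstance(parsed, dict):
            calls.append(normalize_call(parsed))
    return calls


if __name__ == "__main__":
    # Grammar mode: the output starts with the configured prefix and uses single quotes.
    with_prefix = "<tool_call>\n{'name': 'get_weather', 'arguments': {'city': 'Rome'}}"
    print(extract_calls(with_prefix, replacements=REPLACE_RESULTS))

    # No-grammar mode: rely on the regex to find the JSON payload.
    tagged = 'Sure.<tool_call>{"function": "add", "arguments": {"a": 1}}</tool_call>'
    print(extract_calls(tagged, json_regex=JSON_REGEX_MATCH))
```

Note that the blanket single-quote replacement would also rewrite apostrophes inside string values, so a map like this only suits models whose output is known to use single quotes as JSON delimiters.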


1957 Commits