LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2025-04-01 16:41:08 +00:00

Author	SHA1	Message	Date
Ettore Di Giacinto	8814b31805	chore: drop gpt4all.cpp (#3106 ) chore: drop gpt4all gpt4all is already supported in llama.cpp - the backend was kept for keeping compatibility with old gpt4all models (prior to gguf format). It is good time now to clean up and remove it to slim the compilation process. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-08-07 23:35:55 +02:00
Ettore Di Giacinto	e198347886	feat(openai): add `json_schema` format type and strict mode (#3193 ) * feat(openai): add json_schema and strict mode Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * handle err vs _ security scanners prefer if we put these branches in, and I tend to agree. Signed-off-by: Dave <dave@gray101.com> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Signed-off-by: Dave <dave@gray101.com> Co-authored-by: Dave <dave@gray101.com>	2024-08-07 15:27:02 -04:00
Ettore Di Giacinto	a36b721ca6	fix: be consistent in downloading files, check for scanner errors (#3108 ) * fix(downloader): be consistent in downloading files This PR puts some order in the downloader so that functions are re-used across several places. This fixes an issue with having URIs inside the model YAML file: they would resolve to an MD5 hash rather than using the filename. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(scanner): do raise error only if unsafeFiles are found Fixes: https://github.com/mudler/LocalAI/issues/3114 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-08-02 20:06:25 +02:00
Ettore Di Giacinto	2a839e1432	fix(gallery): do not attempt to delete duplicate files (#3031 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-07-28 10:27:56 +02:00
Ettore Di Giacinto	2169c3497d	feat(grammar): add llama3.1 schema (#3015 ) * wip Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * get rid of panics Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * expose it properly from the config Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Simplify Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * forgot to commit Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Remove focus on test Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Small fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-07-26 20:11:29 +02:00
Ettore Di Giacinto	5eda7f578d	refactor: break down json grammar parser in different files (#3004 ) * refactor: break down json grammar parser in different files Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix: patch to `refactor_grammars` - propagate errors (#3006) propagate errors around Signed-off-by: Dave Lee <dave@gray101.com> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Signed-off-by: Dave Lee <dave@gray101.com> Co-authored-by: Dave <dave@gray101.com>	2024-07-25 08:41:00 +02:00
Ettore Di Giacinto	a9757fb057	fix(cuda): downgrade to 12.0 to increase compatibility range (#2994 ) * fix(cuda): downgrade to 12.0 to increase compatibility range Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * improve messaging Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-07-23 23:35:31 +02:00
Dave	fc29c04f82	groundwork: add pkg/concurrency and the associated test file (#2745 ) groundwork: add pkg/concurrency and the associated test case Signed-off-by: Dave Lee <dave@gray101.com>	2024-07-18 23:29:21 +00:00
Ettore Di Giacinto	bf9dd1de7f	feat(functions): parse broken JSON when we parse the raw results, use dynamic rules for grammar keys (#2912 ) * feat(functions): enhance parsing with broken JSON when we parse the raw results Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * breaking: make function name by default Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(grammar): dynamically generate grammars with mutating keys Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor: simplify condition Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Update docs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-07-18 17:52:22 +02:00
Ettore Di Giacinto	35d55572ac	fix: do not list txt files as potential models (#2910 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-07-18 14:44:44 +02:00
Ettore Di Giacinto	642f6cee75	feat(webui): show also models without a config in the welcome page (#2772 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-07-11 19:55:01 +02:00
Ettore Di Giacinto	b6b8ab6c21	feat(models): pull models from urls (#2750 ) * feat(models): pull models from urls When using `run` now we can point directly to hf models via URL, for instance: ```bash local-ai run huggingface://TheBloke/TinyLlama-1.1B-Chat-v0.3-GGUF/tinyllama-1.1b-chat-v0.3.Q2_K.gguf ``` Will pull the gguf model and place it in the models folder - of course this depends on the gguf file being automatically detected by our guesser mechanism for this to be effective. Similarly, galleries can now refer to single files in the API requests. This also changes the download code: `yaml` files are now treated in the same way, so config files are saved with the appropriate name (and not hashed anymore). Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Adapt tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-07-11 15:04:05 +02:00
Ettore Di Giacinto	59ef426fbf	feat(model-list): be consistent, skip known files from listing (#2760 ) fix(model-list): be consistent, skip known files from listing This changeset does two things: - Removes the dependency of listing models from the OpenAI schema. - Tries to reduce confusion between ListModels() in model loader and in the service - now there is only one ListModels which is in services and does not depend anymore on the OpenAI schema - The OpenAI-schema functions were moved nearby the OpenAI specific endpoints that needs the schema - Drops the ListModel Service structure as there was no real need for it. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-07-10 15:28:39 +02:00
Dave	133987b1fb	feat: HF `/scan` endpoint (#2566 ) * start by checking /scan during the checksum update Signed-off-by: Dave Lee <dave@gray101.com> * add back in golang side features: downloader/uri gets struct and scan function, gallery uses it, and secscan/models calls it. Signed-off-by: Dave Lee <dave@gray101.com> * add a param to scan specific urls - useful for debugging Signed-off-by: Dave Lee <dave@gray101.com> * helpful printouts Signed-off-by: Dave Lee <dave@gray101.com> * fix offsets Signed-off-by: Dave Lee <dave@gray101.com> * fix error and naming Signed-off-by: Dave Lee <dave@gray101.com> * expose error Signed-off-by: Dave Lee <dave@gray101.com> * fix json tags Signed-off-by: Dave Lee <dave@gray101.com> * slight wording change Signed-off-by: Dave Lee <dave@gray101.com> * go mod tidy - getting warnings Signed-off-by: Dave Lee <dave@gray101.com> * split out python to make editing easier, add some simple code to delete contaminated entries from gallery Signed-off-by: Dave Lee <dave@gray101.com> * o7 to my favorite part of our old name, go-skynet Signed-off-by: Dave Lee <dave@gray101.com> * merge fix Signed-off-by: Dave Lee <dave@gray101.com> * merge fix Signed-off-by: Dave Lee <dave@gray101.com> * merge fix Signed-off-by: Dave Lee <dave@gray101.com> * address review comments Signed-off-by: Dave Lee <dave@gray101.com> * forgot secscan could accept multiple URL at once Signed-off-by: Dave Lee <dave@gray101.com> * invert naming and actually use it Signed-off-by: Dave Lee <dave@gray101.com> * missed cli/models.go Signed-off-by: Dave Lee <dave@gray101.com> * Update .github/check_and_update.py Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Signed-off-by: Dave <dave@gray101.com> --------- Signed-off-by: Dave Lee <dave@gray101.com> Signed-off-by: Dave <dave@gray101.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-07-10 13:18:32 +02:00
Ettore Di Giacinto	e591ff2e74	fix(initializer): do select backends that exist (#2694 ) we were not checking if the binary exists before picking these up from the asset dir. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-07-01 22:50:36 +02:00
Ettore Di Giacinto	bd2f95c130	feat(backend): fallback with autodetect (#2693 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-07-01 18:11:04 +02:00
Ettore Di Giacinto	3eaf59021c	feat(grammar): expose properties_order (#2662 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-26 14:59:02 +02:00
Ettore Di Giacinto	a181dd0ebc	refactor: gallery inconsistencies (#2647 ) * refactor(gallery): move under core/ Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(unarchive): do not allow symlinks Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-24 17:32:12 +02:00
Dave	12513ebae0	rf: centralize base64 image handling (#2595 ) contains simple fixes to warnings and errors, removes a broken / outdated test, runs go mod tidy, and as the actual change, centralizes base64 image handling Signed-off-by: Dave Lee <dave@gray101.com>	2024-06-24 08:34:36 +02:00
Sertaç Özercan	5866fc8ded	chore: fix go.mod module (#2635 ) Signed-off-by: Sertac Ozercan <sozercan@gmail.com>	2024-06-23 08:24:36 +00:00
Ettore Di Giacinto	8d84dd4f88	fix(worker): use dynaload for single binaries (#2620 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-22 09:33:18 +02:00
Ettore Di Giacinto	f569237a50	feat(oci): support OCI images and Ollama models (#2628 ) * Support specifying oci:// and ollama:// for model URLs Fixes: https://github.com/mudler/LocalAI/issues/2527 Fixes: https://github.com/mudler/LocalAI/issues/1028 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Lower watcher warnings Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Allow to install ollama models from CLI Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixup tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Do not keep file ownership Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Skip test on darwin Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-22 08:17:41 +02:00
Ettore Di Giacinto	89a11e15e7	fix(single-binary): bundle ld.so (#2602 ) * debug * fix copy command/silly muscle memory Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * remove tmate * Debugging * Start binary with ld.so if present in libdir Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * small refactor Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-18 22:43:43 +02:00
Ettore Di Giacinto	94cfaad7f4	feat(libpath): refactor and expose functions for external library paths (#2578 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-16 13:58:28 +02:00
Ettore Di Giacinto	112d0ffa45	feat(darwin): embed grpc libs (#2567 ) * debug * feat(makefile): allow to bundle libs into binary * ci: bundle protobuf into single-binary Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(assets): correctly reference extract folder Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * bundle also abseil Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * bundle more libs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-06-14 08:51:25 +02:00
Ettore Di Giacinto	06351cbbb4	feat(binary): support extracted bundled libs on darwin (#2563 ) When offering fallback libs, use the proper env var for darwin Note: this does not include the libraries itself, but only sets the proper env var for the libs to be picked up on darwin. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-13 22:59:42 +02:00
Ettore Di Giacinto	7b205510f9	feat(gallery): uniform download from CLI (#2559 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-13 16:12:46 +02:00
Ettore Di Giacinto	882556d4db	feat(gallery): show available models in website, allow `local-ai models install` to install from galleries (#2555 ) * WIP Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * gen a static page instead (we force DNS redirects to it) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(gallery): install models from CLI, unify install Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Uniform graphic of model page Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Makefile: update targets Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Slightly enhance gallery view Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-13 00:47:16 +02:00
Ettore Di Giacinto	d7e137295a	feat(util): add util command to print GGUF informations (#2528 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-09 19:27:42 +02:00
Ettore Di Giacinto	6c087ae743	feat(arm64): enable single-binary builds (#2490 ) * ci: try to build for arm64 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Allow to skip hipblas on make dist Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * use arm64 cross compiler Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * correctly target go arm64 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * create a separate target Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * cross-compile grpc Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add Protobuf include dirs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * temp disable CUDA build Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * aarch64 builds: Reduce backends Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Even less backends Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Even less backends Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(startup): allow to load libs from extracted assets Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * makefile: set arch Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-09 15:11:37 +02:00
Ettore Di Giacinto	596cf76135	build(intel): bundle intel variants in single-binary (#2494 ) * wip: try to build also intel variants Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add dependencies * Select automatically intel backend --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-06 08:40:51 +02:00
Dave	d072835796	feat:`OpaqueErrors` to hide error information (#2486 ) * adds a new configuration option to hide all error message information from http requests --------- Signed-off-by: Dave Lee <dave@gray101.com>	2024-06-05 08:45:24 +02:00
Ettore Di Giacinto	17cf6c4a4d	feat(amdgpu): try to build in single binary (#2485 ) * feat(amdgpu): try to build in single binary Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Release space from worker Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-05 08:44:15 +02:00
Dave	2fc6fe806b	fix: `pkg/downloader` should respect basePath for `file://` urls (#2481 ) * pass basePath down to pkg/downloader Signed-off-by: Dave Lee <dave@gray101.com> * enforce Signed-off-by: Dave Lee <dave@gray101.com> --------- Signed-off-by: Dave Lee <dave@gray101.com>	2024-06-04 14:32:47 +00:00
Ettore Di Giacinto	bdd6769b2d	feat(default): use number of physical cores as default (#2483 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-04 15:23:29 +02:00
Ettore Di Giacinto	5d31e5269d	feat(functions): allow `response_regex` to be a list (#2447 ) feat(functions): allow regex match to be a list Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-31 22:52:02 +02:00
Ettore Di Giacinto	3f7212c660	feat(functions): better free string matching, allow to expect strings after JSON (#2445 ) Allow now any non-character, both as suffix and prefix when mixed grammars are enabled Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-31 09:36:27 +02:00
Ettore Di Giacinto	669cd06dd9	feat(functions): allow parallel calls with mixed/no grammars (#2432 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-28 21:06:09 +02:00
Ettore Di Giacinto	ea330d452d	models(gallery): add mistral-0.3 and command-r, update functions (#2388 ) * models(gallery): add mistral-0.3 and command-r, update functions Also add disable_parallel_new_lines to disable newlines in the JSON output when forcing parallel tools. Some models (like mistral) might be very sensitive to that when being used for function calling. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * models(gallery): add aya-23-8b Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-23 19:16:08 +02:00
Ettore Di Giacinto	491e1d752b	feat(functions): relax mixedgrammars (#2365 ) * feat(functions): relax mixedgrammars Extend even more the functionalities and when mixed mode is enabled, tolerate also both strings and JSON in the result - in this case we make sure that the JSON can be correctly parsed. This also updates the examples and the gallery model to configure the grammar. The changeset also breaks current function/grammar configuration as it reserves now a stanza in the YAML config. For example: ```yaml function: grammar: # This allows the grammar to also return messages mixed_mode: true # Suffix to add to the grammar # prefix: '<tool_call>\n' # Force parallel calls in the grammar # parallel_calls: true ``` Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor, add a way to disable mixed json and freestring Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fix linting issues Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-22 00:14:16 +02:00
Ettore Di Giacinto	1a3dedece0	dependencies(grpcio): bump to fix CI issues (#2362 ) feat(grpcio): bump to fix CI issues Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-21 14:33:47 +02:00
Ettore Di Giacinto	fdb45153fe	feat(llama.cpp): Totally decentralized, private, distributed, p2p inference (#2343 ) * feat(llama.cpp): Enable decentralized, distributed inference As https://github.com/mudler/LocalAI/pull/2324 introduced distributed inferencing thanks to @rgerganov's implementation in https://github.com/ggerganov/llama.cpp/pull/6829 in upstream llama.cpp, it is now possible to distribute the workload to a remote llama.cpp gRPC server. This changeset now uses mudler/edgevpn to establish a secure, distributed network between the nodes using a shared token. The token is generated automatically when starting the server with the `--p2p` flag, and can be used by starting the workers with `local-ai worker p2p-llama-cpp-rpc` by passing the token via environment variable (TOKEN) or with args (--token). As per how mudler/edgevpn works, a network is established between the server and the workers with dht and mdns discovery protocols; the llama.cpp rpc server is automatically started and exposed to the underlying p2p network so the API server can connect to it. When the HTTP server is started, it will discover the workers in the network and automatically create the port-forwards to the service locally. Then llama.cpp is configured to use the services. This feature is behind the "p2p" GO_FLAGS Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * go mod tidy Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: add p2p tag Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * better message Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-20 19:17:59 +02:00
Ettore Di Giacinto	5a6d120a56	feat(functions): don't use yaml.MapSlice (#2354 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-20 08:31:06 +02:00
Ettore Di Giacinto	73566a2bb2	feat(functions): allow to use JSONRegexMatch unconditionally (#2349 ) * feat(functions): allow to use JSONRegexMatch unconditionally Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(functions): make json_regex_match a list Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-19 18:24:49 +02:00
lenaxia	6b6c8cdd5f	feat(functions): Enable true regex replacement for the regexReplacement option (#2341 ) * Adding regex capabilities to ParseFunctionCall replacement Signed-off-by: Lenaxia <github@47north.lat> * Adding tests for the regex replace in ParseFunctionCall Signed-off-by: Lenaxia <github@47north.lat> * Fixing tests and adding a test case to validate double quote replacement works Signed-off-by: Lenaxia <github@47north.lat> * Make Regex replacement stable, drop lookaheads Signed-off-by: mudler <mudler@localai.io> --------- Signed-off-by: Lenaxia <github@47north.lat> Signed-off-by: mudler <mudler@localai.io> Co-authored-by: Lenaxia <github@47north.lat> Co-authored-by: mudler <mudler@localai.io>	2024-05-19 01:29:10 +02:00
Ettore Di Giacinto	02f1b477df	feat(functions): simplify parsing, read functions as list (#2340 ) Signed-off-by: mudler <mudler@localai.io>	2024-05-18 09:35:28 +02:00
Ettore Di Giacinto	beb598e4f9	feat(functions): mixed JSON BNF grammars (#2328 ) feat(functions): support mixed JSON BNF grammar This PR provides new options to control how functions are extracted from the LLM, and also provides more control on how JSON grammars can be used (also in conjunction). New YAML settings introduced: - `grammar_message`: when enabled, the generated grammar can also decide to push strings and not only JSON objects. This allows the LLM to pick to either respond freely or using JSON. - `grammar_prefix`: Allows to prefix a string to the JSON grammar definition. - `replace_results`: Is a map that allows to replace strings in the LLM result. As an example, consider the following settings for Hermes-2-Pro-Mistral, which allow extracting both JSON results coming from the model, and the ones coming from the grammar: ```yaml function: # disable injecting the "answer" tool disable_no_action: true # This allows the grammar to also return messages grammar_message: true # Suffix to add to the grammar grammar_prefix: '<tool_call>\n' return_name_in_function_response: true # Without grammar uncomment the lines below # Warning: this is relying only on the capability of the # LLM model to generate the correct function call. # no_grammar: true # json_regex_match: "(?s)<tool_call>(.*?)</tool_call>" replace_results: "<tool_call>": "" "\'": "\"" ``` Note: To disable entirely grammars usage in the example above, uncomment the `no_grammar` and `json_regex_match`. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-15 20:03:18 +02:00
Ettore Di Giacinto	c89271b2e4	feat(llama.cpp): add distributed llama.cpp inferencing (#2324 ) * feat(llama.cpp): support distributed llama.cpp Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat: let tweak how chat messages are merged together Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Makefile: register to ALL_GRPC_BACKENDS Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring, allow disable auto-detection of backends Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * minor fixups Signed-off-by: mudler <mudler@localai.io> * feat: add cmd to start rpc-server from llama.cpp Signed-off-by: mudler <mudler@localai.io> * ci: add ccache Signed-off-by: mudler <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Signed-off-by: mudler <mudler@localai.io>	2024-05-15 01:17:02 +02:00
Sertaç Özercan	a670318a9f	feat: auto select llama-cpp cuda runtime (#2306 ) * auto select cpu variant Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * remove cuda target for now Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * fix metal Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * fix path Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * cuda Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * auto select cuda Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * update test Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * select CUDA backend only if present Signed-off-by: mudler <mudler@localai.io> * ci: keep cuda bin in path Signed-off-by: mudler <mudler@localai.io> * Makefile: make dist now builds also cuda Signed-off-by: mudler <mudler@localai.io> * Keep pushing fallback in case auto-flagset/nvidia fails There could be other reasons for which the default binary may fail. For example we might have detected an Nvidia GPU, however the user might not have the drivers/cuda libraries installed in the system, and so it would fail to start. We keep the fallback of llama.cpp at the end of the llama.cpp backends to try to fallback loading in case things go wrong Signed-off-by: mudler <mudler@localai.io> * Do not build cuda on MacOS Signed-off-by: mudler <mudler@localai.io> * cleanup Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * Apply suggestions from code review Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> --------- Signed-off-by: Sertac Ozercan <sozercan@gmail.com> Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Signed-off-by: mudler <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Co-authored-by: mudler <mudler@localai.io>	2024-05-14 19:40:18 +02:00
Ettore Di Giacinto	84e2407afa	feat(functions): allow to set JSON matcher (#2319 ) Signed-off-by: mudler <mudler@localai.io>	2024-05-14 09:39:20 +02:00
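
The `replace_results` and `json_regex_match` settings described in the #2328 entry (and made a list in #2349) can be illustrated with a rough sketch. This is not LocalAI's implementation, which is written in Go; the helper names `apply_replacements` and `extract_json`, the sample model outputs, and the choice to treat replacements as literal strings are all assumptions made for illustration. Only the regex and the replacement map come from the commit message.

```python
import json
import re

# Values taken from the YAML example in the #2328 commit message.
REPLACE_RESULTS = {"<tool_call>": "", "'": '"'}
JSON_REGEX_MATCH = [r"(?s)<tool_call>(.*?)</tool_call>"]


def apply_replacements(text, replacements):
    """Hypothetical helper: apply each literal old -> new replacement in order."""
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text


def extract_json(text, patterns):
    """Hypothetical helper: return the first capture group of the first
    pattern that matches, or the text unchanged if none match."""
    for pattern in patterns:
        match = re.search(pattern, text)
        if match:
            return match.group(1)
    return text


# Grammar path (assumed sample output): the grammar prefix and single
# quotes are cleaned up before the result is parsed as JSON.
grammar_output = "<tool_call>\n{'name': 'get_weather', 'arguments': {'city': 'Rome'}}"
call_from_grammar = json.loads(apply_replacements(grammar_output, REPLACE_RESULTS))

# No-grammar path (assumed sample output): the JSON is pulled out of the
# surrounding free text with the regex instead.
free_output = (
    'Sure, let me check. <tool_call>{"name": "get_weather", '
    '"arguments": {"city": "Rome"}}</tool_call>'
)
call_from_regex = json.loads(extract_json(free_output, JSON_REGEX_MATCH))

print(call_from_grammar["name"], call_from_regex["arguments"]["city"])
```

The two paths are kept separate here because the commit message presents the regex as the option to enable when grammars are disabled; how the two settings interact when both are set is not stated in the log.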

214 Commits