LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2024-12-29 17:08:52 +00:00

Author	SHA1	Message	Date
Ettore Di Giacinto	b9e77d394b	feat(model-help): display help text in markdown (#1825 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-03-13 21:50:46 +01:00
Ettore Di Giacinto	57222497ec	fix(docker-compose): update docker compose file (#1824 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-03-13 17:57:45 +01:00
LocalAI [bot]	5c5f07c1e7	⬆️ Update ggerganov/llama.cpp (#1821 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-03-13 10:05:46 +01:00
Ettore Di Giacinto	f895d06605	fix(config): set better defaults for inferencing (#1822) * fix(defaults): set better defaults for inferencing This changeset aims to provide better defaults and to properly detect when no inference settings are provided with the model. If not specified, we default to mirostat sampling and offload all the GPU layers (if a GPU is detected). Related to https://github.com/mudler/LocalAI/issues/1373 and https://github.com/mudler/LocalAI/issues/1723 * Adapt tests * Also pre-initialize default seed	2024-03-13 10:05:30 +01:00
Ettore Di Giacinto	bc8f648a91	fix(doc/examples): set defaults to mirostat (#1820) The default sampler on some models doesn't return enough candidates, which leads to a false sense of randomness. Tracing back through the code, it appears that with the temperature sampler there might not be enough candidates to pick from, and since the seed and "randomness" take effect while picking a good candidate, this yields the same results over and over. Fixes https://github.com/mudler/LocalAI/issues/1723 by updating the examples and documentation to use mirostat instead.	2024-03-11 19:49:03 +01:00
LocalAI [bot]	8e57f4df31	⬆️ Update ggerganov/llama.cpp (#1818 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-03-11 00:02:37 +01:00
LocalAI [bot]	a08cc5adbb	⬆️ Update ggerganov/llama.cpp (#1816 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-03-10 09:32:09 +01:00
LocalAI [bot]	595a73fce4	⬆️ Update ggerganov/llama.cpp (#1813 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-03-09 09:27:06 +01:00
LocalAI [bot]	dc919e08e8	⬆️ Update ggerganov/llama.cpp (#1811 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-03-08 08:21:25 +01:00
Ettore Di Giacinto	5d1018495f	feat(intel): add diffusers/transformers support (#1746 ) * feat(intel): add diffusers support * try to consume upstream container image * Debug * Manually install deps * Map transformers/hf cache dir to modelpath if not specified * fix(compel): update initialization, pass by all gRPC options * fix: add dependencies, implement transformers for xpu * base it from the oneapi image * Add pillow * set threads if specified when launching the API * Skip conda install if intel * defaults to non-intel * ci: add to pipelines * prepare compel only if enabled * Skip conda install if intel * fix cleanup * Disable compel by default * Install torch 2.1.0 with Intel * Skip conda on some setups * Detect python * Quiet output * Do not override system python with conda * Prefer python3 * Fixups * exllama2: do not install without conda (overrides pytorch version) * exllama/exllama2: do not install if not using cuda * Add missing dataset dependency * Small fixups, symlink to python, add requirements * Add neural_speed to the deps * correctly handle model offloading * fix: device_map == xpu * go back at calling python, fixed at dockerfile level * Exllama2 restricted to only nvidia gpus * Tokenizer to xpu	2024-03-07 14:37:45 +01:00
LocalAI [bot]	ad6fd7a991	⬆️ Update ggerganov/llama.cpp (#1805 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-03-06 23:28:31 +01:00
LocalAI [bot]	e022b5959e	⬆️ Update mudler/go-stable-diffusion (#1802 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-03-05 23:39:57 +00:00
LocalAI [bot]	db7f4955a1	⬆️ Update ggerganov/llama.cpp (#1801 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-03-05 21:50:27 +00:00
Dave	5c69dd155f	feat(autogpt/transformers): consume `trust_remote_code` (#1799 ) trusting remote code by default is a danger to our users	2024-03-05 19:47:15 +01:00
TwinFin	504f2e8bf4	Update Backend Dependancies (#1797 ) * Update transformers.yml Signed-off-by: TwinFin <57421631+TwinFinz@users.noreply.github.com> * Update transformers-rocm.yml Signed-off-by: TwinFin <57421631+TwinFinz@users.noreply.github.com> * Update transformers-nvidia.yml Signed-off-by: TwinFin <57421631+TwinFinz@users.noreply.github.com> --------- Signed-off-by: TwinFin <57421631+TwinFinz@users.noreply.github.com>	2024-03-05 10:10:00 +00:00
Luna Midori	e586dc2924	Edit links in readme and integrations page (#1796 ) * Update integrations.md Signed-off-by: Luna Midori <118759930+lunamidori5@users.noreply.github.com> * Update README.md Signed-off-by: Luna Midori <118759930+lunamidori5@users.noreply.github.com> * Update README.md Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Signed-off-by: Luna Midori <118759930+lunamidori5@users.noreply.github.com> * Update README.md Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Signed-off-by: Luna Midori <118759930+lunamidori5@users.noreply.github.com> --------- Signed-off-by: Luna Midori <118759930+lunamidori5@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-03-05 10:14:30 +01:00
Ettore Di Giacinto	333f918005	Update integrations.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-03-05 09:45:54 +01:00
LocalAI [bot]	c8e29033c2	⬆️ Update ggerganov/llama.cpp (#1794 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-03-05 08:59:09 +01:00
LocalAI [bot]	d0bd961bde	⬆️ Update ggerganov/llama.cpp (#1791 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-03-04 09:44:21 +01:00
Ettore Di Giacinto	006511ee25	Revert "feat(assistant): Initial implementation of assistants api" (#1790 ) Revert "feat(assistant): Initial implementation of assistants api (#1761)" This reverts commit `4ab72146cd`.	2024-03-03 10:31:06 +01:00
Steven Christou	4ab72146cd	feat(assistant): Initial implementation of assistants api (#1761 ) Initial implementation of assistants api	2024-03-03 08:50:43 +01:00
LocalAI [bot]	b60a3fc879	⬆️ Update ggerganov/llama.cpp (#1789 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-03-03 08:49:23 +01:00
Ettore Di Giacinto	a0eeb74957	Update hot topics/roadmap Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-03-02 09:35:40 +01:00
LocalAI [bot]	daa0b8741c	⬆️ Update ggerganov/llama.cpp (#1785 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-03-01 22:38:24 +00:00
Ludovic Leroux	939411300a	Bump vLLM version + more options when loading models in vLLM (#1782 ) * Bump vLLM version to 0.3.2 * Add vLLM model loading options * Remove transformers-exllama * Fix install exllama	2024-03-01 22:48:53 +01:00
Dave	1c312685aa	refactor: move remaining api packages to core (#1731 ) * core 1 * api/openai/files fix * core 2 - core/config * move over core api.go and tests to the start of core/http * move over localai specific endpoints to core/http, begin the service/endpoint split there * refactor big chunk on the plane * refactor chunk 2 on plane, next step: port and modify changes to request.go * easy fixes for request.go, major changes not done yet * lintfix * json tag lintfix? * gitignore and .keep files * strange fix attempt: rename the config dir?	2024-03-01 16:19:53 +01:00
LocalAI [bot]	316de82f51	⬆️ Update ggerganov/llama.cpp (#1779 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-02-29 22:33:30 +00:00
Ettore Di Giacinto	9068bc5271	Create SECURITY.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-02-29 19:53:04 +01:00
Oussama	31a4c9c9d3	Fix Command Injection Vulnerability (#1778 ) * Added fix for command injection * changed function name from sh to runCommand	2024-02-29 18:32:29 +00:00
Ettore Di Giacinto	c1966af2cf	ci: reduce stress on self-hosted runners (#1776 ) Split jobs by self-hosted and free public runner provided by Github Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-02-29 11:40:08 +01:00
LocalAI [bot]	c665898652	⬆️ Update donomii/go-rwkv.cpp (#1771 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-02-28 23:50:27 +00:00
LocalAI [bot]	f651a660aa	⬆️ Update ggerganov/llama.cpp (#1772 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-02-28 23:02:30 +01:00
Ettore Di Giacinto	ba672b51da	Update README.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-02-28 16:03:38 +01:00
Ettore Di Giacinto	be498c5dd9	Update openai-functions.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-02-28 15:58:31 +01:00
Ettore Di Giacinto	6e95beccb9	Update overview.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-02-28 15:24:08 +01:00
Ettore Di Giacinto	c8be839481	Update openai-functions.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-02-27 23:24:46 +01:00
LocalAI [bot]	c7e08813a5	⬆️ Update ggerganov/llama.cpp (#1767 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-02-27 23:12:51 +01:00
LocalAI [bot]	d21a6b33ab	⬆️ Update ggerganov/llama.cpp (#1756 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-02-27 18:07:51 +00:00
Joshua Waring	9112cf153e	Update integrations.md (#1765 ) Added Jetbrains compatible plugin for LocalAI Signed-off-by: Joshua Waring <Joshhua5@users.noreply.github.com>	2024-02-27 17:35:59 +01:00
Ettore Di Giacinto	3868ac8402	Update README.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-02-27 15:44:15 +01:00
Ettore Di Giacinto	3f09010227	Update README.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-02-27 15:43:15 +01:00
Ettore Di Giacinto	d6cf82aba3	fix(tests): re-enable tests after code move (#1764 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-02-27 15:04:19 +01:00
Ettore Di Giacinto	dfe54639b1	Update README.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-02-27 10:37:56 +01:00
Ettore Di Giacinto	bc5f5aa538	deps(llama.cpp): update (#1759 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-02-26 13:18:44 +01:00
Ettore Di Giacinto	05818e0425	fix(functions): handle correctly when there are no results (#1758 )	2024-02-26 08:38:23 +01:00
Sertaç Özercan	7f72a61104	ci: add stablediffusion to release (#1757 ) Signed-off-by: Sertac Ozercan <sozercan@gmail.com>	2024-02-25 23:06:18 +00:00
LocalAI [bot]	8e45d47740	⬆️ Update ggerganov/llama.cpp (#1753 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-02-25 10:03:19 +01:00
LocalAI [bot]	71771d1e9b	⬆️ Update docs version mudler/LocalAI (#1752 ) Signed-off-by: GitHub <noreply@github.com> Co-authored-by: mudler <mudler@users.noreply.github.com>	2024-02-25 10:02:52 +01:00
Ettore Di Giacinto	aa098e4d0b	fix(sse): do not omit empty finish_reason (#1745 ) Fixes https://github.com/mudler/LocalAI/issues/1744	2024-02-24 11:51:59 +01:00
Ludovic Leroux	0135e1e3b9	fix: vllm - use AsyncLLMEngine to allow true streaming mode (#1749) * fix: use vllm AsyncLLMEngine to bring true stream The current vLLM implementation uses the LLMEngine, which was designed for offline batch inference; as a result, streaming mode outputs all blobs at once at the end of the inference. This PR reworks the gRPC server to use asyncio and gRPC.aio, in combination with vLLM's AsyncLLMEngine, to provide true streaming mode. This PR also passes more parameters to vLLM during inference (presence_penalty, frequency_penalty, stop, ignore_eos, seed, ...). * Remove unused import	2024-02-24 11:48:45 +01:00


1452 Commits