LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2024-12-21 05:33:09 +00:00

Author	SHA1	Message	Date
Ettore Di Giacinto	a28ab18987	feat(vllm): Allow to set quantization (#1094) This is particularly useful to set AWQ. Follow-up of #1015. --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-09-22 15:52:38 +02:00
Ettore Di Giacinto	8ccf5b2044	feat(speculative-sampling): allow to specify a draft model in the model config (#1052) Description This PR fixes #1013. It adds `draft_model` and `n_draft` to the model YAML config in order to load models with speculative sampling. This should be compatible as well with grammars. example: ```yaml backend: llama context_size: 1024 name: my-model-name parameters: model: foo-bar n_draft: 16 draft_model: model-name ``` --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-09-14 17:44:16 +02:00
Ettore Di Giacinto	dc307a1cc0	feat: add vall-e-x (#1007) This PR fixes #985. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-09-04 19:25:23 +02:00
Dave	005f289632	feat: Model Gallery Endpoint Refactor / Mutable Galleries Endpoints (#991) refactor for model gallery endpoints - bundle up resources into a struct, make galleries mutable with some crud endpoints. This is groundwork required for making efficient use of the new scraper - while that PR isn't _quite_ ready yet, the goal is to have more, individually smaller gallery files. Therefore, rather than requiring a full localai service restart, these new endpoints have been added to make life easier. - Adds endpoints to add, list and remove model galleries at runtime - Adds these endpoints to the Insomnia config - Minor fix: loading file urls follows symbolic links now	2023-09-02 09:00:44 +02:00
Ettore Di Giacinto	3bab307904	fix(llama): resolve lora adapters correctly from the model file (#964) We were otherwise expecting absolute paths. This makes them relative to the model file (as one would expect).	2023-08-27 10:11:32 +02:00
Ettore Di Giacinto	44bc7aa3d0	feat: Allow to load lora adapters for llama.cpp (#955) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-08-25 21:58:46 +02:00
Ettore Di Giacinto	1120847f72	feat: bump llama.cpp, add gguf support (#943) This PR syncs up the `llama` backend to use `gguf` (https://github.com/go-skynet/go-llama.cpp/pull/180). It also adds `llama-stable` to the targets so we can still load ggml. It adapts the current tests to use the `llama-backend` for ggml and uses a `gguf` model to run tests on the new backend. In order to consume the new version of go-llama.cpp, it also bumps Go to 1.21 (images, pipelines, etc.) --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-08-24 01:18:58 +02:00
Dave	10b0e13882	feat: backend monitor shutdown endpoint, process based (#938) This PR adds a new endpoint to the backend monitor section `/backend/shutdown` which terminates the grpc process for the related model.	2023-08-23 18:38:37 +02:00
Dave	901f0709c5	Feat: rwkv improvements: (#937)	2023-08-22 18:48:06 +02:00
Ettore Di Giacinto	ab5b75eb01	feat: add llama-stable backend (#932) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-08-20 16:35:42 +02:00
Ettore Di Giacinto	cc060a283d	fix: drop racy code, refactor and group API schema (#931) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-08-20 14:04:45 +02:00
Ettore Di Giacinto	afdc0ebfd7	feat: add --single-active-backend to allow only one backend active at the time (#925) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-08-19 01:49:33 +02:00
Ettore Di Giacinto	1079b18ff7	feat(diffusers): be consistent with pipelines, support also depthimg2img (#926) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-08-18 22:06:24 +02:00
Dave	8cb1061c11	Usage Features (#863)	2023-08-18 21:23:14 +02:00
Ettore Di Giacinto	2bacd0180d	feat(diffusers): add img2img and clip_skip, support more kernels schedulers (#906) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-08-17 23:38:59 +02:00
Ettore Di Giacinto	37700f2d98	feat(diffusers): add DPMSolverMultistepScheduler++, DPMSolverMultistepSchedulerSDE++, guidance_scale (#903) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-08-16 01:11:42 +02:00
Ettore Di Giacinto	0ec695f9e4	feat: make initializer accept gRPC delay times (#900) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-08-16 01:11:32 +02:00
Ettore Di Giacinto	a96c3bc885	feat(diffusers): various enhancements (#895)	2023-08-14 23:12:00 +02:00
Ettore Di Giacinto	8c781a6a44	feat: Add Diffusers (#874) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-08-09 08:38:51 +02:00
Ettore Di Giacinto	3c8fc37c56	feat: Add UseFastTokenizer Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-08-08 01:10:05 +02:00
Ettore Di Giacinto	39805b09e5	fix: pass by env in managed services Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-08-08 00:58:38 +02:00
Ettore Di Giacinto	63b01199fe	fix: match lowercase of the input, not of the model	2023-08-08 00:46:22 +02:00
Ettore Di Giacinto	a843e64fc2	feat: add initial AutoGPTQ backend implementation	2023-08-07 22:53:28 +02:00
Ettore Di Giacinto	acd829a7a0	fix: do not break on newlines on function returns (#864) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-08-04 21:46:36 +02:00
Ettore Di Giacinto	4aa5dac768	feat: update integer, number and string rules - allow primitives as root types (#862) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-08-03 23:32:30 +02:00
Ettore Di Giacinto	5ca21ee398	feat: add ngqa and RMSNormEps parameters (#860) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-08-03 00:51:08 +02:00
Dave	7fb8b4191f	feat: "simple" chat/edit/completion template system prompt from config (#856)	2023-08-03 00:19:55 +02:00
Ettore Di Giacinto	d603a9cbb5	fix(gallery): preload from file should by in YAML format (#846) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-07-31 21:13:16 +02:00
Dave	ce8e9dc690	feature: model list :: filter query string parameter (#830)	2023-07-31 19:14:32 +02:00
Dave	8e8d474ae8	refactor: Remove remaining uses of depreciated package `io/ioutil` (#837)	2023-07-30 11:23:43 +00:00
Dave	5ce0f216cf	Fix: Model Gallery Downloads (#835)	2023-07-30 09:47:22 +02:00
Ettore Di Giacinto	00ccb8d4f1	fix: set default rope freq base to 10000 during model load Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-07-29 10:40:56 +02:00
Dave	8b90ac2b1a	1000 -> 10,000 for ropeFreqBase? the error message talks about a default of 10k, so setting this to 10k instead of 1k experimentally.	2023-07-29 02:37:24 -04:00
Ettore Di Giacinto	f085baa77d	fix: set default rope if not specified Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-07-29 01:07:16 +02:00
Ettore Di Giacinto	096d98c3d9	fix: add rope settings during model load, fix CUDA (#821) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-07-27 21:56:05 +02:00
Ettore Di Giacinto	b96e30e66c	fix: use bytes in gRPC proto instead of strings (#813) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-07-27 18:41:04 +02:00
Ettore Di Giacinto	569c1d1163	feat: add rope settings and negative prompt, drop grammar backend (#797) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-07-25 19:05:27 +02:00
Aman Gupta Karmani	12fe0932c4	feat: cancel stream generation if client disappears (#792)	2023-07-24 23:10:54 +02:00
Dave	c6bf67f446	feat(llama2): add template for chat messages (#782) Co-authored-by: Aman Karmani <aman@tmm1.net> Lays some of the groundwork for LLAMA2 compatibility as well as other future models with complex prompting schemes. Started a small refactoring in pkg/model/loader.go regarding template loading. It is currently still a part of ModelLoader, but it should be easy to add template loading for situations other than overall prompt templates and the new chat-specific per-message templates. Adds support for new chat-endpoint-specific, per-message templates as an alternative to the existing Role: XYZ sprintf method. Includes a temporary prompt template as an example, since I have a few questions before we merge in the model-gallery side changes (see ). Minor debug logging changes.	2023-07-22 11:31:39 -04:00
Ettore Di Giacinto	c71c729bc2	debug	2023-07-21 10:53:26 +02:00
Ettore Di Giacinto	94916749c5	feat: add external grpc and model autoloading	2023-07-20 22:10:12 +02:00
Ettore Di Giacinto	47cc95fc9f	feat: add all backends to autoload Now that gRPCs no longer crash the main thread, we can just greedily attempt all the backends we have available. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-07-20 00:40:28 +02:00
Ettore Di Giacinto	3feb632eb4	refactor: rename "llama-master" and "llama" (#776) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-07-20 00:36:16 +02:00
Ettore Di Giacinto	236497e331	feat: resolve JSONSchema refs (planners) (#774)	2023-07-19 22:56:13 +02:00
Ettore Di Giacinto	6352448b72	feat: add llama-master backend (#752) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-07-17 23:58:15 +02:00
Ettore Di Giacinto	1d0ed95a54	feat: move other backends to grpc This finally makes everything more consistent Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-07-15 01:19:43 +02:00
Ettore Di Giacinto	5dcfdbe51d	feat: various refactorings Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-07-15 01:19:43 +02:00
Ettore Di Giacinto	f2f1d7fe72	feat: use gRPC for transformers Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-07-15 01:19:43 +02:00
Ettore Di Giacinto	ae533cadef	feat: move gpt4all to a grpc service Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-07-15 01:19:43 +02:00
Ettore Di Giacinto	58f6aab637	feat: move llama to a grpc Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2023-07-15 01:19:43 +02:00

96 Commits