Commit Graph

1372 Commits

Author SHA1 Message Date
Koen Farell
36da11a0ee
deps: Update version of vLLM to add support of Cohere Command_R model in vLLM inference (#1975)
* Update vLLM version to add support of Command_R

Signed-off-by: Koen Farell <hellios.dt@gmail.com>

* fix: Fixed vllm version from requirements

Signed-off-by: Koen Farell <hellios.dt@gmail.com>

* chore: Update transformers-rocm.yml

Signed-off-by: Koen Farell <hellios.dt@gmail.com>

* chore: Update transformers.yml version of vllm

Signed-off-by: Koen Farell <hellios.dt@gmail.com>

---------

Signed-off-by: Koen Farell <hellios.dt@gmail.com>
2024-04-10 11:25:26 +00:00
Sebastian.W
d23e73b118
fix(autogptq): do not use_triton with qwen-vl (#1985)
* Enhance autogptq backend to support VL models

* update dependencies for autogptq

* remove redundant auto-gptq dependency

* Convert base64 to image_url for Qwen-VL model

* implemented model inference for qwen-vl

* remove user prompt from generated answer

* fixed write image error

* fixed use_triton issue when loading Qwen-VL model

---------

Co-authored-by: Binghua Wu <bingwu@estee.com>
2024-04-10 10:36:10 +00:00
Ettore Di Giacinto
d692b2c32a
ci: push latest images for dockerhub (#1984)
Fixes: #1983

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-04-10 10:31:59 +02:00
LocalAI [bot]
7e2f8bb408
⬆️ Update ggerganov/whisper.cpp (#1980)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-10 09:08:00 +02:00
LocalAI [bot]
951e39d36c
⬆️ Update ggerganov/llama.cpp (#1979)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-10 09:07:41 +02:00
LocalAI [bot]
aeb3f835ae
⬆️ Update docs version mudler/LocalAI (#1978)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-10 09:07:21 +02:00
Ettore Di Giacinto
cc3d601836
ci: fixup latest image push
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-04-09 09:49:11 +02:00
Ettore Di Giacinto
2bbb221fb1
tests(petals): temp disable
2024-04-08 21:28:59 +00:00
LocalAI [bot]
195be10050
⬆️ Update ggerganov/llama.cpp (#1973)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-08 23:26:52 +02:00
fakezeta
a38618db02
fix regression #1971 (#1972)
fixes regression #1971 introduced by intel_extension_for_transformers==1.4
2024-04-08 22:33:51 +02:00
LocalAI [bot]
efcca15d3f
⬆️ Update ggerganov/llama.cpp (#1970)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-08 08:38:47 +02:00
LocalAI [bot]
a153b628c2
⬆️ Update ggerganov/whisper.cpp (#1969)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-08 08:38:17 +02:00
Ettore Di Giacinto
f36d86ba6d
fix(hermes-2-pro-mistral): correct dashes in template to suppress newlines (#1966)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-07 18:23:47 +02:00
Ettore Di Giacinto
74492a81c7
doc(quickstart): fix typo
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-04-07 11:06:35 +02:00
LocalAI [bot]
ed13782986
⬆️ Update ggerganov/llama.cpp (#1964)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-07 10:32:10 +02:00
Ettore Di Giacinto
8342553214
fix(llama.cpp): set better defaults for llama.cpp (#1961)
fix(defaults): set better defaults for llama.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-06 22:56:45 +02:00
LocalAI [bot]
8aa5f5a660
⬆️ Update ggerganov/llama.cpp (#1960)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-06 19:15:25 +00:00
LocalAI [bot]
b2d9e3f704
⬆️ Update ggerganov/llama.cpp (#1959)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-05 08:41:55 +02:00
LocalAI [bot]
f744e1f931
⬆️ Update ggerganov/whisper.cpp (#1958)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-05 08:41:35 +02:00
cryptk
b85dad0286
feat: first pass at improving logging (#1956)
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-04 09:24:22 +02:00
LocalAI [bot]
3851b51d98
⬆️ Update ggerganov/llama.cpp (#1953)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-04 00:27:57 +02:00
Ettore Di Giacinto
ff77d3bc22
fix(seed): generate random seed per-request if -1 is set (#1952)
* fix(seed): generate random seed per-request if -1 is set

Also update ci with new workflows and allow the aio tests to run with an
api key

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* docs(openvino): Add OpenVINO example

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-03 22:25:47 +02:00
Ettore Di Giacinto
93cfec3c32
ci: correctly tag latest and aio images
2024-04-03 11:30:23 +02:00
Ettore Di Giacinto
89560ef87f
fix(ci): manually tag latest images (#1948)
fix(ci): manually tag images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-02 19:25:46 +02:00
Ettore Di Giacinto
9bc209ba73
fix(welcome): stable model list (#1949)
2024-04-02 19:25:32 +02:00
Ettore Di Giacinto
84e0dc3246
fix(hermes-2-pro-mistral): correct stopwords (#1947)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-02 15:38:00 +02:00
LocalAI [bot]
4d4d76114d
⬆️ Update ggerganov/llama.cpp (#1941)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-02 09:16:04 +02:00
cryptk
86bc5f1350
fix: use exec in entrypoint scripts to fix signal handling (#1943)
2024-04-02 09:15:44 +02:00
Ettore Di Giacinto
e8f02c083f
fix(functions): respect when selected from string (#1940)
* fix(functions): respect when selected from string

* fix(toolschoice): decode both string and objects
2024-04-01 19:39:54 +02:00
Ettore Di Giacinto
ebb1fcedea
fix(hermes-2-pro-mistral): add stopword for toolcall (#1939)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-01 11:48:35 +02:00
LocalAI [bot]
66f90f8dc1
⬆️ Update ggerganov/llama.cpp (#1937)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-01 08:59:23 +02:00
Ettore Di Giacinto
3c778b538a
Update phi-2-orange.yaml
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-03-31 13:06:41 +02:00
Ettore Di Giacinto
35290e146b
fix(grammar): respect JSONmode and grammar from user input (#1935)
* fix(grammar): Fix JSON mode and custom grammar

* tests(aio): add jsonmode test

* tests(aio): add functioncall test

* fix(aio): use hermes-2-pro-mistral as llm for CPU profile

* add phi-2-orange
2024-03-31 13:04:09 +02:00
LocalAI [bot]
784657a652
⬆️ Update ggerganov/llama.cpp (#1934)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-31 00:27:38 +01:00
LocalAI [bot]
831efa8893
⬆️ Update ggerganov/whisper.cpp (#1933)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-31 00:27:16 +01:00
Ettore Di Giacinto
957f428fd5
fix(tools): correctly render tools response in templates (#1932)
* fix(tools): allow to correctly display both Functions and Tools

* models(hermes-2-pro): correctly display function results
2024-03-30 19:02:07 +01:00
Ettore Di Giacinto
61e5e6bc36
fix(swagger): do not specify a host (#1930)
In this way the requests are redirected to the host used by the client
to perform the request.
2024-03-30 12:04:41 +01:00
Ettore Di Giacinto
eab4a91a9b
fix(aio): correctly detect intel systems (#1931)
Also rename SIZE to PROFILE
2024-03-30 12:04:32 +01:00
LocalAI [bot]
2bba62ca4d
⬆️ Update ggerganov/llama.cpp (#1928)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-29 22:52:01 +00:00
Ettore Di Giacinto
bcdc83b46d
Update quickstart.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-03-29 23:00:06 +01:00
Ettore Di Giacinto
92fbdfd06f
feat(swagger): update (#1929)
2024-03-29 22:48:58 +01:00
cryptk
93702e39d4
feat(build): adjust number of parallel make jobs (#1915)
* feat(build): adjust number of parallel make jobs

* fix: update make on MacOS from brew to support --output-sync argument

* fix: cache grpc with version as part of key to improve validity of cache hits

* fix: use gmake for tests-apple to use the updated GNU make version

* fix: actually use the new make version for tests-apple

* feat: parallelize tests-extra

* feat: attempt to cache grpc build for docker images

* fix: don't quote GRPC version

* fix: don't cache go modules, we have limited cache space, better used elsewhere

* fix: release with the same version of go that we test with

* fix: don't fail on exporting cache layers

* fix: remove deprecated BUILD_GRPC docker arg from Makefile
2024-03-29 22:32:40 +01:00
LocalAI [bot]
a7fc89c207
⬆️ Update ggerganov/whisper.cpp (#1927)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-29 22:29:50 +01:00
Ettore Di Giacinto
123a5a2e16
feat(swagger): Add swagger API doc (#1926)
* makefile(build): add minimal and api build target

* feat(swagger): Add swagger
2024-03-29 22:29:33 +01:00
LocalAI [bot]
ab2f403dd0
⬆️ Update ggerganov/whisper.cpp (#1924)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-29 00:13:59 +01:00
LocalAI [bot]
b9c5e14e2c
⬆️ Update ggerganov/llama.cpp (#1923)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-29 00:13:38 +01:00
Ettore Di Giacinto
bf65ed6eb8
feat(webui): add partials, show backends associated to models (#1922)
* feat(webui): add partials, show backends associated to models

* fix(auth): put assistant and backend under auth
2024-03-28 21:52:52 +01:00
Ettore Di Giacinto
4e79294f97
Update README.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-03-28 19:52:40 +01:00
Ettore Di Giacinto
8477e8fac3
Update quickstart.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-03-28 18:28:30 +01:00
Ettore Di Giacinto
13ccd2afef
docs(aio-usage): update docs to show examples (#1921)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-03-28 18:16:58 +01:00