From fec01d9e6955a7347cd05c317bd24fd047da0cc6 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 21:00:35 +0000 Subject: [PATCH 001/122] chore(deps): Bump docs/themes/hugo-theme-relearn from `f696f60` to `d5a0ee0` (#3558) chore(deps): Bump docs/themes/hugo-theme-relearn Bumps [docs/themes/hugo-theme-relearn](https://github.com/McShelby/hugo-theme-relearn) from `f696f60` to `d5a0ee0`. - [Release notes](https://github.com/McShelby/hugo-theme-relearn/releases) - [Commits](https://github.com/McShelby/hugo-theme-relearn/compare/f696f60f4e44e18a34512b895a7b65a72c801bd8...d5a0ee04ad986394d6d2f1e1a57f2334d24bf317) --- updated-dependencies: - dependency-name: docs/themes/hugo-theme-relearn dependency-type: direct:production ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- docs/themes/hugo-theme-relearn | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/themes/hugo-theme-relearn b/docs/themes/hugo-theme-relearn index f696f60f..d5a0ee04 160000 --- a/docs/themes/hugo-theme-relearn +++ b/docs/themes/hugo-theme-relearn @@ -1 +1 @@ -Subproject commit f696f60f4e44e18a34512b895a7b65a72c801bd8 +Subproject commit d5a0ee04ad986394d6d2f1e1a57f2334d24bf317 From 2edc732c3398599b1d86a8930286ccc9fd3762e3 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 21:23:06 +0000 Subject: [PATCH 002/122] chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/coqui (#3554) chore(deps): Bump setuptools in /backend/python/coqui Bumps [setuptools](https://github.com/pypa/setuptools) from 72.1.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v72.1.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/coqui/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/coqui/requirements-intel.txt b/backend/python/coqui/requirements-intel.txt index 002a55c3..c0e4dcaa 100644 --- a/backend/python/coqui/requirements-intel.txt +++ b/backend/python/coqui/requirements-intel.txt @@ -3,6 +3,6 @@ intel-extension-for-pytorch torch torchaudio optimum[openvino] -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 transformers accelerate \ No newline at end of file From a5ce987bdbd98b6c8659a92dfbcc9d99bbf52f5f Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 21:35:10 +0000 Subject: [PATCH 003/122] chore(deps): Bump langchain from 0.2.16 to 0.3.0 in /examples/functions (#3559) Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.2.16 to 0.3.0. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.2.16...langchain==0.3.0) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/functions/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/functions/requirements.txt b/examples/functions/requirements.txt index 8258885a..9dd6818f 100644 --- a/examples/functions/requirements.txt +++ b/examples/functions/requirements.txt @@ -1,2 +1,2 @@ -langchain==0.2.16 +langchain==0.3.0 openai==1.44.0 From 149cc1eb13d3bd9af76ed13d72bff02cc685e601 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 21:44:34 +0000 Subject: [PATCH 004/122] chore(deps): Bump openai from 1.44.1 to 1.45.1 in /examples/langchain-chroma (#3556) chore(deps): Bump openai in /examples/langchain-chroma Bumps [openai](https://github.com/openai/openai-python) from 1.44.1 to 1.45.1. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.44.1...v1.45.1) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index c9bce6e9..3edb570c 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.2.16 -openai==1.44.1 +openai==1.45.1 chromadb==0.5.5 llama-index==0.11.7 \ No newline at end of file From 09c7d8d4587f9e09cd6e50534b2677abc85db1ee Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 21:46:26 +0000 Subject: [PATCH 005/122] chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/autogptq (#3553) chore(deps): Bump setuptools in /backend/python/autogptq Bumps [setuptools](https://github.com/pypa/setuptools) from 72.1.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v72.1.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/autogptq/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/autogptq/requirements-intel.txt b/backend/python/autogptq/requirements-intel.txt index 755e19d8..d5e0173e 100644 --- a/backend/python/autogptq/requirements-intel.txt +++ b/backend/python/autogptq/requirements-intel.txt @@ -2,4 +2,4 @@ intel-extension-for-pytorch torch optimum[openvino] -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From 12a8d0e46fbd03f8d550dc41ea6325d07d66cd00 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 21:57:16 +0000 Subject: [PATCH 006/122] chore(deps): Bump securego/gosec from 2.21.0 to 2.21.2 (#3561) Bumps [securego/gosec](https://github.com/securego/gosec) from 2.21.0 to 2.21.2. - [Release notes](https://github.com/securego/gosec/releases) - [Changelog](https://github.com/securego/gosec/blob/master/.goreleaser.yml) - [Commits](https://github.com/securego/gosec/compare/v2.21.0...v2.21.2) --- updated-dependencies: - dependency-name: securego/gosec dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- .github/workflows/secscan.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/secscan.yaml b/.github/workflows/secscan.yaml index db9db586..08d7dfc6 100644 --- a/.github/workflows/secscan.yaml +++ b/.github/workflows/secscan.yaml @@ -18,7 +18,7 @@ jobs: if: ${{ github.actor != 'dependabot[bot]' }} - name: Run Gosec Security Scanner if: ${{ github.actor != 'dependabot[bot]' }} - uses: securego/gosec@v2.21.0 + uses: securego/gosec@v2.21.2 with: # we let the report trigger content trigger a failure using the GitHub Security features. args: '-no-fail -fmt sarif -out results.sarif ./...' From afb5bbc1b88f71454a8b6081f8f8d46ad0eb9b35 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 23:03:06 +0000 Subject: [PATCH 007/122] chore(deps): Bump setuptools from 69.5.1 to 75.1.0 in /backend/python/transformers-musicgen (#3564) chore(deps): Bump setuptools in /backend/python/transformers-musicgen Bumps [setuptools](https://github.com/pypa/setuptools) from 69.5.1 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v69.5.1...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/transformers-musicgen/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/transformers-musicgen/requirements-intel.txt b/backend/python/transformers-musicgen/requirements-intel.txt index 89bfa6a2..608d6939 100644 --- a/backend/python/transformers-musicgen/requirements-intel.txt +++ b/backend/python/transformers-musicgen/requirements-intel.txt @@ -4,4 +4,4 @@ transformers accelerate torch optimum[openvino] -setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From 30fe16310035d3942368745f17d1673c889a4ddc Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 23:13:09 +0000 Subject: [PATCH 008/122] chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/parler-tts (#3565) chore(deps): Bump setuptools in /backend/python/parler-tts Bumps [setuptools](https://github.com/pypa/setuptools) from 72.1.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v72.1.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/parler-tts/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/parler-tts/requirements-intel.txt b/backend/python/parler-tts/requirements-intel.txt index 002a55c3..c0e4dcaa 100644 --- a/backend/python/parler-tts/requirements-intel.txt +++ b/backend/python/parler-tts/requirements-intel.txt @@ -3,6 +3,6 @@ intel-extension-for-pytorch torch torchaudio optimum[openvino] -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 transformers accelerate \ No newline at end of file From 5356b81b7f112c57dcc8a215b1f14c86e7ee3f40 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 23:40:39 +0000 Subject: [PATCH 009/122] chore(deps): Bump sentence-transformers from 3.0.1 to 3.1.0 in /backend/python/sentencetransformers (#3566) chore(deps): Bump sentence-transformers Bumps [sentence-transformers](https://github.com/UKPLab/sentence-transformers) from 3.0.1 to 3.1.0. - [Release notes](https://github.com/UKPLab/sentence-transformers/releases) - [Commits](https://github.com/UKPLab/sentence-transformers/compare/v3.0.1...v3.1.0) --- updated-dependencies: - dependency-name: sentence-transformers dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/sentencetransformers/requirements-cpu.txt | 2 +- backend/python/sentencetransformers/requirements-cublas11.txt | 2 +- backend/python/sentencetransformers/requirements-cublas12.txt | 2 +- backend/python/sentencetransformers/requirements-hipblas.txt | 2 +- backend/python/sentencetransformers/requirements-intel.txt | 2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/backend/python/sentencetransformers/requirements-cpu.txt b/backend/python/sentencetransformers/requirements-cpu.txt index cd9924ef..f88de1e4 100644 --- a/backend/python/sentencetransformers/requirements-cpu.txt +++ b/backend/python/sentencetransformers/requirements-cpu.txt @@ -2,5 +2,5 @@ torch accelerate transformers bitsandbytes -sentence-transformers==3.0.1 +sentence-transformers==3.1.0 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-cublas11.txt b/backend/python/sentencetransformers/requirements-cublas11.txt index 1131f066..57caf1a1 100644 --- a/backend/python/sentencetransformers/requirements-cublas11.txt +++ b/backend/python/sentencetransformers/requirements-cublas11.txt @@ -1,5 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/cu118 torch accelerate -sentence-transformers==3.0.1 +sentence-transformers==3.1.0 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-cublas12.txt b/backend/python/sentencetransformers/requirements-cublas12.txt index 2936e17b..834fa6a4 100644 --- a/backend/python/sentencetransformers/requirements-cublas12.txt +++ b/backend/python/sentencetransformers/requirements-cublas12.txt @@ -1,4 +1,4 @@ torch accelerate -sentence-transformers==3.0.1 +sentence-transformers==3.1.0 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-hipblas.txt b/backend/python/sentencetransformers/requirements-hipblas.txt index 3b187c68..98a0a41b 100644 --- a/backend/python/sentencetransformers/requirements-hipblas.txt +++ b/backend/python/sentencetransformers/requirements-hipblas.txt @@ -1,5 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 torch accelerate -sentence-transformers==3.0.1 +sentence-transformers==3.1.0 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-intel.txt b/backend/python/sentencetransformers/requirements-intel.txt index 806e3d47..5948910d 100644 --- a/backend/python/sentencetransformers/requirements-intel.txt +++ b/backend/python/sentencetransformers/requirements-intel.txt @@ -4,5 +4,5 @@ torch optimum[openvino] setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 accelerate -sentence-transformers==3.0.1 +sentence-transformers==3.1.0 transformers \ No newline at end of file From c866b77586f25340d98a9fbb2ad16e22d5e4d577 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 00:02:42 +0000 Subject: [PATCH 010/122] chore(deps): Bump llama-index from 0.11.7 to 0.11.9 in /examples/chainlit (#3567) chore(deps): Bump llama-index in /examples/chainlit Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.11.7 to 0.11.9. 
- [Release notes](https://github.com/run-llama/llama_index/releases) - [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md) - [Commits](https://github.com/run-llama/llama_index/compare/v0.11.7...v0.11.9) --- updated-dependencies: - dependency-name: llama-index dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/chainlit/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/chainlit/requirements.txt b/examples/chainlit/requirements.txt index 69212e28..df8bea7f 100644 --- a/examples/chainlit/requirements.txt +++ b/examples/chainlit/requirements.txt @@ -1,4 +1,4 @@ -llama_index==0.11.7 +llama_index==0.11.9 requests==2.32.3 weaviate_client==4.6.7 transformers From 42d6b9e0ccc75fd3ecfb6275b0fe50236fdfc9f1 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 00:11:15 +0000 Subject: [PATCH 011/122] chore(deps): Bump weaviate-client from 4.6.7 to 4.8.1 in /examples/chainlit (#3568) chore(deps): Bump weaviate-client in /examples/chainlit Bumps [weaviate-client](https://github.com/weaviate/weaviate-python-client) from 4.6.7 to 4.8.1. - [Release notes](https://github.com/weaviate/weaviate-python-client/releases) - [Changelog](https://github.com/weaviate/weaviate-python-client/blob/main/docs/changelog.rst) - [Commits](https://github.com/weaviate/weaviate-python-client/compare/v4.6.7...v4.8.1) --- updated-dependencies: - dependency-name: weaviate-client dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/chainlit/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/chainlit/requirements.txt b/examples/chainlit/requirements.txt index df8bea7f..1fe9356a 100644 --- a/examples/chainlit/requirements.txt +++ b/examples/chainlit/requirements.txt @@ -1,6 +1,6 @@ llama_index==0.11.9 requests==2.32.3 -weaviate_client==4.6.7 +weaviate_client==4.8.1 transformers torch chainlit From abc27e0dc49dfba0ef5436c08acf5c5959f354ea Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 00:51:55 +0000 Subject: [PATCH 012/122] chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/vall-e-x (#3570) chore(deps): Bump setuptools in /backend/python/vall-e-x Bumps [setuptools](https://github.com/pypa/setuptools) from 72.1.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v72.1.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/vall-e-x/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/vall-e-x/requirements-intel.txt b/backend/python/vall-e-x/requirements-intel.txt index 6185314f..adbabeac 100644 --- a/backend/python/vall-e-x/requirements-intel.txt +++ b/backend/python/vall-e-x/requirements-intel.txt @@ -4,4 +4,4 @@ accelerate torch torchaudio optimum[openvino] -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From 36e19928eb2ad4f4976454873f101112d131b564 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 01:14:39 +0000 Subject: [PATCH 013/122] chore(deps): Bump greenlet from 3.0.3 to 3.1.0 in /examples/langchain/langchainpy-localai-example (#3571) chore(deps): Bump greenlet Bumps [greenlet](https://github.com/python-greenlet/greenlet) from 3.0.3 to 3.1.0. - [Changelog](https://github.com/python-greenlet/greenlet/blob/master/CHANGES.rst) - [Commits](https://github.com/python-greenlet/greenlet/compare/3.0.3...3.1.0) --- updated-dependencies: - dependency-name: greenlet dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 75323005..1bd6b841 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -8,7 +8,7 @@ colorama==0.4.6 dataclasses-json==0.6.7 debugpy==1.8.2 frozenlist==1.4.1 -greenlet==3.0.3 +greenlet==3.1.0 idna==3.8 langchain==0.2.16 langchain-community==0.2.16 From 2394f7833fab174663231b722e5de964446d2cbf Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 02:28:05 +0000 Subject: [PATCH 014/122] chore(deps): Bump setuptools from 70.3.0 to 75.1.0 in /backend/python/diffusers (#3575) chore(deps): Bump setuptools in /backend/python/diffusers Bumps [setuptools](https://github.com/pypa/setuptools) from 70.3.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v70.3.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/diffusers/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/diffusers/requirements-intel.txt b/backend/python/diffusers/requirements-intel.txt index 1cc2e2a2..566278a8 100644 --- a/backend/python/diffusers/requirements-intel.txt +++ b/backend/python/diffusers/requirements-intel.txt @@ -3,7 +3,7 @@ intel-extension-for-pytorch torch torchvision optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 diffusers opencv-python transformers From 06c83398624549fba12e5ec975c2c25a0e7e649a Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 02:32:33 +0000 Subject: [PATCH 015/122] chore(deps): Bump setuptools from 70.3.0 to 75.1.0 in /backend/python/bark (#3574) chore(deps): Bump setuptools in /backend/python/bark Bumps [setuptools](https://github.com/pypa/setuptools) from 70.3.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v70.3.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/bark/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/bark/requirements-intel.txt b/backend/python/bark/requirements-intel.txt index 9feb6eef..c0e4dcaa 100644 --- a/backend/python/bark/requirements-intel.txt +++ b/backend/python/bark/requirements-intel.txt @@ -3,6 +3,6 @@ intel-extension-for-pytorch torch torchaudio optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 transformers accelerate \ No newline at end of file From a9a3a07c3bf22b2a3741471f6122876c65d8909a Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 03:24:30 +0000 Subject: [PATCH 016/122] chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/rerankers (#3578) chore(deps): Bump setuptools in /backend/python/rerankers Bumps [setuptools](https://github.com/pypa/setuptools) from 72.1.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v72.1.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/rerankers/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/rerankers/requirements-intel.txt b/backend/python/rerankers/requirements-intel.txt index 1a39cf4f..e6bb4cc7 100644 --- a/backend/python/rerankers/requirements-intel.txt +++ b/backend/python/rerankers/requirements-intel.txt @@ -5,4 +5,4 @@ accelerate torch rerankers[transformers] optimum[openvino] -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From db1159b6511e8fa09e594f9db0fec6ab4e142468 Mon Sep 17 00:00:00 2001 From: Dave Date: Mon, 16 Sep 2024 23:29:07 -0400 Subject: [PATCH 017/122] feat: auth v2 - supersedes #2894 (#3476) feat: auth v2 - supersedes #2894, metrics to follow later Signed-off-by: Dave Lee --- core/cli/run.go | 56 ++++++++++--------- core/config/application_config.go | 40 +++++++++++-- core/http/app.go | 49 +++++----------- core/http/middleware/auth.go | 93 +++++++++++++++++++++++++++++++ core/http/routes/elevenlabs.go | 7 +-- core/http/routes/jina.go | 3 +- core/http/routes/localai.go | 41 +++++++------- core/http/routes/openai.go | 89 +++++++++++++++-------- core/http/routes/ui.go | 41 +++++++------- go.mod | 1 + go.sum | 2 + 11 files changed, 264 insertions(+), 158 deletions(-) create mode 100644 core/http/middleware/auth.go diff --git a/core/cli/run.go b/core/cli/run.go index 55ae0fd5..afb7204c 100644 --- a/core/cli/run.go +++ b/core/cli/run.go @@ -41,31 +41,34 @@ type RunCMD struct { Threads int `env:"LOCALAI_THREADS,THREADS" short:"t" help:"Number of threads used for parallel computation. Usage of the number of physical cores in the system is suggested" group:"performance"` ContextSize int `env:"LOCALAI_CONTEXT_SIZE,CONTEXT_SIZE" default:"512" help:"Default context size for models" group:"performance"` Address string `env:"LOCALAI_ADDRESS,ADDRESS" default:":8080" help:"Bind address for the API server" group:"api"` CORS bool `env:"LOCALAI_CORS,CORS" help:"" group:"api"` CORSAllowOrigins string `env:"LOCALAI_CORS_ALLOW_ORIGINS,CORS_ALLOW_ORIGINS" group:"api"` LibraryPath string `env:"LOCALAI_LIBRARY_PATH,LIBRARY_PATH" help:"Path to the library directory (for e.g. external libraries used by backends)" default:"/usr/share/local-ai/libs" group:"backends"` CSRF bool `env:"LOCALAI_CSRF" help:"Enables fiber CSRF middleware" group:"api"` UploadLimit int `env:"LOCALAI_UPLOAD_LIMIT,UPLOAD_LIMIT" default:"15" help:"Default upload-limit in MB" group:"api"` APIKeys []string `env:"LOCALAI_API_KEY,API_KEY" help:"List of API Keys to enable API authentication. When this is set, all the requests must be authenticated with one of these API keys" group:"api"` DisableWebUI bool `env:"LOCALAI_DISABLE_WEBUI,DISABLE_WEBUI" default:"false" help:"Disable webui" group:"api"` DisablePredownloadScan bool `env:"LOCALAI_DISABLE_PREDOWNLOAD_SCAN" help:"If true, disables the best-effort security scanner before downloading any files." group:"hardening" default:"false"` OpaqueErrors bool `env:"LOCALAI_OPAQUE_ERRORS" default:"false" help:"If true, all error responses are replaced with blank 500 errors. This is intended only for hardening against information leaks and is normally not recommended."
group:"hardening"` - Peer2Peer bool `env:"LOCALAI_P2P,P2P" name:"p2p" default:"false" help:"Enable P2P mode" group:"p2p"` - Peer2PeerDHTInterval int `env:"LOCALAI_P2P_DHT_INTERVAL,P2P_DHT_INTERVAL" default:"360" name:"p2p-dht-interval" help:"Interval for DHT refresh (used during token generation)" group:"p2p"` - Peer2PeerOTPInterval int `env:"LOCALAI_P2P_OTP_INTERVAL,P2P_OTP_INTERVAL" default:"9000" name:"p2p-otp-interval" help:"Interval for OTP refresh (used during token generation)" group:"p2p"` - Peer2PeerToken string `env:"LOCALAI_P2P_TOKEN,P2P_TOKEN,TOKEN" name:"p2ptoken" help:"Token for P2P mode (optional)" group:"p2p"` - Peer2PeerNetworkID string `env:"LOCALAI_P2P_NETWORK_ID,P2P_NETWORK_ID" help:"Network ID for P2P mode, can be set arbitrarly by the user for grouping a set of instances" group:"p2p"` - ParallelRequests bool `env:"LOCALAI_PARALLEL_REQUESTS,PARALLEL_REQUESTS" help:"Enable backends to handle multiple requests in parallel if they support it (e.g.: llama.cpp or vllm)" group:"backends"` - SingleActiveBackend bool `env:"LOCALAI_SINGLE_ACTIVE_BACKEND,SINGLE_ACTIVE_BACKEND" help:"Allow only one backend to be run at a time" group:"backends"` - PreloadBackendOnly bool `env:"LOCALAI_PRELOAD_BACKEND_ONLY,PRELOAD_BACKEND_ONLY" default:"false" help:"Do not launch the API services, only the preloaded models / backends are started (useful for multi-node setups)" group:"backends"` - ExternalGRPCBackends []string `env:"LOCALAI_EXTERNAL_GRPC_BACKENDS,EXTERNAL_GRPC_BACKENDS" help:"A list of external grpc backends" group:"backends"` - EnableWatchdogIdle bool `env:"LOCALAI_WATCHDOG_IDLE,WATCHDOG_IDLE" default:"false" help:"Enable watchdog for stopping backends that are idle longer than the watchdog-idle-timeout" group:"backends"` - WatchdogIdleTimeout string `env:"LOCALAI_WATCHDOG_IDLE_TIMEOUT,WATCHDOG_IDLE_TIMEOUT" default:"15m" help:"Threshold beyond which an idle backend should be stopped" group:"backends"` - EnableWatchdogBusy bool `env:"LOCALAI_WATCHDOG_BUSY,WATCHDOG_BUSY" default:"false" help:"Enable watchdog for stopping backends that are busy longer than the watchdog-busy-timeout" group:"backends"` - WatchdogBusyTimeout string `env:"LOCALAI_WATCHDOG_BUSY_TIMEOUT,WATCHDOG_BUSY_TIMEOUT" default:"5m" help:"Threshold beyond which a busy backend should be stopped" group:"backends"` - Federated bool `env:"LOCALAI_FEDERATED,FEDERATED" help:"Enable federated instance" group:"federated"` - DisableGalleryEndpoint bool `env:"LOCALAI_DISABLE_GALLERY_ENDPOINT,DISABLE_GALLERY_ENDPOINT" help:"Disable the gallery endpoints" group:"api"` + Address string `env:"LOCALAI_ADDRESS,ADDRESS" default:":8080" help:"Bind address for the API server" group:"api"` + CORS bool `env:"LOCALAI_CORS,CORS" help:"" group:"api"` + CORSAllowOrigins string `env:"LOCALAI_CORS_ALLOW_ORIGINS,CORS_ALLOW_ORIGINS" group:"api"` + LibraryPath string `env:"LOCALAI_LIBRARY_PATH,LIBRARY_PATH" help:"Path to the library directory (for e.g. external libraries used by backends)" default:"/usr/share/local-ai/libs" group:"backends"` + CSRF bool `env:"LOCALAI_CSRF" help:"Enables fiber CSRF middleware" group:"api"` + UploadLimit int `env:"LOCALAI_UPLOAD_LIMIT,UPLOAD_LIMIT" default:"15" help:"Default upload-limit in MB" group:"api"` + APIKeys []string `env:"LOCALAI_API_KEY,API_KEY" help:"List of API Keys to enable API authentication. 
When this is set, all the requests must be authenticated with one of these API keys" group:"api"` + DisableWebUI bool `env:"LOCALAI_DISABLE_WEBUI,DISABLE_WEBUI" default:"false" help:"Disable webui" group:"api"` + DisablePredownloadScan bool `env:"LOCALAI_DISABLE_PREDOWNLOAD_SCAN" help:"If true, disables the best-effort security scanner before downloading any files." group:"hardening" default:"false"` + OpaqueErrors bool `env:"LOCALAI_OPAQUE_ERRORS" default:"false" help:"If true, all error responses are replaced with blank 500 errors. This is intended only for hardening against information leaks and is normally not recommended." group:"hardening"` + UseSubtleKeyComparison bool `env:"LOCALAI_SUBTLE_KEY_COMPARISON" default:"false" help:"If true, API Key validation comparisons will be performed using constant-time comparisons rather than simple equality. This trades off performance on each request for resiliency against timing attacks." group:"hardening"` + DisableApiKeyRequirementForHttpGet bool `env:"LOCALAI_DISABLE_API_KEY_REQUIREMENT_FOR_HTTP_GET" default:"false" help:"If true, a valid API key is not required to issue GET requests to portions of the web ui. This should only be enabled in secure testing environments" group:"hardening"` + HttpGetExemptedEndpoints []string `env:"LOCALAI_HTTP_GET_EXEMPTED_ENDPOINTS" default:"^/$,^/browse/?$,^/talk/?$,^/p2p/?$,^/chat/?$,^/text2image/?$,^/tts/?$,^/static/.*$,^/swagger.*$" help:"If LOCALAI_DISABLE_API_KEY_REQUIREMENT_FOR_HTTP_GET is overridden to true, this is the list of endpoints to exempt. Only adjust this in case of a security incident or as a result of a personal security posture review" group:"hardening"` + Peer2Peer bool `env:"LOCALAI_P2P,P2P" name:"p2p" default:"false" help:"Enable P2P mode" group:"p2p"` + Peer2PeerDHTInterval int `env:"LOCALAI_P2P_DHT_INTERVAL,P2P_DHT_INTERVAL" default:"360" name:"p2p-dht-interval" help:"Interval for DHT refresh (used during token generation)" group:"p2p"` + Peer2PeerOTPInterval int `env:"LOCALAI_P2P_OTP_INTERVAL,P2P_OTP_INTERVAL" default:"9000" name:"p2p-otp-interval" help:"Interval for OTP refresh (used during token generation)" group:"p2p"` + Peer2PeerToken string `env:"LOCALAI_P2P_TOKEN,P2P_TOKEN,TOKEN" name:"p2ptoken" help:"Token for P2P mode (optional)" group:"p2p"` + Peer2PeerNetworkID string `env:"LOCALAI_P2P_NETWORK_ID,P2P_NETWORK_ID" help:"Network ID for P2P mode, can be set arbitrarily by the user for grouping a set of instances" group:"p2p"` + ParallelRequests bool `env:"LOCALAI_PARALLEL_REQUESTS,PARALLEL_REQUESTS" help:"Enable backends to handle multiple requests in parallel if they support it (e.g.: llama.cpp or vllm)" group:"backends"` + SingleActiveBackend bool `env:"LOCALAI_SINGLE_ACTIVE_BACKEND,SINGLE_ACTIVE_BACKEND" help:"Allow only one backend to be run at a time" group:"backends"` + PreloadBackendOnly bool `env:"LOCALAI_PRELOAD_BACKEND_ONLY,PRELOAD_BACKEND_ONLY" default:"false" help:"Do not launch the API services, only the preloaded models / backends are started (useful for multi-node setups)" group:"backends"` + ExternalGRPCBackends []string `env:"LOCALAI_EXTERNAL_GRPC_BACKENDS,EXTERNAL_GRPC_BACKENDS" help:"A list of external grpc backends" group:"backends"` + EnableWatchdogIdle bool `env:"LOCALAI_WATCHDOG_IDLE,WATCHDOG_IDLE" default:"false" help:"Enable watchdog for stopping backends that are idle longer than the watchdog-idle-timeout" group:"backends"` + WatchdogIdleTimeout string `env:"LOCALAI_WATCHDOG_IDLE_TIMEOUT,WATCHDOG_IDLE_TIMEOUT" default:"15m" help:"Threshold beyond which
an idle backend should be stopped" group:"backends"` + EnableWatchdogBusy bool `env:"LOCALAI_WATCHDOG_BUSY,WATCHDOG_BUSY" default:"false" help:"Enable watchdog for stopping backends that are busy longer than the watchdog-busy-timeout" group:"backends"` + WatchdogBusyTimeout string `env:"LOCALAI_WATCHDOG_BUSY_TIMEOUT,WATCHDOG_BUSY_TIMEOUT" default:"5m" help:"Threshold beyond which a busy backend should be stopped" group:"backends"` + Federated bool `env:"LOCALAI_FEDERATED,FEDERATED" help:"Enable federated instance" group:"federated"` + DisableGalleryEndpoint bool `env:"LOCALAI_DISABLE_GALLERY_ENDPOINT,DISABLE_GALLERY_ENDPOINT" help:"Disable the gallery endpoints" group:"api"` } func (r *RunCMD) Run(ctx *cliContext.Context) error { @@ -97,6 +100,9 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error { config.WithModelsURL(append(r.Models, r.ModelArgs...)...), config.WithOpaqueErrors(r.OpaqueErrors), config.WithEnforcedPredownloadScans(!r.DisablePredownloadScan), + config.WithSubtleKeyComparison(r.UseSubtleKeyComparison), + config.WithDisableApiKeyRequirementForHttpGet(r.DisableApiKeyRequirementForHttpGet), + config.WithHttpGetExemptedEndpoints(r.HttpGetExemptedEndpoints), config.WithP2PNetworkID(r.Peer2PeerNetworkID), } diff --git a/core/config/application_config.go b/core/config/application_config.go index 947c4f13..afbf325f 100644 --- a/core/config/application_config.go +++ b/core/config/application_config.go @@ -4,6 +4,7 @@ import ( "context" "embed" "encoding/json" + "regexp" "time" "github.com/mudler/LocalAI/pkg/xsysinfo" @@ -16,7 +17,6 @@ type ApplicationConfig struct { ModelPath string LibPath string UploadLimitMB, Threads, ContextSize int - DisableWebUI bool F16 bool Debug bool ImageDir string @@ -31,11 +31,17 @@ type ApplicationConfig struct { PreloadModelsFromPath string CORSAllowOrigins string ApiKeys []string - EnforcePredownloadScans bool - OpaqueErrors bool P2PToken string P2PNetworkID string + DisableWebUI bool + EnforcePredownloadScans bool + OpaqueErrors bool + UseSubtleKeyComparison bool + DisableApiKeyRequirementForHttpGet bool + HttpGetExemptedEndpoints []*regexp.Regexp + DisableGalleryEndpoint bool + ModelLibraryURL string Galleries []Gallery @@ -57,8 +63,6 @@ type ApplicationConfig struct { ModelsURL []string WatchDogBusyTimeout, WatchDogIdleTimeout time.Duration - - DisableGalleryEndpoint bool } type AppOption func(*ApplicationConfig) @@ -327,6 +331,32 @@ func WithOpaqueErrors(opaque bool) AppOption { } } +func WithSubtleKeyComparison(subtle bool) AppOption { + return func(o *ApplicationConfig) { + o.UseSubtleKeyComparison = subtle + } +} + +func WithDisableApiKeyRequirementForHttpGet(required bool) AppOption { + return func(o *ApplicationConfig) { + o.DisableApiKeyRequirementForHttpGet = required + } +} + +func WithHttpGetExemptedEndpoints(endpoints []string) AppOption { + return func(o *ApplicationConfig) { + o.HttpGetExemptedEndpoints = []*regexp.Regexp{} + for _, epr := range endpoints { + r, err := regexp.Compile(epr) + if err == nil && r != nil { + o.HttpGetExemptedEndpoints = append(o.HttpGetExemptedEndpoints, r) + } else { + log.Warn().Err(err).Str("regex", epr).Msg("Error while compiling HTTP Get Exemption regex, skipping this entry.") + } + } + } +} + // ToConfigLoaderOptions returns a slice of ConfigLoader Option. // Some options defined at the application level are going to be passed as defaults for // all the configuration for the models. 
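As context for the application_config.go hunk above: a minimal standalone Go sketch (not part of the patch) of how the default LOCALAI_HTTP_GET_EXEMPTED_ENDPOINTS patterns behave once WithHttpGetExemptedEndpoints has compiled them and the middleware's GET filter consults them. The sample request paths are illustrative only.

package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Default exemption patterns from the new RunCMD flag; the real option
	// skips invalid entries with a warning instead of failing.
	patterns := []string{
		"^/$", "^/browse/?$", "^/talk/?$", "^/p2p/?$", "^/chat/?$",
		"^/text2image/?$", "^/tts/?$", "^/static/.*$", "^/swagger.*$",
	}
	var exempt []*regexp.Regexp
	for _, p := range patterns {
		if r, err := regexp.Compile(p); err == nil {
			exempt = append(exempt, r)
		}
	}
	// Mirror of the filter logic: a GET request bypasses key auth only when
	// its path matches one of the compiled patterns.
	for _, path := range []string{"/", "/browse", "/v1/models", "/static/app.js"} {
		bypass := false
		for _, rx := range exempt {
			if rx.MatchString(path) {
				bypass = true
				break
			}
		}
		fmt.Printf("GET %s -> exempt from API key: %v\n", path, bypass)
	}
}

Note that the filter only ever fires for GET requests, so every other HTTP method still requires a key whenever keys are configured.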
diff --git a/core/http/app.go b/core/http/app.go index 6eb9c956..fa9cd866 100644 --- a/core/http/app.go +++ b/core/http/app.go @@ -3,13 +3,15 @@ package http import ( "embed" "errors" + "fmt" "net/http" - "strings" + "github.com/dave-gray101/v2keyauth" "github.com/mudler/LocalAI/pkg/utils" "github.com/mudler/LocalAI/core/http/endpoints/localai" "github.com/mudler/LocalAI/core/http/endpoints/openai" + "github.com/mudler/LocalAI/core/http/middleware" "github.com/mudler/LocalAI/core/http/routes" "github.com/mudler/LocalAI/core/config" @@ -137,37 +139,14 @@ func App(cl *config.BackendConfigLoader, ml *model.ModelLoader, appConfig *confi }) } - // Auth middleware checking if API key is valid. If no API key is set, no auth is required. - auth := func(c *fiber.Ctx) error { - if len(appConfig.ApiKeys) == 0 { - return c.Next() - } - - if len(appConfig.ApiKeys) == 0 { - return c.Next() - } - - authHeader := readAuthHeader(c) - if authHeader == "" { - return c.Status(fiber.StatusUnauthorized).JSON(fiber.Map{"message": "Authorization header missing"}) - } - - // If it's a bearer token - authHeaderParts := strings.Split(authHeader, " ") - if len(authHeaderParts) != 2 || authHeaderParts[0] != "Bearer" { - return c.Status(fiber.StatusUnauthorized).JSON(fiber.Map{"message": "Invalid Authorization header format"}) - } - - apiKey := authHeaderParts[1] - for _, key := range appConfig.ApiKeys { - if apiKey == key { - return c.Next() - } - } - - return c.Status(fiber.StatusUnauthorized).JSON(fiber.Map{"message": "Invalid API key"}) + kaConfig, err := middleware.GetKeyAuthConfig(appConfig) + if err != nil || kaConfig == nil { + return nil, fmt.Errorf("failed to create key auth config: %w", err) } + // Auth is applied to _all_ endpoints. No exceptions. Filtering out endpoints to bypass is the role of the Filter property of the KeyAuth Configuration + app.Use(v2keyauth.New(*kaConfig)) + if appConfig.CORS { var c func(ctx *fiber.Ctx) error if appConfig.CORSAllowOrigins == "" { @@ -192,13 +171,13 @@ func App(cl *config.BackendConfigLoader, ml *model.ModelLoader, appConfig *confi galleryService := services.NewGalleryService(appConfig) galleryService.Start(appConfig.Context, cl) - routes.RegisterElevenLabsRoutes(app, cl, ml, appConfig, auth) - routes.RegisterLocalAIRoutes(app, cl, ml, appConfig, galleryService, auth) - routes.RegisterOpenAIRoutes(app, cl, ml, appConfig, auth) + routes.RegisterElevenLabsRoutes(app, cl, ml, appConfig) + routes.RegisterLocalAIRoutes(app, cl, ml, appConfig, galleryService) + routes.RegisterOpenAIRoutes(app, cl, ml, appConfig) if !appConfig.DisableWebUI { - routes.RegisterUIRoutes(app, cl, ml, appConfig, galleryService, auth) + routes.RegisterUIRoutes(app, cl, ml, appConfig, galleryService) } - routes.RegisterJINARoutes(app, cl, ml, appConfig, auth) + routes.RegisterJINARoutes(app, cl, ml, appConfig) httpFS := http.FS(embedDirStatic) diff --git a/core/http/middleware/auth.go b/core/http/middleware/auth.go new file mode 100644 index 00000000..bc8bcf80 --- /dev/null +++ b/core/http/middleware/auth.go @@ -0,0 +1,93 @@ +package middleware + +import ( + "crypto/subtle" + "errors" + + "github.com/dave-gray101/v2keyauth" + "github.com/gofiber/fiber/v2" + "github.com/gofiber/fiber/v2/middleware/keyauth" + "github.com/mudler/LocalAI/core/config" +) + +// This file contains the configuration generators and handler functions that are used along with the fiber/keyauth middleware +// Currently this requires an upstream patch - and feature patches are no longer accepted to v2 +// Therefore 
`dave-gray101/v2keyauth` contains the v2 backport of the middleware until v3 stabilizes and we migrate. + +func GetKeyAuthConfig(applicationConfig *config.ApplicationConfig) (*v2keyauth.Config, error) { + customLookup, err := v2keyauth.MultipleKeySourceLookup([]string{"header:Authorization", "header:x-api-key", "header:xi-api-key"}, keyauth.ConfigDefault.AuthScheme) + if err != nil { + return nil, err + } + + return &v2keyauth.Config{ + CustomKeyLookup: customLookup, + Next: getApiKeyRequiredFilterFunction(applicationConfig), + Validator: getApiKeyValidationFunction(applicationConfig), + ErrorHandler: getApiKeyErrorHandler(applicationConfig), + AuthScheme: "Bearer", + }, nil +} + +func getApiKeyErrorHandler(applicationConfig *config.ApplicationConfig) fiber.ErrorHandler { + return func(ctx *fiber.Ctx, err error) error { + if errors.Is(err, v2keyauth.ErrMissingOrMalformedAPIKey) { + if len(applicationConfig.ApiKeys) == 0 { + return ctx.Next() // if no keys are set up, any error we get here is not an error. + } + if applicationConfig.OpaqueErrors { + return ctx.SendStatus(403) + } + } + if applicationConfig.OpaqueErrors { + return ctx.SendStatus(500) + } + return err + } +} + +func getApiKeyValidationFunction(applicationConfig *config.ApplicationConfig) func(*fiber.Ctx, string) (bool, error) { + + if applicationConfig.UseSubtleKeyComparison { + return func(ctx *fiber.Ctx, apiKey string) (bool, error) { + if len(applicationConfig.ApiKeys) == 0 { + return true, nil // If no keys are setup, accept everything + } + for _, validKey := range applicationConfig.ApiKeys { + if subtle.ConstantTimeCompare([]byte(apiKey), []byte(validKey)) == 1 { + return true, nil + } + } + return false, v2keyauth.ErrMissingOrMalformedAPIKey + } + } + + return func(ctx *fiber.Ctx, apiKey string) (bool, error) { + if len(applicationConfig.ApiKeys) == 0 { + return true, nil // If no keys are setup, accept everything + } + for _, validKey := range applicationConfig.ApiKeys { + if apiKey == validKey { + return true, nil + } + } + return false, v2keyauth.ErrMissingOrMalformedAPIKey + } +} + +func getApiKeyRequiredFilterFunction(applicationConfig *config.ApplicationConfig) func(*fiber.Ctx) bool { + if applicationConfig.DisableApiKeyRequirementForHttpGet { + return func(c *fiber.Ctx) bool { + if c.Method() != "GET" { + return false + } + for _, rx := range applicationConfig.HttpGetExemptedEndpoints { + if rx.MatchString(c.Path()) { + return true + } + } + return false + } + } + return func(c *fiber.Ctx) bool { return false } +} \ No newline at end of file diff --git a/core/http/routes/elevenlabs.go b/core/http/routes/elevenlabs.go index b20dec75..73387c7b 100644 --- a/core/http/routes/elevenlabs.go +++ b/core/http/routes/elevenlabs.go @@ -10,12 +10,11 @@ import ( func RegisterElevenLabsRoutes(app *fiber.App, cl *config.BackendConfigLoader, ml *model.ModelLoader, - appConfig *config.ApplicationConfig, - auth func(*fiber.Ctx) error) { + appConfig *config.ApplicationConfig) { // Elevenlabs - app.Post("/v1/text-to-speech/:voice-id", auth, elevenlabs.TTSEndpoint(cl, ml, appConfig)) + app.Post("/v1/text-to-speech/:voice-id", elevenlabs.TTSEndpoint(cl, ml, appConfig)) - app.Post("/v1/sound-generation", auth, elevenlabs.SoundGenerationEndpoint(cl, ml, appConfig)) + app.Post("/v1/sound-generation", elevenlabs.SoundGenerationEndpoint(cl, ml, appConfig)) } diff --git a/core/http/routes/jina.go b/core/http/routes/jina.go index 92f29224..93125e6c 100644 --- a/core/http/routes/jina.go +++ b/core/http/routes/jina.go @@ -11,8 +11,7 @@ import 
( func RegisterJINARoutes(app *fiber.App, cl *config.BackendConfigLoader, ml *model.ModelLoader, - appConfig *config.ApplicationConfig, - auth func(*fiber.Ctx) error) { + appConfig *config.ApplicationConfig) { // POST endpoint to mimic the reranking app.Post("/v1/rerank", jina.JINARerankEndpoint(cl, ml, appConfig)) diff --git a/core/http/routes/localai.go b/core/http/routes/localai.go index f85fa807..29fef378 100644 --- a/core/http/routes/localai.go +++ b/core/http/routes/localai.go @@ -15,33 +15,32 @@ func RegisterLocalAIRoutes(app *fiber.App, cl *config.BackendConfigLoader, ml *model.ModelLoader, appConfig *config.ApplicationConfig, - galleryService *services.GalleryService, - auth func(*fiber.Ctx) error) { + galleryService *services.GalleryService) { app.Get("/swagger/*", swagger.HandlerDefault) // default // LocalAI API endpoints if !appConfig.DisableGalleryEndpoint { modelGalleryEndpointService := localai.CreateModelGalleryEndpointService(appConfig.Galleries, appConfig.ModelPath, galleryService) - app.Post("/models/apply", auth, modelGalleryEndpointService.ApplyModelGalleryEndpoint()) - app.Post("/models/delete/:name", auth, modelGalleryEndpointService.DeleteModelGalleryEndpoint()) + app.Post("/models/apply", modelGalleryEndpointService.ApplyModelGalleryEndpoint()) + app.Post("/models/delete/:name", modelGalleryEndpointService.DeleteModelGalleryEndpoint()) - app.Get("/models/available", auth, modelGalleryEndpointService.ListModelFromGalleryEndpoint()) - app.Get("/models/galleries", auth, modelGalleryEndpointService.ListModelGalleriesEndpoint()) - app.Post("/models/galleries", auth, modelGalleryEndpointService.AddModelGalleryEndpoint()) - app.Delete("/models/galleries", auth, modelGalleryEndpointService.RemoveModelGalleryEndpoint()) - app.Get("/models/jobs/:uuid", auth, modelGalleryEndpointService.GetOpStatusEndpoint()) - app.Get("/models/jobs", auth, modelGalleryEndpointService.GetAllStatusEndpoint()) + app.Get("/models/available", modelGalleryEndpointService.ListModelFromGalleryEndpoint()) + app.Get("/models/galleries", modelGalleryEndpointService.ListModelGalleriesEndpoint()) + app.Post("/models/galleries", modelGalleryEndpointService.AddModelGalleryEndpoint()) + app.Delete("/models/galleries", modelGalleryEndpointService.RemoveModelGalleryEndpoint()) + app.Get("/models/jobs/:uuid", modelGalleryEndpointService.GetOpStatusEndpoint()) + app.Get("/models/jobs", modelGalleryEndpointService.GetAllStatusEndpoint()) } - app.Post("/tts", auth, localai.TTSEndpoint(cl, ml, appConfig)) + app.Post("/tts", localai.TTSEndpoint(cl, ml, appConfig)) // Stores sl := model.NewModelLoader("") - app.Post("/stores/set", auth, localai.StoresSetEndpoint(sl, appConfig)) - app.Post("/stores/delete", auth, localai.StoresDeleteEndpoint(sl, appConfig)) - app.Post("/stores/get", auth, localai.StoresGetEndpoint(sl, appConfig)) - app.Post("/stores/find", auth, localai.StoresFindEndpoint(sl, appConfig)) + app.Post("/stores/set", localai.StoresSetEndpoint(sl, appConfig)) + app.Post("/stores/delete", localai.StoresDeleteEndpoint(sl, appConfig)) + app.Post("/stores/get", localai.StoresGetEndpoint(sl, appConfig)) + app.Post("/stores/find", localai.StoresFindEndpoint(sl, appConfig)) // Kubernetes health checks ok := func(c *fiber.Ctx) error { @@ -51,20 +50,20 @@ func RegisterLocalAIRoutes(app *fiber.App, app.Get("/healthz", ok) app.Get("/readyz", ok) - app.Get("/metrics", auth, localai.LocalAIMetricsEndpoint()) + app.Get("/metrics", localai.LocalAIMetricsEndpoint()) // Experimental Backend Statistics Module 
backendMonitorService := services.NewBackendMonitorService(ml, cl, appConfig) // Split out for now - app.Get("/backend/monitor", auth, localai.BackendMonitorEndpoint(backendMonitorService)) - app.Post("/backend/shutdown", auth, localai.BackendShutdownEndpoint(backendMonitorService)) + app.Get("/backend/monitor", localai.BackendMonitorEndpoint(backendMonitorService)) + app.Post("/backend/shutdown", localai.BackendShutdownEndpoint(backendMonitorService)) // p2p if p2p.IsP2PEnabled() { - app.Get("/api/p2p", auth, localai.ShowP2PNodes(appConfig)) - app.Get("/api/p2p/token", auth, localai.ShowP2PToken(appConfig)) + app.Get("/api/p2p", localai.ShowP2PNodes(appConfig)) + app.Get("/api/p2p/token", localai.ShowP2PToken(appConfig)) } - app.Get("/version", auth, func(c *fiber.Ctx) error { + app.Get("/version", func(c *fiber.Ctx) error { return c.JSON(struct { Version string `json:"version"` }{Version: internal.PrintableVersion()}) diff --git a/core/http/routes/openai.go b/core/http/routes/openai.go index e190bc6d..081daf70 100644 --- a/core/http/routes/openai.go +++ b/core/http/routes/openai.go @@ -11,66 +11,65 @@ import ( func RegisterOpenAIRoutes(app *fiber.App, cl *config.BackendConfigLoader, ml *model.ModelLoader, - appConfig *config.ApplicationConfig, - auth func(*fiber.Ctx) error) { + appConfig *config.ApplicationConfig) { // openAI compatible API endpoint // chat - app.Post("/v1/chat/completions", auth, openai.ChatEndpoint(cl, ml, appConfig)) - app.Post("/chat/completions", auth, openai.ChatEndpoint(cl, ml, appConfig)) + app.Post("/v1/chat/completions", openai.ChatEndpoint(cl, ml, appConfig)) + app.Post("/chat/completions", openai.ChatEndpoint(cl, ml, appConfig)) // edit - app.Post("/v1/edits", auth, openai.EditEndpoint(cl, ml, appConfig)) - app.Post("/edits", auth, openai.EditEndpoint(cl, ml, appConfig)) + app.Post("/v1/edits", openai.EditEndpoint(cl, ml, appConfig)) + app.Post("/edits", openai.EditEndpoint(cl, ml, appConfig)) // assistant - app.Get("/v1/assistants", auth, openai.ListAssistantsEndpoint(cl, ml, appConfig)) - app.Get("/assistants", auth, openai.ListAssistantsEndpoint(cl, ml, appConfig)) - app.Post("/v1/assistants", auth, openai.CreateAssistantEndpoint(cl, ml, appConfig)) - app.Post("/assistants", auth, openai.CreateAssistantEndpoint(cl, ml, appConfig)) - app.Delete("/v1/assistants/:assistant_id", auth, openai.DeleteAssistantEndpoint(cl, ml, appConfig)) - app.Delete("/assistants/:assistant_id", auth, openai.DeleteAssistantEndpoint(cl, ml, appConfig)) - app.Get("/v1/assistants/:assistant_id", auth, openai.GetAssistantEndpoint(cl, ml, appConfig)) - app.Get("/assistants/:assistant_id", auth, openai.GetAssistantEndpoint(cl, ml, appConfig)) - app.Post("/v1/assistants/:assistant_id", auth, openai.ModifyAssistantEndpoint(cl, ml, appConfig)) - app.Post("/assistants/:assistant_id", auth, openai.ModifyAssistantEndpoint(cl, ml, appConfig)) - app.Get("/v1/assistants/:assistant_id/files", auth, openai.ListAssistantFilesEndpoint(cl, ml, appConfig)) - app.Get("/assistants/:assistant_id/files", auth, openai.ListAssistantFilesEndpoint(cl, ml, appConfig)) - app.Post("/v1/assistants/:assistant_id/files", auth, openai.CreateAssistantFileEndpoint(cl, ml, appConfig)) - app.Post("/assistants/:assistant_id/files", auth, openai.CreateAssistantFileEndpoint(cl, ml, appConfig)) - app.Delete("/v1/assistants/:assistant_id/files/:file_id", auth, openai.DeleteAssistantFileEndpoint(cl, ml, appConfig)) - app.Delete("/assistants/:assistant_id/files/:file_id", auth, openai.DeleteAssistantFileEndpoint(cl, ml, 
appConfig)) - app.Get("/v1/assistants/:assistant_id/files/:file_id", auth, openai.GetAssistantFileEndpoint(cl, ml, appConfig)) - app.Get("/assistants/:assistant_id/files/:file_id", auth, openai.GetAssistantFileEndpoint(cl, ml, appConfig)) + app.Get("/v1/assistants", openai.ListAssistantsEndpoint(cl, ml, appConfig)) + app.Get("/assistants", openai.ListAssistantsEndpoint(cl, ml, appConfig)) + app.Post("/v1/assistants", openai.CreateAssistantEndpoint(cl, ml, appConfig)) + app.Post("/assistants", openai.CreateAssistantEndpoint(cl, ml, appConfig)) + app.Delete("/v1/assistants/:assistant_id", openai.DeleteAssistantEndpoint(cl, ml, appConfig)) + app.Delete("/assistants/:assistant_id", openai.DeleteAssistantEndpoint(cl, ml, appConfig)) + app.Get("/v1/assistants/:assistant_id", openai.GetAssistantEndpoint(cl, ml, appConfig)) + app.Get("/assistants/:assistant_id", openai.GetAssistantEndpoint(cl, ml, appConfig)) + app.Post("/v1/assistants/:assistant_id", openai.ModifyAssistantEndpoint(cl, ml, appConfig)) + app.Post("/assistants/:assistant_id", openai.ModifyAssistantEndpoint(cl, ml, appConfig)) + app.Get("/v1/assistants/:assistant_id/files", openai.ListAssistantFilesEndpoint(cl, ml, appConfig)) + app.Get("/assistants/:assistant_id/files", openai.ListAssistantFilesEndpoint(cl, ml, appConfig)) + app.Post("/v1/assistants/:assistant_id/files", openai.CreateAssistantFileEndpoint(cl, ml, appConfig)) + app.Post("/assistants/:assistant_id/files", openai.CreateAssistantFileEndpoint(cl, ml, appConfig)) + app.Delete("/v1/assistants/:assistant_id/files/:file_id", openai.DeleteAssistantFileEndpoint(cl, ml, appConfig)) + app.Delete("/assistants/:assistant_id/files/:file_id", openai.DeleteAssistantFileEndpoint(cl, ml, appConfig)) + app.Get("/v1/assistants/:assistant_id/files/:file_id", openai.GetAssistantFileEndpoint(cl, ml, appConfig)) + app.Get("/assistants/:assistant_id/files/:file_id", openai.GetAssistantFileEndpoint(cl, ml, appConfig)) // files - app.Post("/v1/files", auth, openai.UploadFilesEndpoint(cl, appConfig)) - app.Post("/files", auth, openai.UploadFilesEndpoint(cl, appConfig)) - app.Get("/v1/files", auth, openai.ListFilesEndpoint(cl, appConfig)) - app.Get("/files", auth, openai.ListFilesEndpoint(cl, appConfig)) - app.Get("/v1/files/:file_id", auth, openai.GetFilesEndpoint(cl, appConfig)) - app.Get("/files/:file_id", auth, openai.GetFilesEndpoint(cl, appConfig)) - app.Delete("/v1/files/:file_id", auth, openai.DeleteFilesEndpoint(cl, appConfig)) - app.Delete("/files/:file_id", auth, openai.DeleteFilesEndpoint(cl, appConfig)) - app.Get("/v1/files/:file_id/content", auth, openai.GetFilesContentsEndpoint(cl, appConfig)) - app.Get("/files/:file_id/content", auth, openai.GetFilesContentsEndpoint(cl, appConfig)) + app.Post("/v1/files", openai.UploadFilesEndpoint(cl, appConfig)) + app.Post("/files", openai.UploadFilesEndpoint(cl, appConfig)) + app.Get("/v1/files", openai.ListFilesEndpoint(cl, appConfig)) + app.Get("/files", openai.ListFilesEndpoint(cl, appConfig)) + app.Get("/v1/files/:file_id", openai.GetFilesEndpoint(cl, appConfig)) + app.Get("/files/:file_id", openai.GetFilesEndpoint(cl, appConfig)) + app.Delete("/v1/files/:file_id", openai.DeleteFilesEndpoint(cl, appConfig)) + app.Delete("/files/:file_id", openai.DeleteFilesEndpoint(cl, appConfig)) + app.Get("/v1/files/:file_id/content", openai.GetFilesContentsEndpoint(cl, appConfig)) + app.Get("/files/:file_id/content", openai.GetFilesContentsEndpoint(cl, appConfig)) // completion - app.Post("/v1/completions", auth, openai.CompletionEndpoint(cl, ml, 
appConfig)) - app.Post("/completions", auth, openai.CompletionEndpoint(cl, ml, appConfig)) - app.Post("/v1/engines/:model/completions", auth, openai.CompletionEndpoint(cl, ml, appConfig)) + app.Post("/v1/completions", openai.CompletionEndpoint(cl, ml, appConfig)) + app.Post("/completions", openai.CompletionEndpoint(cl, ml, appConfig)) + app.Post("/v1/engines/:model/completions", openai.CompletionEndpoint(cl, ml, appConfig)) // embeddings - app.Post("/v1/embeddings", auth, openai.EmbeddingsEndpoint(cl, ml, appConfig)) - app.Post("/embeddings", auth, openai.EmbeddingsEndpoint(cl, ml, appConfig)) - app.Post("/v1/engines/:model/embeddings", auth, openai.EmbeddingsEndpoint(cl, ml, appConfig)) + app.Post("/v1/embeddings", openai.EmbeddingsEndpoint(cl, ml, appConfig)) + app.Post("/embeddings", openai.EmbeddingsEndpoint(cl, ml, appConfig)) + app.Post("/v1/engines/:model/embeddings", openai.EmbeddingsEndpoint(cl, ml, appConfig)) // audio - app.Post("/v1/audio/transcriptions", auth, openai.TranscriptEndpoint(cl, ml, appConfig)) - app.Post("/v1/audio/speech", auth, localai.TTSEndpoint(cl, ml, appConfig)) + app.Post("/v1/audio/transcriptions", openai.TranscriptEndpoint(cl, ml, appConfig)) + app.Post("/v1/audio/speech", localai.TTSEndpoint(cl, ml, appConfig)) // images - app.Post("/v1/images/generations", auth, openai.ImageEndpoint(cl, ml, appConfig)) + app.Post("/v1/images/generations", openai.ImageEndpoint(cl, ml, appConfig)) if appConfig.ImageDir != "" { app.Static("/generated-images", appConfig.ImageDir) @@ -81,6 +80,6 @@ func RegisterOpenAIRoutes(app *fiber.App, } // List models - app.Get("/v1/models", auth, openai.ListModelsEndpoint(cl, ml)) - app.Get("/models", auth, openai.ListModelsEndpoint(cl, ml)) + app.Get("/v1/models", openai.ListModelsEndpoint(cl, ml)) + app.Get("/models", openai.ListModelsEndpoint(cl, ml)) } diff --git a/core/http/routes/ui.go b/core/http/routes/ui.go index 6dfb3f43..7b2c6ae7 100644 --- a/core/http/routes/ui.go +++ b/core/http/routes/ui.go @@ -59,8 +59,7 @@ func RegisterUIRoutes(app *fiber.App, cl *config.BackendConfigLoader, ml *model.ModelLoader, appConfig *config.ApplicationConfig, - galleryService *services.GalleryService, - auth func(*fiber.Ctx) error) { + galleryService *services.GalleryService) { // keeps the state of models that are being installed from the UI var processingModels = NewModelOpCache() @@ -85,10 +84,10 @@ func RegisterUIRoutes(app *fiber.App, return processingModelsData, taskTypes } - app.Get("/", auth, localai.WelcomeEndpoint(appConfig, cl, ml, modelStatus)) + app.Get("/", localai.WelcomeEndpoint(appConfig, cl, ml, modelStatus)) if p2p.IsP2PEnabled() { - app.Get("/p2p", auth, func(c *fiber.Ctx) error { + app.Get("/p2p", func(c *fiber.Ctx) error { summary := fiber.Map{ "Title": "LocalAI - P2P dashboard", "Version": internal.PrintableVersion(), @@ -104,17 +103,17 @@ func RegisterUIRoutes(app *fiber.App, }) /* show nodes live! 
*/ - app.Get("/p2p/ui/workers", auth, func(c *fiber.Ctx) error { + app.Get("/p2p/ui/workers", func(c *fiber.Ctx) error { return c.SendString(elements.P2PNodeBoxes(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.WorkerID)))) }) - app.Get("/p2p/ui/workers-federation", auth, func(c *fiber.Ctx) error { + app.Get("/p2p/ui/workers-federation", func(c *fiber.Ctx) error { return c.SendString(elements.P2PNodeBoxes(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.FederatedID)))) }) - app.Get("/p2p/ui/workers-stats", auth, func(c *fiber.Ctx) error { + app.Get("/p2p/ui/workers-stats", func(c *fiber.Ctx) error { return c.SendString(elements.P2PNodeStats(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.WorkerID)))) }) - app.Get("/p2p/ui/workers-federation-stats", auth, func(c *fiber.Ctx) error { + app.Get("/p2p/ui/workers-federation-stats", func(c *fiber.Ctx) error { return c.SendString(elements.P2PNodeStats(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.FederatedID)))) }) } @@ -122,7 +121,7 @@ func RegisterUIRoutes(app *fiber.App, if !appConfig.DisableGalleryEndpoint { // Show the Models page (all models) - app.Get("/browse", auth, func(c *fiber.Ctx) error { + app.Get("/browse", func(c *fiber.Ctx) error { term := c.Query("term") models, _ := gallery.AvailableGalleryModels(appConfig.Galleries, appConfig.ModelPath) @@ -167,7 +166,7 @@ func RegisterUIRoutes(app *fiber.App, // Show the models, filtered from the user input // https://htmx.org/examples/active-search/ - app.Post("/browse/search/models", auth, func(c *fiber.Ctx) error { + app.Post("/browse/search/models", func(c *fiber.Ctx) error { form := struct { Search string `form:"search"` }{} @@ -188,7 +187,7 @@ func RegisterUIRoutes(app *fiber.App, // This route is used when the "Install" button is pressed, we submit here a new job to the gallery service // https://htmx.org/examples/progress-bar/ - app.Post("/browse/install/model/:id", auth, func(c *fiber.Ctx) error { + app.Post("/browse/install/model/:id", func(c *fiber.Ctx) error { galleryID := strings.Clone(c.Params("id")) // note: strings.Clone is required for multiple requests! log.Debug().Msgf("UI job submitted to install : %+v\n", galleryID) @@ -215,7 +214,7 @@ func RegisterUIRoutes(app *fiber.App, // This route is used when the "Install" button is pressed, we submit here a new job to the gallery service // https://htmx.org/examples/progress-bar/ - app.Post("/browse/delete/model/:id", auth, func(c *fiber.Ctx) error { + app.Post("/browse/delete/model/:id", func(c *fiber.Ctx) error { galleryID := strings.Clone(c.Params("id")) // note: strings.Clone is required for multiple requests! log.Debug().Msgf("UI job submitted to delete : %+v\n", galleryID) var galleryName = galleryID @@ -255,7 +254,7 @@ func RegisterUIRoutes(app *fiber.App, // Display the job current progress status // If the job is done, we trigger the /browse/job/:uid route // https://htmx.org/examples/progress-bar/ - app.Get("/browse/job/progress/:uid", auth, func(c *fiber.Ctx) error { + app.Get("/browse/job/progress/:uid", func(c *fiber.Ctx) error { jobUID := strings.Clone(c.Params("uid")) // note: strings.Clone is required for multiple requests! 
status := galleryService.GetStatus(jobUID) @@ -279,7 +278,7 @@ func RegisterUIRoutes(app *fiber.App, // this route is hit when the job is done, and we display the // final state (for now just displays "Installation completed") - app.Get("/browse/job/:uid", auth, func(c *fiber.Ctx) error { + app.Get("/browse/job/:uid", func(c *fiber.Ctx) error { jobUID := strings.Clone(c.Params("uid")) // note: strings.Clone is required for multiple requests! status := galleryService.GetStatus(jobUID) @@ -303,7 +302,7 @@ func RegisterUIRoutes(app *fiber.App, } // Show the Chat page - app.Get("/chat/:model", auth, func(c *fiber.Ctx) error { + app.Get("/chat/:model", func(c *fiber.Ctx) error { backendConfigs, _ := services.ListModels(cl, ml, "", true) summary := fiber.Map{ @@ -318,7 +317,7 @@ func RegisterUIRoutes(app *fiber.App, return c.Render("views/chat", summary) }) - app.Get("/talk/", auth, func(c *fiber.Ctx) error { + app.Get("/talk/", func(c *fiber.Ctx) error { backendConfigs, _ := services.ListModels(cl, ml, "", true) if len(backendConfigs) == 0 { @@ -338,7 +337,7 @@ func RegisterUIRoutes(app *fiber.App, return c.Render("views/talk", summary) }) - app.Get("/chat/", auth, func(c *fiber.Ctx) error { + app.Get("/chat/", func(c *fiber.Ctx) error { backendConfigs, _ := services.ListModels(cl, ml, "", true) @@ -359,7 +358,7 @@ func RegisterUIRoutes(app *fiber.App, return c.Render("views/chat", summary) }) - app.Get("/text2image/:model", auth, func(c *fiber.Ctx) error { + app.Get("/text2image/:model", func(c *fiber.Ctx) error { backendConfigs := cl.GetAllBackendConfigs() summary := fiber.Map{ @@ -374,7 +373,7 @@ func RegisterUIRoutes(app *fiber.App, return c.Render("views/text2image", summary) }) - app.Get("/text2image/", auth, func(c *fiber.Ctx) error { + app.Get("/text2image/", func(c *fiber.Ctx) error { backendConfigs := cl.GetAllBackendConfigs() @@ -395,7 +394,7 @@ func RegisterUIRoutes(app *fiber.App, return c.Render("views/text2image", summary) }) - app.Get("/tts/:model", auth, func(c *fiber.Ctx) error { + app.Get("/tts/:model", func(c *fiber.Ctx) error { backendConfigs := cl.GetAllBackendConfigs() summary := fiber.Map{ @@ -410,7 +409,7 @@ func RegisterUIRoutes(app *fiber.App, return c.Render("views/tts", summary) }) - app.Get("/tts/", auth, func(c *fiber.Ctx) error { + app.Get("/tts/", func(c *fiber.Ctx) error { backendConfigs := cl.GetAllBackendConfigs() diff --git a/go.mod b/go.mod index 57202ad2..a3359abf 100644 --- a/go.mod +++ b/go.mod @@ -74,6 +74,7 @@ require ( cloud.google.com/go/auth/oauth2adapt v0.2.2 // indirect cloud.google.com/go/compute/metadata v0.3.0 // indirect github.com/cpuguy83/go-md2man/v2 v2.0.4 // indirect + github.com/dave-gray101/v2keyauth v0.0.0-20240624150259-c45d584d25e2 // indirect github.com/envoyproxy/protoc-gen-validate v1.0.4 // indirect github.com/felixge/httpsnoop v1.0.4 // indirect github.com/go-task/slim-sprig/v3 v3.0.0 // indirect diff --git a/go.sum b/go.sum index ab64b84a..1dd44a5b 100644 --- a/go.sum +++ b/go.sum @@ -110,6 +110,8 @@ github.com/creachadair/otp v0.4.2 h1:ngNMaD6Tzd7UUNRFyed7ykZFn/Wr5sSs5ffqZWm9pu8 github.com/creachadair/otp v0.4.2/go.mod h1:DqV9hJyUbcUme0pooYfiFvvMe72Aua5sfhNzwfZvk40= github.com/creack/pty v1.1.18 h1:n56/Zwd5o6whRC5PMGretI4IdRLlmBXYNjScPaBgsbY= github.com/creack/pty v1.1.18/go.mod h1:MOBLtS5ELjhRRrroQr9kyvTxUAFNvYEK993ew/Vr4O4= +github.com/dave-gray101/v2keyauth v0.0.0-20240624150259-c45d584d25e2 h1:flLYmnQFZNo04x2NPehMbf30m7Pli57xwZ0NFqR/hb0= +github.com/dave-gray101/v2keyauth v0.0.0-20240624150259-c45d584d25e2/go.mod 
h1:NtWqRzAp/1tw+twkW8uuBenEVVYndEAZACWU3F3xdoQ= github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= From e95cb8eaacdac6426c085197ec5acf790206c042 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 03:33:52 +0000 Subject: [PATCH 018/122] chore(deps): Bump setuptools from 69.5.1 to 75.1.0 in /backend/python/transformers (#3579) chore(deps): Bump setuptools in /backend/python/transformers Bumps [setuptools](https://github.com/pypa/setuptools) from 69.5.1 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v69.5.1...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/transformers/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/transformers/requirements.txt b/backend/python/transformers/requirements.txt index b19c59c0..1b7ebda5 100644 --- a/backend/python/transformers/requirements.txt +++ b/backend/python/transformers/requirements.txt @@ -1,4 +1,4 @@ grpcio==1.66.1 protobuf certifi -setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From f4b1bd8f6d70365e99320e52119cb7ed577b63c9 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 03:41:01 +0000 Subject: [PATCH 019/122] chore(deps): Bump setuptools from 70.3.0 to 75.1.0 in /backend/python/vllm (#3580) chore(deps): Bump setuptools in /backend/python/vllm Bumps [setuptools](https://github.com/pypa/setuptools) from 70.3.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v70.3.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 backend/python/vllm/requirements-intel.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/backend/python/vllm/requirements-intel.txt b/backend/python/vllm/requirements-intel.txt
index 7903282e..1f82c46e 100644
--- a/backend/python/vllm/requirements-intel.txt
+++ b/backend/python/vllm/requirements-intel.txt
@@ -4,4 +4,4 @@ accelerate
 torch
 transformers
 optimum[openvino]
-setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406
\ No newline at end of file
+setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406
\ No newline at end of file

From 0e4e101101e92cd6b2451cf71a2f85a880468183 Mon Sep 17 00:00:00 2001
From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com>
Date: Tue, 17 Sep 2024 05:52:15 +0200
Subject: [PATCH 020/122] chore: :arrow_up: Update ggerganov/llama.cpp to
 `23e0d70bacaaca1429d365a44aa9e7434f17823b` (#3581)

:arrow_up: Update ggerganov/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index e4d5b22c..f9fa5476 100644
--- a/Makefile
+++ b/Makefile
@@ -8,7 +8,7 @@ DETECT_LIBS?=true
 # llama.cpp versions
 GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp
 GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be
-CPPLLAMA_VERSION?=6262d13e0b2da91f230129a93a996609a2f5a2f2
+CPPLLAMA_VERSION?=23e0d70bacaaca1429d365a44aa9e7434f17823b

 # go-rwkv version
 RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp

From d0f2bf318103f631686c648d6bb6a299bca15976 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Tue, 17 Sep 2024 06:50:57 +0200
Subject: [PATCH 021/122] fix(shutdown): do not shutdown immediately busy
 backends (#3543)

* fix(shutdown): do not shutdown immediately busy backends

Signed-off-by: Ettore Di Giacinto

* chore(refactor): avoid duplicate functions

Signed-off-by: Ettore Di Giacinto

* fix: multiplicative backoff for shutdown (#3547)

* multiplicative backoff for shutdown

Rather than always retrying every two seconds, back off the shutdown
attempt rate.
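A self-contained sketch of the schedule this introduces, mirroring the
loader.go hunk below (the busy backend itself is stubbed out; only the wait
computation is reproduced):

    package main

    import (
    	"fmt"
    	"time"
    )

    // retryTimeout mirrors the clamp added in pkg/model/loader.go.
    const retryTimeout = 2 * time.Minute

    func main() {
    	// Reproduce the per-attempt wait: 2s, 4s, 6s, ... capped at two
    	// minutes, instead of a flat 2s poll while a backend reports busy.
    	for retries := 1; retries <= 75; retries += 10 {
    		dur := time.Duration(retries*2) * time.Second
    		if dur > retryTimeout {
    			dur = retryTimeout
    		}
    		fmt.Printf("retry %2d: wait %v\n", retries, dur)
    	}
    }

With this schedule a stubborn backend is polled less and less often, while the
two-minute clamp keeps the worst-case gap between attempts bounded.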
Signed-off-by: Dave * Update loader.go Signed-off-by: Dave * add clamp of 2 minutes Signed-off-by: Dave Lee --------- Signed-off-by: Dave Signed-off-by: Dave Lee --------- Signed-off-by: Ettore Di Giacinto Signed-off-by: Dave Signed-off-by: Dave Lee Co-authored-by: Dave --- pkg/model/loader.go | 24 +++++++++++++++++------- pkg/model/process.go | 17 +++++++++-------- 2 files changed, 26 insertions(+), 15 deletions(-) diff --git a/pkg/model/loader.go b/pkg/model/loader.go index 90fda35f..b9865f73 100644 --- a/pkg/model/loader.go +++ b/pkg/model/loader.go @@ -69,6 +69,8 @@ var knownModelsNameSuffixToSkip []string = []string{ ".tar.gz", } +const retryTimeout = time.Duration(2 * time.Minute) + func (ml *ModelLoader) ListFilesInModelPath() ([]string, error) { files, err := os.ReadDir(ml.ModelPath) if err != nil { @@ -146,15 +148,23 @@ func (ml *ModelLoader) ShutdownModel(modelName string) error { ml.mu.Lock() defer ml.mu.Unlock() - return ml.stopModel(modelName) -} - -func (ml *ModelLoader) stopModel(modelName string) error { - defer ml.deleteProcess(modelName) - if _, ok := ml.models[modelName]; !ok { + _, ok := ml.models[modelName] + if !ok { return fmt.Errorf("model %s not found", modelName) } - return nil + + retries := 1 + for ml.models[modelName].GRPC(false, ml.wd).IsBusy() { + log.Debug().Msgf("%s busy. Waiting.", modelName) + dur := time.Duration(retries*2) * time.Second + if dur > retryTimeout { + dur = retryTimeout + } + time.Sleep(dur) + retries++ + } + + return ml.deleteProcess(modelName) } func (ml *ModelLoader) CheckIsLoaded(s string) *Model { diff --git a/pkg/model/process.go b/pkg/model/process.go index 5b751de8..50afbb1c 100644 --- a/pkg/model/process.go +++ b/pkg/model/process.go @@ -18,15 +18,16 @@ import ( func (ml *ModelLoader) StopAllExcept(s string) error { return ml.StopGRPC(func(id string, p *process.Process) bool { - if id != s { - for ml.models[id].GRPC(false, ml.wd).IsBusy() { - log.Debug().Msgf("%s busy. Waiting.", id) - time.Sleep(2 * time.Second) - } - log.Debug().Msgf("[single-backend] Stopping %s", id) - return true + if id == s { + return false } - return false + + for ml.models[id].GRPC(false, ml.wd).IsBusy() { + log.Debug().Msgf("%s busy. Waiting.", id) + time.Sleep(2 * time.Second) + } + log.Debug().Msgf("[single-backend] Stopping %s", id) + return true }) } From 22247ad92c65818d6fb751a2f9998b565190db7f Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 05:50:31 +0000 Subject: [PATCH 022/122] chore(deps): Bump langchain from 0.2.16 to 0.3.0 in /examples/langchain-chroma (#3557) chore(deps): Bump langchain in /examples/langchain-chroma Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.2.16 to 0.3.0. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.2.16...langchain==0.3.0) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 3edb570c..4884d4aa 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ -langchain==0.2.16 +langchain==0.3.0 openai==1.45.1 chromadb==0.5.5 llama-index==0.11.7 \ No newline at end of file From 4a4e44bf5559f2eac49df2c1135f39ad6d70300f Mon Sep 17 00:00:00 2001 From: Alexander Izotov <93216976+Nyralei@users.noreply.github.com> Date: Tue, 17 Sep 2024 08:52:37 +0300 Subject: [PATCH 023/122] feat: allow setting trust_remote_code for sentencetransformers backend (#3552) Allow setting trust_remote_code for SentenceTransformers backend Signed-off-by: Nyralei <93216976+Nyralei@users.noreply.github.com> --- backend/python/sentencetransformers/backend.py | 2 +- backend/python/sentencetransformers/requirements.txt | 4 +++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/backend/python/sentencetransformers/backend.py b/backend/python/sentencetransformers/backend.py index 905015e1..2a20bf60 100755 --- a/backend/python/sentencetransformers/backend.py +++ b/backend/python/sentencetransformers/backend.py @@ -55,7 +55,7 @@ class BackendServicer(backend_pb2_grpc.BackendServicer): """ model_name = request.Model try: - self.model = SentenceTransformer(model_name) + self.model = SentenceTransformer(model_name, trust_remote_code=request.TrustRemoteCode) except Exception as err: return backend_pb2.Result(success=False, message=f"Unexpected {err=}, {type(err)=}") diff --git a/backend/python/sentencetransformers/requirements.txt b/backend/python/sentencetransformers/requirements.txt index 8e1b0195..b9cb6061 100644 --- a/backend/python/sentencetransformers/requirements.txt +++ b/backend/python/sentencetransformers/requirements.txt @@ -1,3 +1,5 @@ grpcio==1.66.1 protobuf -certifi \ No newline at end of file +certifi +datasets +einops \ No newline at end of file From 46fd4ff6db3aedaec3579872aa35d47973417b0a Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 06:19:52 +0000 Subject: [PATCH 024/122] chore(deps): Bump openai from 1.44.0 to 1.45.1 in /examples/functions (#3560) Bumps [openai](https://github.com/openai/openai-python) from 1.44.0 to 1.45.1. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.44.0...v1.45.1) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/functions/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/functions/requirements.txt b/examples/functions/requirements.txt index 9dd6818f..670090d3 100644 --- a/examples/functions/requirements.txt +++ b/examples/functions/requirements.txt @@ -1,2 +1,2 @@ langchain==0.3.0 -openai==1.44.0 +openai==1.45.1 From 075e5015c0ff0ca1010d5bba11a774c1564a8795 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 17 Sep 2024 09:06:07 +0200 Subject: [PATCH 025/122] Revert "chore(deps): Bump setuptools from 69.5.1 to 75.1.0 in /backend/python/transformers" (#3586) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Revert "chore(deps): Bump setuptools from 69.5.1 to 75.1.0 in /backend/python…" This reverts commit e95cb8eaacdac6426c085197ec5acf790206c042. --- backend/python/transformers/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/transformers/requirements.txt b/backend/python/transformers/requirements.txt index 1b7ebda5..b19c59c0 100644 --- a/backend/python/transformers/requirements.txt +++ b/backend/python/transformers/requirements.txt @@ -1,4 +1,4 @@ grpcio==1.66.1 protobuf certifi -setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From 92136a5d342993bdc8e0a26d5498b2e65ce9d26e Mon Sep 17 00:00:00 2001 From: Dave Date: Tue, 17 Sep 2024 03:23:58 -0400 Subject: [PATCH 026/122] fix: `gallery/index.yaml` comment spacing (#3585) extremely minor fix: add a space to index.yaml for the scanner Signed-off-by: Dave Lee --- gallery/index.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gallery/index.yaml b/gallery/index.yaml index 5e47d31c..229697bb 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1281,7 +1281,7 @@ - !!merge <<: *mistral03 name: "mn-12b-lyra-v4-iq-imatrix" icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/dVoru83WOpwVjMlgZ_xhA.png - #chatml + # chatml url: "github:mudler/LocalAI/gallery/chatml.yaml@master" urls: - https://huggingface.co/Lewdiculous/MN-12B-Lyra-v4-GGUF-IQ-Imatrix From 504962938127a04590e2e2383b2d5933ef3b48fd Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 10:24:01 +0200 Subject: [PATCH 027/122] chore(deps): Bump langchain from 0.2.16 to 0.3.0 in /examples/langchain/langchainpy-localai-example (#3577) chore(deps): Bump langchain Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.2.16 to 0.3.0. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.2.16...langchain==0.3.0) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 1bd6b841..213b4e2f 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -10,7 +10,7 @@ debugpy==1.8.2 frozenlist==1.4.1 greenlet==3.1.0 idna==3.8 -langchain==0.2.16 +langchain==0.3.0 langchain-community==0.2.16 marshmallow==3.22.0 marshmallow-enum==1.5.1 From 8826ca93b3b23d2d9333856b136bc606e92710ae Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 10:24:14 +0200 Subject: [PATCH 028/122] chore(deps): Bump openai from 1.44.0 to 1.45.1 in /examples/langchain/langchainpy-localai-example (#3573) chore(deps): Bump openai Bumps [openai](https://github.com/openai/openai-python) from 1.44.0 to 1.45.1. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.44.0...v1.45.1) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 213b4e2f..98325db3 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -18,7 +18,7 @@ multidict==6.0.5 mypy-extensions==1.0.0 numexpr==2.10.1 numpy==2.1.1 -openai==1.44.0 +openai==1.45.1 openapi-schema-pydantic==1.2.4 packaging>=23.2 pydantic==2.8.2 From eee1fb2c75171fc4a236bf224eda5c0df3d1fa3f Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 10:24:34 +0200 Subject: [PATCH 029/122] chore(deps): Bump pypinyin from 0.50.0 to 0.53.0 in /backend/python/openvoice (#3562) chore(deps): Bump pypinyin in /backend/python/openvoice Bumps [pypinyin](https://github.com/mozillazg/python-pinyin) from 0.50.0 to 0.53.0. - [Release notes](https://github.com/mozillazg/python-pinyin/releases) - [Changelog](https://github.com/mozillazg/python-pinyin/blob/master/CHANGELOG.rst) - [Commits](https://github.com/mozillazg/python-pinyin/compare/v0.50.0...v0.53.0) --- updated-dependencies: - dependency-name: pypinyin dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/openvoice/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/openvoice/requirements-intel.txt b/backend/python/openvoice/requirements-intel.txt index a9a4cc20..cea7de0b 100644 --- a/backend/python/openvoice/requirements-intel.txt +++ b/backend/python/openvoice/requirements-intel.txt @@ -15,7 +15,7 @@ unidecode==1.3.7 whisper-timestamped==1.15.4 openai python-dotenv -pypinyin==0.50.0 +pypinyin==0.53.0 cn2an==0.5.22 jieba==0.42.1 gradio==4.38.1 From a53392f91953bf53c77041a8cd25282cd65eb71a Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 17 Sep 2024 16:51:40 +0200 Subject: [PATCH 030/122] chore(refactor): drop duplicated shutdown logics (#3589) * chore(refactor): drop duplicated shutdown logics - Handle locking in Shutdown and CheckModelIsLoaded in a more go-idiomatic way - Drop duplicated code and re-organize shutdown code Signed-off-by: Ettore Di Giacinto * fix: drop leftover Signed-off-by: Ettore Di Giacinto * chore: improve logging and add missing locks Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: Ettore Di Giacinto --- core/http/routes/localai.go | 2 +- pkg/model/filters.go | 17 +++++++++++++++++ pkg/model/initializers.go | 16 ++++++---------- pkg/model/loader.go | 7 ++++--- pkg/model/process.go | 28 ++++------------------------ 5 files changed, 32 insertions(+), 38 deletions(-) create mode 100644 pkg/model/filters.go diff --git a/core/http/routes/localai.go b/core/http/routes/localai.go index 29fef378..247596c0 100644 --- a/core/http/routes/localai.go +++ b/core/http/routes/localai.go @@ -69,6 +69,6 @@ func RegisterLocalAIRoutes(app *fiber.App, }{Version: internal.PrintableVersion()}) }) - app.Get("/system", auth, localai.SystemInformations(ml, appConfig)) + app.Get("/system", localai.SystemInformations(ml, appConfig)) } diff --git a/pkg/model/filters.go b/pkg/model/filters.go new file mode 100644 index 00000000..79b72d5b --- /dev/null +++ b/pkg/model/filters.go @@ -0,0 +1,17 @@ +package model + +import ( + process "github.com/mudler/go-processmanager" +) + +type GRPCProcessFilter = func(id string, p *process.Process) bool + +func all(_ string, _ *process.Process) bool { + return true +} + +func allExcept(s string) GRPCProcessFilter { + return func(id string, p *process.Process) bool { + return id != s + } +} diff --git a/pkg/model/initializers.go b/pkg/model/initializers.go index 3d2255cc..7099bf33 100644 --- a/pkg/model/initializers.go +++ b/pkg/model/initializers.go @@ -320,7 +320,7 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string } else { grpcProcess := backendPath(o.assetDir, backend) if err := utils.VerifyPath(grpcProcess, o.assetDir); err != nil { - return nil, fmt.Errorf("grpc process not found in assetdir: %s", err.Error()) + return nil, fmt.Errorf("refering to a backend not in asset dir: %s", err.Error()) } if autoDetect { @@ -332,7 +332,7 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string // Check if the file exists if _, err := os.Stat(grpcProcess); os.IsNotExist(err) { - return nil, fmt.Errorf("grpc process not found: %s. 
some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS", grpcProcess) + return nil, fmt.Errorf("backend not found: %s", grpcProcess) } serverAddress, err := getFreeAddress() @@ -355,6 +355,8 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string client = NewModel(serverAddress) } + log.Debug().Msgf("Wait for the service to start up") + // Wait for the service to start up ready := false for i := 0; i < o.grpcAttempts; i++ { @@ -413,10 +415,8 @@ func (ml *ModelLoader) BackendLoader(opts ...Option) (client grpc.Backend, err e } if o.singleActiveBackend { - ml.mu.Lock() log.Debug().Msgf("Stopping all backends except '%s'", o.model) - err := ml.StopAllExcept(o.model) - ml.mu.Unlock() + err := ml.StopGRPC(allExcept(o.model)) if err != nil { log.Error().Err(err).Str("keptModel", o.model).Msg("error while shutting down all backends except for the keptModel") return nil, err @@ -444,13 +444,10 @@ func (ml *ModelLoader) BackendLoader(opts ...Option) (client grpc.Backend, err e func (ml *ModelLoader) GreedyLoader(opts ...Option) (grpc.Backend, error) { o := NewOptions(opts...) - ml.mu.Lock() - // Return earlier if we have a model already loaded // (avoid looping through all the backends) if m := ml.CheckIsLoaded(o.model); m != nil { log.Debug().Msgf("Model '%s' already loaded", o.model) - ml.mu.Unlock() return m.GRPC(o.parallelRequests, ml.wd), nil } @@ -458,12 +455,11 @@ func (ml *ModelLoader) GreedyLoader(opts ...Option) (grpc.Backend, error) { // If we can have only one backend active, kill all the others (except external backends) if o.singleActiveBackend { log.Debug().Msgf("Stopping all backends except '%s'", o.model) - err := ml.StopAllExcept(o.model) + err := ml.StopGRPC(allExcept(o.model)) if err != nil { log.Error().Err(err).Str("keptModel", o.model).Msg("error while shutting down all backends except for the keptModel - greedyloader continuing") } } - ml.mu.Unlock() var err error diff --git a/pkg/model/loader.go b/pkg/model/loader.go index b9865f73..f70d2cea 100644 --- a/pkg/model/loader.go +++ b/pkg/model/loader.go @@ -118,9 +118,6 @@ func (ml *ModelLoader) ListModels() []*Model { } func (ml *ModelLoader) LoadModel(modelName string, loader func(string, string) (*Model, error)) (*Model, error) { - ml.mu.Lock() - defer ml.mu.Unlock() - // Check if we already have a loaded model if model := ml.CheckIsLoaded(modelName); model != nil { return model, nil @@ -139,6 +136,8 @@ func (ml *ModelLoader) LoadModel(modelName string, loader func(string, string) ( return nil, fmt.Errorf("loader didn't return a model") } + ml.mu.Lock() + defer ml.mu.Unlock() ml.models[modelName] = model return model, nil @@ -168,6 +167,8 @@ func (ml *ModelLoader) ShutdownModel(modelName string) error { } func (ml *ModelLoader) CheckIsLoaded(s string) *Model { + ml.mu.Lock() + defer ml.mu.Unlock() m, ok := ml.models[s] if !ok { return nil diff --git a/pkg/model/process.go b/pkg/model/process.go index 50afbb1c..bcd1fccb 100644 --- a/pkg/model/process.go +++ b/pkg/model/process.go @@ -9,28 +9,12 @@ import ( "strconv" "strings" "syscall" - "time" "github.com/hpcloud/tail" process "github.com/mudler/go-processmanager" "github.com/rs/zerolog/log" ) -func (ml *ModelLoader) StopAllExcept(s string) error { - return ml.StopGRPC(func(id string, p *process.Process) bool { - if id == s { - return false - } - - for ml.models[id].GRPC(false, ml.wd).IsBusy() { - log.Debug().Msgf("%s busy. 
Waiting.", id) - time.Sleep(2 * time.Second) - } - log.Debug().Msgf("[single-backend] Stopping %s", id) - return true - }) -} - func (ml *ModelLoader) deleteProcess(s string) error { if _, exists := ml.grpcProcesses[s]; exists { if err := ml.grpcProcesses[s].Stop(); err != nil { @@ -42,17 +26,11 @@ func (ml *ModelLoader) deleteProcess(s string) error { return nil } -type GRPCProcessFilter = func(id string, p *process.Process) bool - -func includeAllProcesses(_ string, _ *process.Process) bool { - return true -} - func (ml *ModelLoader) StopGRPC(filter GRPCProcessFilter) error { var err error = nil for k, p := range ml.grpcProcesses { if filter(k, p) { - e := ml.deleteProcess(k) + e := ml.ShutdownModel(k) err = errors.Join(err, e) } } @@ -60,10 +38,12 @@ func (ml *ModelLoader) StopGRPC(filter GRPCProcessFilter) error { } func (ml *ModelLoader) StopAllGRPC() error { - return ml.StopGRPC(includeAllProcesses) + return ml.StopGRPC(all) } func (ml *ModelLoader) GetGRPCPID(id string) (int, error) { + ml.mu.Lock() + defer ml.mu.Unlock() p, exists := ml.grpcProcesses[id] if !exists { return -1, fmt.Errorf("no grpc backend found for %s", id) From acf119828f940083451f8faa3095a5d3804ebd78 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 17 Sep 2024 17:22:56 +0200 Subject: [PATCH 031/122] Revert "chore(deps): Bump securego/gosec from 2.21.0 to 2.21.2" (#3590) Revert "chore(deps): Bump securego/gosec from 2.21.0 to 2.21.2 (#3561)" This reverts commit 12a8d0e46fbd03f8d550dc41ea6325d07d66cd00. --- .github/workflows/secscan.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/secscan.yaml b/.github/workflows/secscan.yaml index 08d7dfc6..db9db586 100644 --- a/.github/workflows/secscan.yaml +++ b/.github/workflows/secscan.yaml @@ -18,7 +18,7 @@ jobs: if: ${{ github.actor != 'dependabot[bot]' }} - name: Run Gosec Security Scanner if: ${{ github.actor != 'dependabot[bot]' }} - uses: securego/gosec@v2.21.2 + uses: securego/gosec@v2.21.0 with: # we let the report trigger content trigger a failure using the GitHub Security features. args: '-no-fail -fmt sarif -out results.sarif ./...' 
From dc98b2ea4474c62fbf834b421663239d6b93f534 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Tue, 17 Sep 2024 23:51:41 +0200 Subject: [PATCH 032/122] chore: :arrow_up: Update ggerganov/llama.cpp to `8b836ae731bbb2c5640bc47df5b0a78ffcb129cb` (#3591) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index f9fa5476..4493404e 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=23e0d70bacaaca1429d365a44aa9e7434f17823b +CPPLLAMA_VERSION?=8b836ae731bbb2c5640bc47df5b0a78ffcb129cb # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From e5bd74878e79b2dd819c58d9811f9573bb3c9594 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Wed, 18 Sep 2024 00:02:02 +0200 Subject: [PATCH 033/122] chore: :arrow_up: Update ggerganov/whisper.cpp to `5b1ce40fa882e9cb8630b48032067a1ed2f1534f` (#3592) :arrow_up: Update ggerganov/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 4493404e..54ae7b73 100644 --- a/Makefile +++ b/Makefile @@ -16,7 +16,7 @@ RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 # whisper.cpp version WHISPER_REPO?=https://github.com/ggerganov/whisper.cpp -WHISPER_CPP_VERSION?=049b3a0e53c8a8e4c4576c06a1a4fccf0063a73f +WHISPER_CPP_VERSION?=5b1ce40fa882e9cb8630b48032067a1ed2f1534f # bert.cpp version BERT_REPO?=https://github.com/go-skynet/go-bert.cpp From a50cde69a258405ad765d3f6adf6a03aaaa6776a Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 18 Sep 2024 15:55:46 +0200 Subject: [PATCH 034/122] chore(aio): rename gpt-4-vision-preview to gpt-4o (#3597) Fixes: 3596 Signed-off-by: Ettore Di Giacinto --- aio/cpu/vision.yaml | 2 +- aio/gpu-8g/vision.yaml | 2 +- aio/intel/vision.yaml | 2 +- tests/e2e-aio/e2e_test.go | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/aio/cpu/vision.yaml b/aio/cpu/vision.yaml index 3b466d37..4052fa39 100644 --- a/aio/cpu/vision.yaml +++ b/aio/cpu/vision.yaml @@ -2,7 +2,7 @@ backend: llama-cpp context_size: 4096 f16: true mmap: true -name: gpt-4-vision-preview +name: gpt-4o roles: user: "USER:" diff --git a/aio/gpu-8g/vision.yaml b/aio/gpu-8g/vision.yaml index db039279..4f5e10b3 100644 --- a/aio/gpu-8g/vision.yaml +++ b/aio/gpu-8g/vision.yaml @@ -2,7 +2,7 @@ backend: llama-cpp context_size: 4096 f16: true mmap: true -name: gpt-4-vision-preview +name: gpt-4o roles: user: "USER:" diff --git a/aio/intel/vision.yaml b/aio/intel/vision.yaml index 52843162..37067362 100644 --- a/aio/intel/vision.yaml +++ b/aio/intel/vision.yaml @@ -2,7 +2,7 @@ backend: llama-cpp context_size: 4096 mmap: false f16: false -name: gpt-4-vision-preview +name: gpt-4o roles: user: "USER:" diff --git a/tests/e2e-aio/e2e_test.go b/tests/e2e-aio/e2e_test.go index f3f7b106..36d127d2 100644 --- a/tests/e2e-aio/e2e_test.go +++ b/tests/e2e-aio/e2e_test.go @@ -171,7 +171,7 @@ var _ = Describe("E2E test", func() { }) 
Context("vision", func() { It("correctly", func() { - model := "gpt-4-vision-preview" + model := "gpt-4o" resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{ Model: model, Messages: []openai.ChatCompletionMessage{ From c6a819e92fc7e687f6fe9c8a29f5b56b62820163 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Wed, 18 Sep 2024 23:41:59 +0200 Subject: [PATCH 035/122] chore: :arrow_up: Update ggerganov/llama.cpp to `64c6af3195c3cd4aa3328a1282d29cd2635c34c9` (#3598) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 54ae7b73..286f4b5a 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=8b836ae731bbb2c5640bc47df5b0a78ffcb129cb +CPPLLAMA_VERSION?=64c6af3195c3cd4aa3328a1282d29cd2635c34c9 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From fbb9facda40eb9442ef0819b5a2de13500019229 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 19 Sep 2024 11:21:59 +0200 Subject: [PATCH 036/122] feat(api): allow to pass videos to backends (#3601) This prepares the API to receive videos as well for video understanding. It works similarly to images, where the request should be in the form: { "type": "video_url", "video_url": { "url": "url or base64 data" } } Signed-off-by: Ettore Di Giacinto --- backend/backend.proto | 1 + core/backend/llm.go | 3 +- core/http/endpoints/openai/chat.go | 6 +++- core/http/endpoints/openai/inference.go | 6 +++- core/http/endpoints/openai/request.go | 38 +++++++++++++++++-------- core/schema/openai.go | 2 ++ pkg/utils/base64.go | 10 ++----- pkg/utils/base64_test.go | 8 +++--- 8 files changed, 47 insertions(+), 27 deletions(-) diff --git a/backend/backend.proto b/backend/backend.proto index 4a8f31a9..6ef83567 100644 --- a/backend/backend.proto +++ b/backend/backend.proto @@ -134,6 +134,7 @@ message PredictOptions { repeated string Images = 42; bool UseTokenizerTemplate = 43; repeated Message Messages = 44; + repeated string Videos = 45; } // The response message containing the result diff --git a/core/backend/llm.go b/core/backend/llm.go index 2b4564a8..fa4c0709 100644 --- a/core/backend/llm.go +++ b/core/backend/llm.go @@ -31,7 +31,7 @@ type TokenUsage struct { Completion int } -func ModelInference(ctx context.Context, s string, messages []schema.Message, images []string, loader *model.ModelLoader, c config.BackendConfig, o *config.ApplicationConfig, tokenCallback func(string, TokenUsage) bool) (func() (LLMResponse, error), error) { +func ModelInference(ctx context.Context, s string, messages []schema.Message, images, videos []string, loader *model.ModelLoader, c config.BackendConfig, o *config.ApplicationConfig, tokenCallback func(string, TokenUsage) bool) (func() (LLMResponse, error), error) { modelFile := c.Model threads := c.Threads if *threads == 0 && o.Threads != 0 { @@ -101,6 +101,7 @@ func ModelInference(ctx context.Context, s string, messages []schema.Message, im opts.Messages = protoMessages opts.UseTokenizerTemplate = c.TemplateConfig.UseTokenizerTemplate opts.Images = images + opts.Videos = videos tokenUsage := TokenUsage{} diff --git 
a/core/http/endpoints/openai/chat.go b/core/http/endpoints/openai/chat.go index 8144bdcd..742a4add 100644 --- a/core/http/endpoints/openai/chat.go +++ b/core/http/endpoints/openai/chat.go @@ -640,8 +640,12 @@ func handleQuestion(config *config.BackendConfig, input *schema.OpenAIRequest, m for _, m := range input.Messages { images = append(images, m.StringImages...) } + videos := []string{} + for _, m := range input.Messages { + videos = append(videos, m.StringVideos...) + } - predFunc, err := backend.ModelInference(input.Context, prompt, input.Messages, images, ml, *config, o, nil) + predFunc, err := backend.ModelInference(input.Context, prompt, input.Messages, images, videos, ml, *config, o, nil) if err != nil { log.Error().Err(err).Msg("model inference failed") return "", err diff --git a/core/http/endpoints/openai/inference.go b/core/http/endpoints/openai/inference.go index 4950ce20..4008ba3d 100644 --- a/core/http/endpoints/openai/inference.go +++ b/core/http/endpoints/openai/inference.go @@ -27,9 +27,13 @@ func ComputeChoices( for _, m := range req.Messages { images = append(images, m.StringImages...) } + videos := []string{} + for _, m := range req.Messages { + videos = append(videos, m.StringVideos...) + } // get the model function to call for the result - predFunc, err := backend.ModelInference(req.Context, predInput, req.Messages, images, loader, *config, o, tokenCallback) + predFunc, err := backend.ModelInference(req.Context, predInput, req.Messages, images, videos, loader, *config, o, tokenCallback) if err != nil { return result, backend.TokenUsage{}, err } diff --git a/core/http/endpoints/openai/request.go b/core/http/endpoints/openai/request.go index a99ebea2..456a1e0c 100644 --- a/core/http/endpoints/openai/request.go +++ b/core/http/endpoints/openai/request.go @@ -135,7 +135,7 @@ func updateRequestConfig(config *config.BackendConfig, input *schema.OpenAIReque } // Decode each request's message content - index := 0 + imgIndex, vidIndex := 0, 0 for i, m := range input.Messages { switch content := m.Content.(type) { case string: @@ -144,20 +144,34 @@ func updateRequestConfig(config *config.BackendConfig, input *schema.OpenAIReque dat, _ := json.Marshal(content) c := []schema.Content{} json.Unmarshal(dat, &c) + CONTENT: for _, pp := range c { - if pp.Type == "text" { + switch pp.Type { + case "text": input.Messages[i].StringContent = pp.Text - } else if pp.Type == "image_url" { - // Detect if pp.ImageURL is an URL, if it is download the image and encode it in base64: - base64, err := utils.GetImageURLAsBase64(pp.ImageURL.URL) - if err == nil { - input.Messages[i].StringImages = append(input.Messages[i].StringImages, base64) // TODO: make sure that we only return base64 stuff - // set a placeholder for each image - input.Messages[i].StringContent = fmt.Sprintf("[img-%d]", index) + input.Messages[i].StringContent - index++ - } else { - log.Error().Msgf("Failed encoding image: %s", err) + case "video", "video_url": + // Decode content as base64 either if it's an URL or base64 text + base64, err := utils.GetContentURIAsBase64(pp.VideoURL.URL) + if err != nil { + log.Error().Msgf("Failed encoding video: %s", err) + continue CONTENT } + input.Messages[i].StringVideos = append(input.Messages[i].StringVideos, base64) // TODO: make sure that we only return base64 stuff + // set a placeholder for each image + input.Messages[i].StringContent = fmt.Sprintf("[vid-%d]", vidIndex) + input.Messages[i].StringContent + vidIndex++ + case "image_url", "image": + // Decode content as base64 either 
if it's an URL or base64 text + + base64, err := utils.GetContentURIAsBase64(pp.ImageURL.URL) + if err != nil { + log.Error().Msgf("Failed encoding image: %s", err) + continue CONTENT + } + input.Messages[i].StringImages = append(input.Messages[i].StringImages, base64) // TODO: make sure that we only return base64 stuff + // set a placeholder for each image + input.Messages[i].StringContent = fmt.Sprintf("[img-%d]", imgIndex) + input.Messages[i].StringContent + imgIndex++ } } } diff --git a/core/schema/openai.go b/core/schema/openai.go index fe4745bf..32ed716b 100644 --- a/core/schema/openai.go +++ b/core/schema/openai.go @@ -58,6 +58,7 @@ type Content struct { Type string `json:"type" yaml:"type"` Text string `json:"text" yaml:"text"` ImageURL ContentURL `json:"image_url" yaml:"image_url"` + VideoURL ContentURL `json:"video_url" yaml:"video_url"` } type ContentURL struct { @@ -76,6 +77,7 @@ type Message struct { StringContent string `json:"string_content,omitempty" yaml:"string_content,omitempty"` StringImages []string `json:"string_images,omitempty" yaml:"string_images,omitempty"` + StringVideos []string `json:"string_videos,omitempty" yaml:"string_videos,omitempty"` // A result of a function call FunctionCall interface{} `json:"function_call,omitempty" yaml:"function_call,omitempty"` diff --git a/pkg/utils/base64.go b/pkg/utils/base64.go index 3fbb405b..50109eaa 100644 --- a/pkg/utils/base64.go +++ b/pkg/utils/base64.go @@ -13,14 +13,8 @@ var base64DownloadClient http.Client = http.Client{ Timeout: 30 * time.Second, } -// this function check if the string is an URL, if it's an URL downloads the image in memory -// encodes it in base64 and returns the base64 string - -// This may look weird down in pkg/utils while it is currently only used in core/config -// -// but I believe it may be useful for MQTT as well in the near future, so I'm -// extracting it while I'm thinking of it. -func GetImageURLAsBase64(s string) (string, error) { +// GetContentURIAsBase64 checks if the string is an URL, if it's an URL downloads the content in memory encodes it in base64 and returns the base64 string, otherwise returns the string by stripping base64 data headers +func GetContentURIAsBase64(s string) (string, error) { if strings.HasPrefix(s, "http") { // download the image resp, err := base64DownloadClient.Get(s) diff --git a/pkg/utils/base64_test.go b/pkg/utils/base64_test.go index 3b3dc9fb..1f0d1352 100644 --- a/pkg/utils/base64_test.go +++ b/pkg/utils/base64_test.go @@ -10,20 +10,20 @@ var _ = Describe("utils/base64 tests", func() { It("GetImageURLAsBase64 can strip jpeg data url prefixes", func() { // This one doesn't actually _care_ that it's base64, so feed "bad" data in this test in order to catch a change in that behavior for informational purposes. input := "data:image/jpeg;base64,FOO" - b64, err := GetImageURLAsBase64(input) + b64, err := GetContentURIAsBase64(input) Expect(err).To(BeNil()) Expect(b64).To(Equal("FOO")) }) It("GetImageURLAsBase64 can strip png data url prefixes", func() { // This one doesn't actually _care_ that it's base64, so feed "bad" data in this test in order to catch a change in that behavior for informational purposes. 
input := "data:image/png;base64,BAR" - b64, err := GetImageURLAsBase64(input) + b64, err := GetContentURIAsBase64(input) Expect(err).To(BeNil()) Expect(b64).To(Equal("BAR")) }) It("GetImageURLAsBase64 returns an error for bogus data", func() { input := "FOO" - b64, err := GetImageURLAsBase64(input) + b64, err := GetContentURIAsBase64(input) Expect(b64).To(Equal("")) Expect(err).ToNot(BeNil()) Expect(err).To(MatchError("not valid string")) @@ -31,7 +31,7 @@ var _ = Describe("utils/base64 tests", func() { It("GetImageURLAsBase64 can actually download images and calculates something", func() { // This test doesn't actually _check_ the results at this time, which is bad, but there wasn't a test at all before... input := "https://upload.wikimedia.org/wikipedia/en/2/29/Wargames.jpg" - b64, err := GetImageURLAsBase64(input) + b64, err := GetContentURIAsBase64(input) Expect(err).To(BeNil()) Expect(b64).ToNot(BeNil()) }) From 191bc2e50a721bd3164ad4700bcbb5d723ed7d03 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 19 Sep 2024 12:26:53 +0200 Subject: [PATCH 037/122] feat(api): allow to pass audios to backends (#3603) Signed-off-by: Ettore Di Giacinto --- backend/backend.proto | 1 + core/backend/llm.go | 3 ++- core/http/endpoints/openai/chat.go | 6 +++++- core/http/endpoints/openai/inference.go | 6 +++++- core/http/endpoints/openai/request.go | 14 ++++++++++++-- core/schema/openai.go | 2 ++ 6 files changed, 27 insertions(+), 5 deletions(-) diff --git a/backend/backend.proto b/backend/backend.proto index 6ef83567..31bd63e5 100644 --- a/backend/backend.proto +++ b/backend/backend.proto @@ -135,6 +135,7 @@ message PredictOptions { bool UseTokenizerTemplate = 43; repeated Message Messages = 44; repeated string Videos = 45; + repeated string Audios = 46; } // The response message containing the result diff --git a/core/backend/llm.go b/core/backend/llm.go index fa4c0709..f74071ba 100644 --- a/core/backend/llm.go +++ b/core/backend/llm.go @@ -31,7 +31,7 @@ type TokenUsage struct { Completion int } -func ModelInference(ctx context.Context, s string, messages []schema.Message, images, videos []string, loader *model.ModelLoader, c config.BackendConfig, o *config.ApplicationConfig, tokenCallback func(string, TokenUsage) bool) (func() (LLMResponse, error), error) { +func ModelInference(ctx context.Context, s string, messages []schema.Message, images, videos, audios []string, loader *model.ModelLoader, c config.BackendConfig, o *config.ApplicationConfig, tokenCallback func(string, TokenUsage) bool) (func() (LLMResponse, error), error) { modelFile := c.Model threads := c.Threads if *threads == 0 && o.Threads != 0 { @@ -102,6 +102,7 @@ func ModelInference(ctx context.Context, s string, messages []schema.Message, im opts.UseTokenizerTemplate = c.TemplateConfig.UseTokenizerTemplate opts.Images = images opts.Videos = videos + opts.Audios = audios tokenUsage := TokenUsage{} diff --git a/core/http/endpoints/openai/chat.go b/core/http/endpoints/openai/chat.go index 742a4add..b937120a 100644 --- a/core/http/endpoints/openai/chat.go +++ b/core/http/endpoints/openai/chat.go @@ -644,8 +644,12 @@ func handleQuestion(config *config.BackendConfig, input *schema.OpenAIRequest, m for _, m := range input.Messages { videos = append(videos, m.StringVideos...) } + audios := []string{} + for _, m := range input.Messages { + audios = append(audios, m.StringAudios...) 
+ } - predFunc, err := backend.ModelInference(input.Context, prompt, input.Messages, images, videos, ml, *config, o, nil) + predFunc, err := backend.ModelInference(input.Context, prompt, input.Messages, images, videos, audios, ml, *config, o, nil) if err != nil { log.Error().Err(err).Msg("model inference failed") return "", err diff --git a/core/http/endpoints/openai/inference.go b/core/http/endpoints/openai/inference.go index 4008ba3d..da75d3a1 100644 --- a/core/http/endpoints/openai/inference.go +++ b/core/http/endpoints/openai/inference.go @@ -31,9 +31,13 @@ func ComputeChoices( for _, m := range req.Messages { videos = append(videos, m.StringVideos...) } + audios := []string{} + for _, m := range req.Messages { + audios = append(audios, m.StringAudios...) + } // get the model function to call for the result - predFunc, err := backend.ModelInference(req.Context, predInput, req.Messages, images, videos, loader, *config, o, tokenCallback) + predFunc, err := backend.ModelInference(req.Context, predInput, req.Messages, images, videos, audios, loader, *config, o, tokenCallback) if err != nil { return result, backend.TokenUsage{}, err } diff --git a/core/http/endpoints/openai/request.go b/core/http/endpoints/openai/request.go index 456a1e0c..e24dd28f 100644 --- a/core/http/endpoints/openai/request.go +++ b/core/http/endpoints/openai/request.go @@ -135,7 +135,7 @@ func updateRequestConfig(config *config.BackendConfig, input *schema.OpenAIReque } // Decode each request's message content - imgIndex, vidIndex := 0, 0 + imgIndex, vidIndex, audioIndex := 0, 0, 0 for i, m := range input.Messages { switch content := m.Content.(type) { case string: @@ -160,9 +160,19 @@ func updateRequestConfig(config *config.BackendConfig, input *schema.OpenAIReque // set a placeholder for each image input.Messages[i].StringContent = fmt.Sprintf("[vid-%d]", vidIndex) + input.Messages[i].StringContent vidIndex++ + case "audio_url", "audio": + // Decode content as base64 either if it's an URL or base64 text + base64, err := utils.GetContentURIAsBase64(pp.AudioURL.URL) + if err != nil { + log.Error().Msgf("Failed encoding image: %s", err) + continue CONTENT + } + input.Messages[i].StringAudios = append(input.Messages[i].StringAudios, base64) // TODO: make sure that we only return base64 stuff + // set a placeholder for each image + input.Messages[i].StringContent = fmt.Sprintf("[audio-%d]", audioIndex) + input.Messages[i].StringContent + audioIndex++ case "image_url", "image": // Decode content as base64 either if it's an URL or base64 text - base64, err := utils.GetContentURIAsBase64(pp.ImageURL.URL) if err != nil { log.Error().Msgf("Failed encoding image: %s", err) diff --git a/core/schema/openai.go b/core/schema/openai.go index 32ed716b..15bcd13d 100644 --- a/core/schema/openai.go +++ b/core/schema/openai.go @@ -58,6 +58,7 @@ type Content struct { Type string `json:"type" yaml:"type"` Text string `json:"text" yaml:"text"` ImageURL ContentURL `json:"image_url" yaml:"image_url"` + AudioURL ContentURL `json:"audio_url" yaml:"audio_url"` VideoURL ContentURL `json:"video_url" yaml:"video_url"` } @@ -78,6 +79,7 @@ type Message struct { StringContent string `json:"string_content,omitempty" yaml:"string_content,omitempty"` StringImages []string `json:"string_images,omitempty" yaml:"string_images,omitempty"` StringVideos []string `json:"string_videos,omitempty" yaml:"string_videos,omitempty"` + StringAudios []string `json:"string_audios,omitempty" yaml:"string_audios,omitempty"` // A result of a function call FunctionCall 
interface{} `json:"function_call,omitempty" yaml:"function_call,omitempty"` From 5c9d26e39bdff8c3e836c686a83d1aba3c239893 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Fri, 20 Sep 2024 10:49:32 +0200 Subject: [PATCH 038/122] feat(swagger): update swagger (#3604) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- swagger/docs.go | 12 ++++++++++++ swagger/swagger.json | 12 ++++++++++++ swagger/swagger.yaml | 8 ++++++++ 3 files changed, 32 insertions(+) diff --git a/swagger/docs.go b/swagger/docs.go index 44da7cf2..ffb2ba03 100644 --- a/swagger/docs.go +++ b/swagger/docs.go @@ -1394,6 +1394,12 @@ const docTemplate = `{ "description": "The message role", "type": "string" }, + "string_audios": { + "type": "array", + "items": { + "type": "string" + } + }, "string_content": { "type": "string" }, @@ -1403,6 +1409,12 @@ const docTemplate = `{ "type": "string" } }, + "string_videos": { + "type": "array", + "items": { + "type": "string" + } + }, "tool_calls": { "type": "array", "items": { diff --git a/swagger/swagger.json b/swagger/swagger.json index eaddf451..e3aebe43 100644 --- a/swagger/swagger.json +++ b/swagger/swagger.json @@ -1387,6 +1387,12 @@ "description": "The message role", "type": "string" }, + "string_audios": { + "type": "array", + "items": { + "type": "string" + } + }, "string_content": { "type": "string" }, @@ -1396,6 +1402,12 @@ "type": "string" } }, + "string_videos": { + "type": "array", + "items": { + "type": "string" + } + }, "tool_calls": { "type": "array", "items": { diff --git a/swagger/swagger.yaml b/swagger/swagger.yaml index c98e0ef4..649b86e4 100644 --- a/swagger/swagger.yaml +++ b/swagger/swagger.yaml @@ -453,12 +453,20 @@ definitions: role: description: The message role type: string + string_audios: + items: + type: string + type: array string_content: type: string string_images: items: type: string type: array + string_videos: + items: + type: string + type: array tool_calls: items: $ref: '#/definitions/schema.ToolCall' From 2fcea486eb72d0a0bd77513244d66c74a3ec8a47 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Fri, 20 Sep 2024 10:50:14 +0200 Subject: [PATCH 039/122] chore: :arrow_up: Update ggerganov/llama.cpp to `6026da52d6942b253df835070619775d849d0258` (#3605) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 286f4b5a..53def128 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=64c6af3195c3cd4aa3328a1282d29cd2635c34c9 +CPPLLAMA_VERSION?=6026da52d6942b253df835070619775d849d0258 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From a2a63460e92b042f274d0a4e126ef927ef78e25a Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 10:59:29 +0200 Subject: [PATCH 040/122] models(gallery): add qwen2.5-14b-instruct (#3607) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 
229697bb..4fe495fc 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1,4 +1,27 @@ --- +## Qwen2.5 +- &qwen25 + name: "qwen2.5-14b-instruct" + url: "github:mudler/LocalAI/gallery/chatml.yaml@master" + license: apache-2.0 + description: | + Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. + tags: + - llm + - gguf + - gpu + - qwen + - cpu + urls: + - https://huggingface.co/bartowski/Qwen2.5-14B-Instruct-GGUF + - https://huggingface.co/Qwen/Qwen2.5-7B-Instruct + overrides: + parameters: + model: Qwen2.5-14B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-14B-Instruct-Q4_K_M.gguf + sha256: e47ad95dad6ff848b431053b375adb5d39321290ea2c638682577dafca87c008 + uri: huggingface://bartowski/Qwen2.5-14B-Instruct-GGUF/Qwen2.5-14B-Instruct-Q4_K_M.gguf ## SmolLM - &smollm url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From c15f506fd511dc3208846753e1fded4d0a4191f0 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 11:18:49 +0200 Subject: [PATCH 041/122] models(gallery): add qwen2.5-math-7b-instruct (#3609) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 4fe495fc..8dc742ca 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -22,6 +22,24 @@ - filename: Qwen2.5-14B-Instruct-Q4_K_M.gguf sha256: e47ad95dad6ff848b431053b375adb5d39321290ea2c638682577dafca87c008 uri: huggingface://bartowski/Qwen2.5-14B-Instruct-GGUF/Qwen2.5-14B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-math-7b-instruct" + urls: + - https://huggingface.co/bartowski/Qwen2.5-Math-7B-Instruct-GGUF + - https://huggingface.co/Qwen/Qwen2.5-Math-7B-Instruct + description: | + In August 2024, we released the first series of mathematical LLMs - Qwen2-Math - of our Qwen family. A month later, we have upgraded it and open-sourced Qwen2.5-Math series, including base models Qwen2.5-Math-1.5B/7B/72B, instruction-tuned models Qwen2.5-Math-1.5B/7B/72B-Instruct, and mathematical reward model Qwen2.5-Math-RM-72B. + + Unlike Qwen2-Math series which only supports using Chain-of-Thught (CoT) to solve English math problems, Qwen2.5-Math series is expanded to support using both CoT and Tool-integrated Reasoning (TIR) to solve math problems in both Chinese and English. The Qwen2.5-Math series models have achieved significant performance improvements compared to the Qwen2-Math series models on the Chinese and English mathematics benchmarks with CoT. + + The base models of Qwen2-Math are initialized with Qwen2-1.5B/7B/72B, and then pretrained on a meticulously designed Mathematics-specific Corpus. This corpus contains large-scale high-quality mathematical web texts, books, codes, exam questions, and mathematical pre-training data synthesized by Qwen2. 
+ overrides: + parameters: + model: Qwen2.5-Math-7B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-Math-7B-Instruct-Q4_K_M.gguf + sha256: 7e03cee8c65b9ebf9ca14ddb010aca27b6b18e6c70f2779e94e7451d9529c091 + uri: huggingface://bartowski/Qwen2.5-Math-7B-Instruct-GGUF/Qwen2.5-Math-7B-Instruct-Q4_K_M.gguf ## SmolLM - &smollm url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From a5b08f43ff5a3f485264dd0b8bd6335b0bf4ce24 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 11:22:53 +0200 Subject: [PATCH 042/122] models(gallery): add qwen2.5-14b_uncencored (#3610) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 8dc742ca..77c5c107 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -11,6 +11,7 @@ - gguf - gpu - qwen + - qwen2.5 - cpu urls: - https://huggingface.co/bartowski/Qwen2.5-14B-Instruct-GGUF @@ -40,6 +41,31 @@ - filename: Qwen2.5-Math-7B-Instruct-Q4_K_M.gguf sha256: 7e03cee8c65b9ebf9ca14ddb010aca27b6b18e6c70f2779e94e7451d9529c091 uri: huggingface://bartowski/Qwen2.5-Math-7B-Instruct-GGUF/Qwen2.5-Math-7B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-14b_uncencored" + icon: https://huggingface.co/SicariusSicariiStuff/Phi-3.5-mini-instruct_Uncensored/resolve/main/Misc/Uncensored.png + urls: + - https://huggingface.co/SicariusSicariiStuff/Qwen2.5-14B_Uncencored + - https://huggingface.co/bartowski/Qwen2.5-14B_Uncencored-GGUF + description: | + Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. + + Uncensored qwen2.5 + tags: + - llm + - gguf + - gpu + - qwen + - qwen2.5 + - cpu + - uncensored + overrides: + parameters: + model: Qwen2.5-14B_Uncencored-Q4_K_M.gguf + files: + - filename: Qwen2.5-14B_Uncencored-Q4_K_M.gguf + sha256: 066b9341b67e0fd0956de3576a3b7988574a5b9a0028aef2b9c8edeadd6dbbd1 + uri: huggingface://bartowski/Qwen2.5-14B_Uncencored-GGUF/Qwen2.5-14B_Uncencored-Q4_K_M.gguf ## SmolLM - &smollm url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From b6af4f4467724bd9d59e6f7f573f513f927fc8e2 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 15:08:57 +0200 Subject: [PATCH 043/122] models(gallery): add qwen2.5-coder-7b-instruct (#3611) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 77c5c107..1f52fec8 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -66,6 +66,24 @@ - filename: Qwen2.5-14B_Uncencored-Q4_K_M.gguf sha256: 066b9341b67e0fd0956de3576a3b7988574a5b9a0028aef2b9c8edeadd6dbbd1 uri: huggingface://bartowski/Qwen2.5-14B_Uncencored-GGUF/Qwen2.5-14B_Uncencored-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-coder-7b-instruct" + urls: + - https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct + - https://huggingface.co/bartowski/Qwen2.5-Coder-7B-Instruct-GGUF + description: | + Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). For Qwen2.5-Coder, we release three base language models and instruction-tuned language models, 1.5, 7 and 32 (coming soon) billion parameters. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5: + + Significantly improvements in code generation, code reasoning and code fixing. 
Base on the strong Qwen2.5, we scale up the training tokens into 5.5 trillion including source code, text-code grounding, Synthetic data, etc. + A more comprehensive foundation for real-world applications such as Code Agents. Not only enhancing coding capabilities but also maintaining its strengths in mathematics and general competencies. + Long-context Support up to 128K tokens. + overrides: + parameters: + model: Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf + sha256: 1664fccab734674a50763490a8c6931b70e3f2f8ec10031b54806d30e5f956b6 + uri: huggingface://bartowski/Qwen2.5-Coder-7B-Instruct-GGUF/Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf ## SmolLM - &smollm url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From 56d8f5163c427eb0e0d3b9483aa4e585f571a0bf Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 15:12:35 +0200 Subject: [PATCH 044/122] models(gallery): add qwen2.5-math-72b-instruct (#3612) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 1f52fec8..945c45b9 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -84,6 +84,23 @@ - filename: Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf sha256: 1664fccab734674a50763490a8c6931b70e3f2f8ec10031b54806d30e5f956b6 uri: huggingface://bartowski/Qwen2.5-Coder-7B-Instruct-GGUF/Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-math-72b-instruct" + icon: http://qianwen-res.oss-accelerate-overseas.aliyuncs.com/Qwen2.5/qwen2.5-math-pipeline.jpeg + urls: + - https://huggingface.co/Qwen/Qwen2.5-Math-72B-Instruct + - https://huggingface.co/bartowski/Qwen2.5-Math-72B-Instruct-GGUF + description: | + In August 2024, we released the first series of mathematical LLMs - Qwen2-Math - of our Qwen family. A month later, we have upgraded it and open-sourced Qwen2.5-Math series, including base models Qwen2.5-Math-1.5B/7B/72B, instruction-tuned models Qwen2.5-Math-1.5B/7B/72B-Instruct, and mathematical reward model Qwen2.5-Math-RM-72B. + + Unlike Qwen2-Math series which only supports using Chain-of-Thught (CoT) to solve English math problems, Qwen2.5-Math series is expanded to support using both CoT and Tool-integrated Reasoning (TIR) to solve math problems in both Chinese and English. 
The Qwen2.5-Math series models have achieved significant performance improvements compared to the Qwen2-Math series models on the Chinese and English mathematics benchmarks with CoT + overrides: + parameters: + model: Qwen2.5-Math-72B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-Math-72B-Instruct-Q4_K_M.gguf + sha256: 5dee8a6e21d555577712b4f65565a3c3737a0d5d92f5a82970728c6d8e237f17 + uri: huggingface://bartowski/Qwen2.5-Math-72B-Instruct-GGUF/Qwen2.5-Math-72B-Instruct-Q4_K_M.gguf ## SmolLM - &smollm url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From 052af98dcd3a5d50cd1c7f2f0920b77e508ada5e Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 15:45:23 +0200 Subject: [PATCH 045/122] models(gallery): add qwen2.5-0.5b-instruct, qwen2.5-1.5b-instruct (#3613) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 945c45b9..adac3e51 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -101,6 +101,30 @@ - filename: Qwen2.5-Math-72B-Instruct-Q4_K_M.gguf sha256: 5dee8a6e21d555577712b4f65565a3c3737a0d5d92f5a82970728c6d8e237f17 uri: huggingface://bartowski/Qwen2.5-Math-72B-Instruct-GGUF/Qwen2.5-Math-72B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-0.5b-instruct" + urls: + - https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct + - https://huggingface.co/bartowski/Qwen2.5-0.5B-Instruct-GGUF + overrides: + parameters: + model: Qwen2.5-0.5B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-0.5B-Instruct-Q4_K_M.gguf + sha256: 6eb923e7d26e9cea28811e1a8e852009b21242fb157b26149d3b188f3a8c8653 + uri: huggingface://bartowski/Qwen2.5-0.5B-Instruct-GGUF/Qwen2.5-0.5B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-1.5b-instruct" + urls: + - https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct + - https://huggingface.co/bartowski/Qwen2.5-1.5B-Instruct-GGUF + overrides: + parameters: + model: Qwen2.5-1.5B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-1.5B-Instruct-Q4_K_M.gguf + sha256: 1adf0b11065d8ad2e8123ea110d1ec956dab4ab038eab665614adba04b6c3370 + uri: huggingface://bartowski/Qwen2.5-1.5B-Instruct-GGUF/Qwen2.5-1.5B-Instruct-Q4_K_M.gguf ## SmolLM - &smollm url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From 38cad0b8dc32e3ce8d8650718c16df6725cb63dc Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 17:10:43 +0200 Subject: [PATCH 046/122] models(gallery): add qwen2.5 32B, 72B, 32B Instruct (#3614) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index adac3e51..5304f9d2 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -125,6 +125,42 @@ - filename: Qwen2.5-1.5B-Instruct-Q4_K_M.gguf sha256: 1adf0b11065d8ad2e8123ea110d1ec956dab4ab038eab665614adba04b6c3370 uri: huggingface://bartowski/Qwen2.5-1.5B-Instruct-GGUF/Qwen2.5-1.5B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-32b" + urls: + - https://huggingface.co/Qwen/Qwen2.5-32B + - https://huggingface.co/mradermacher/Qwen2.5-32B-GGUF + overrides: + parameters: + model: Qwen2.5-32B.Q4_K_M.gguf + files: + - filename: Qwen2.5-32B.Q4_K_M.gguf + sha256: 02703e27c8b964db445444581a6937ad7538f0c32a100b26b49fa0e8ff527155 + uri: huggingface://mradermacher/Qwen2.5-32B-GGUF/Qwen2.5-32B.Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-32b-instruct" + urls: + - 
https://huggingface.co/Qwen/Qwen2.5-32B-Instruct + - https://huggingface.co/bartowski/Qwen2.5-32B-Instruct-GGUF + overrides: + parameters: + model: Qwen2.5-32B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-32B-Instruct-Q4_K_M.gguf + sha256: 2e5f6daea180dbc59f65a40641e94d3973b5dbaa32b3c0acf54647fa874e519e + uri: huggingface://bartowski/Qwen2.5-32B-Instruct-GGUF/Qwen2.5-32B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-72b-instruct" + urls: + - https://huggingface.co/Qwen/Qwen2.5-72B-Instruct + - https://huggingface.co/bartowski/Qwen2.5-72B-Instruct-GGUF + overrides: + parameters: + model: Qwen2.5-72B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-72B-Instruct-Q4_K_M.gguf + sha256: e4c8fad16946be8cf0bbf67eb8f4e18fc7415a5a6d2854b4cda453edb4082545 + uri: huggingface://bartowski/Qwen2.5-72B-Instruct-GGUF/Qwen2.5-72B-Instruct-Q4_K_M.gguf ## SmolLM - &smollm url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From c4cecba07fda9c9db738aaaaa40756fbee3e879b Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 17:19:53 +0200 Subject: [PATCH 047/122] models(gallery): add llama-3.1-supernova-lite-reflection-v1.0-i1 (#3615) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 5304f9d2..60eed4ce 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -601,6 +601,22 @@ - filename: Reflection-Llama-3.1-70B-q4_k_m.gguf sha256: 16064e07037883a750cfeae9a7be41143aa857dbac81c2e93c68e2f941dee7b2 uri: huggingface://senseable/Reflection-Llama-3.1-70B-gguf/Reflection-Llama-3.1-70B-q4_k_m.gguf +- !!merge <<: *llama31 + name: "llama-3.1-supernova-lite-reflection-v1.0-i1" + url: "github:mudler/LocalAI/gallery/llama3.1-reflective.yaml@master" + icon: https://i.ibb.co/r072p7j/eopi-ZVu-SQ0-G-Cav78-Byq-Tg.png + urls: + - https://huggingface.co/SE6446/Llama-3.1-SuperNova-Lite-Reflection-V1.0 + - https://huggingface.co/mradermacher/Llama-3.1-SuperNova-Lite-Reflection-V1.0-i1-GGUF + description: | + This model is a LoRA adaptation of arcee-ai/Llama-3.1-SuperNova-Lite on thesven/Reflective-MAGLLAMA-v0.1.1. This has been a simple experiment into reflection and the model appears to perform adequately, though I am unsure if it is a large improvement. 
+ overrides: + parameters: + model: Llama-3.1-SuperNova-Lite-Reflection-V1.0.i1-Q4_K_M.gguf + files: + - filename: Llama-3.1-SuperNova-Lite-Reflection-V1.0.i1-Q4_K_M.gguf + sha256: 0c4531fe553d00142808e1bc7348ae92d400794c5b64d2db1a974718324dfe9a + uri: huggingface://mradermacher/Llama-3.1-SuperNova-Lite-Reflection-V1.0-i1-GGUF/Llama-3.1-SuperNova-Lite-Reflection-V1.0.i1-Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "humanish-roleplay-llama-3.1-8b-i1" From e24654ada064f0b7f6a2eb2be29b8136e52ccc0b Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 17:23:30 +0200 Subject: [PATCH 048/122] models(gallery): add llama-3.1-supernova-lite (#3616) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 60eed4ce..c05593b1 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -617,6 +617,25 @@ - filename: Llama-3.1-SuperNova-Lite-Reflection-V1.0.i1-Q4_K_M.gguf sha256: 0c4531fe553d00142808e1bc7348ae92d400794c5b64d2db1a974718324dfe9a uri: huggingface://mradermacher/Llama-3.1-SuperNova-Lite-Reflection-V1.0-i1-GGUF/Llama-3.1-SuperNova-Lite-Reflection-V1.0.i1-Q4_K_M.gguf +- !!merge <<: *llama31 + name: "llama-3.1-supernova-lite" + icon: https://i.ibb.co/r072p7j/eopi-ZVu-SQ0-G-Cav78-Byq-Tg.png + urls: + - https://huggingface.co/arcee-ai/Llama-3.1-SuperNova-Lite + - https://huggingface.co/arcee-ai/Llama-3.1-SuperNova-Lite-GGUF + description: | + Llama-3.1-SuperNova-Lite is an 8B parameter model developed by Arcee.ai, based on the Llama-3.1-8B-Instruct architecture. It is a distilled version of the larger Llama-3.1-405B-Instruct model, leveraging offline logits extracted from the 405B parameter variant. This 8B variation of Llama-3.1-SuperNova maintains high performance while offering exceptional instruction-following capabilities and domain-specific adaptability. + + The model was trained using a state-of-the-art distillation pipeline and an instruction dataset generated with EvolKit, ensuring accuracy and efficiency across a wide range of tasks. For more information on its training, visit blog.arcee.ai. + + Llama-3.1-SuperNova-Lite excels in both benchmark performance and real-world applications, providing the power of large-scale models in a more compact, efficient form ideal for organizations seeking high performance with reduced resource requirements. 
+ overrides: + parameters: + model: supernova-lite-v1.Q4_K_M.gguf + files: + - filename: supernova-lite-v1.Q4_K_M.gguf + sha256: 237b7b0b704d294f92f36c576cc8fdc10592f95168a5ad0f075a2d8edf20da4d + uri: huggingface://arcee-ai/Llama-3.1-SuperNova-Lite-GGUF/supernova-lite-v1.Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "humanish-roleplay-llama-3.1-8b-i1" From f55053bfbaa9a71ea72b9efb0aa4f5347dc34574 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 17:26:59 +0200 Subject: [PATCH 049/122] models(gallery): add llama3.1-8b-shiningvaliant2 (#3617) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index c05593b1..3c3b1a23 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -636,6 +636,24 @@ - filename: supernova-lite-v1.Q4_K_M.gguf sha256: 237b7b0b704d294f92f36c576cc8fdc10592f95168a5ad0f075a2d8edf20da4d uri: huggingface://arcee-ai/Llama-3.1-SuperNova-Lite-GGUF/supernova-lite-v1.Q4_K_M.gguf +- !!merge <<: *llama31 + name: "llama3.1-8b-shiningvaliant2" + icon: https://cdn-uploads.huggingface.co/production/uploads/63444f2687964b331809eb55/EXX7TKbB-R6arxww2mk0R.jpeg + urls: + - https://huggingface.co/ValiantLabs/Llama3.1-8B-ShiningValiant2 + - https://huggingface.co/bartowski/Llama3.1-8B-ShiningValiant2-GGUF + description: | + Shining Valiant 2 is a chat model built on Llama 3.1 8b, finetuned on our data for friendship, insight, knowledge and enthusiasm. + + Finetuned on meta-llama/Meta-Llama-3.1-8B-Instruct for best available general performance + Trained on a variety of high quality data; focused on science, engineering, technical knowledge, and structured reasoning + overrides: + parameters: + model: Llama3.1-8B-ShiningValiant2-Q4_K_M.gguf + files: + - filename: Llama3.1-8B-ShiningValiant2-Q4_K_M.gguf + sha256: 9369eb97922a9f01e4eae610e3d7aaeca30762d78d9239884179451d60bdbdd2 + uri: huggingface://bartowski/Llama3.1-8B-ShiningValiant2-GGUF/Llama3.1-8B-ShiningValiant2-Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "humanish-roleplay-llama-3.1-8b-i1" From 415cf31aa3e51aa44f1097d0459f8d410e3adb27 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 17:33:29 +0200 Subject: [PATCH 050/122] models(gallery): add buddy2 (#3618) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 3c3b1a23..b46967ad 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -2065,6 +2065,20 @@ - filename: datagemma-rig-27b-it-Q4_K_M.gguf sha256: a6738ffbb49b6c46d220e2793df85c0538e9ac72398e32a0914ee5e55c3096ad uri: huggingface://bartowski/datagemma-rig-27b-it-GGUF/datagemma-rig-27b-it-Q4_K_M.gguf +- !!merge <<: *gemma + name: "buddy-2b-v1" + urls: + - https://huggingface.co/TheDrummer/Buddy-2B-v1 + - https://huggingface.co/bartowski/Buddy-2B-v1-GGUF + description: | + Buddy is designed as an empathetic language model, aimed at fostering introspection, self-reflection, and personal growth through thoughtful conversation. Buddy won't judge and it won't dismiss your concerns. Get some self-care with Buddy. 
+ overrides: + parameters: + model: Buddy-2B-v1-Q4_K_M.gguf + files: + - filename: Buddy-2B-v1-Q4_K_M.gguf + sha256: 9bd25ed907d1a3c2e07fe09399a9b3aec107d368c29896e2c46facede5b7e3d5 + uri: huggingface://bartowski/Buddy-2B-v1-GGUF/Buddy-2B-v1-Q4_K_M.gguf - &llama3 url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master" icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png From 00d6c2a96683ffc6d169ecaeeaa9d5c5bb8384f1 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 17:35:06 +0200 Subject: [PATCH 051/122] models(gallery): add llama3.1-reflective config Signed-off-by: Ettore Di Giacinto --- gallery/llama3.1-reflective.yaml | 65 ++++++++++++++++++++++++++++++++ 1 file changed, 65 insertions(+) create mode 100644 gallery/llama3.1-reflective.yaml diff --git a/gallery/llama3.1-reflective.yaml b/gallery/llama3.1-reflective.yaml new file mode 100644 index 00000000..86a91d8b --- /dev/null +++ b/gallery/llama3.1-reflective.yaml @@ -0,0 +1,65 @@ +--- +name: "llama3-instruct" + +config_file: | + mmap: true + cutstrings: + - (.*?) + function: + disable_no_action: true + grammar: + disable: true + response_regex: + - \w+)>(?P.*) + template: + chat_message: | + <|start_header_id|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}<|end_header_id|> + + {{ if .FunctionCall -}} + Function call: + {{ else if eq .RoleName "tool" -}} + Function response: + {{ end -}} + {{ if .Content -}} + {{.Content -}} + {{ else if .FunctionCall -}} + {{ toJson .FunctionCall -}} + {{ end -}} + <|eot_id|> + function: | + <|start_header_id|>system<|end_header_id|> + + You have access to the following functions: + + {{range .Functions}} + Use the function '{{.Name}}' to '{{.Description}}' + {{toJson .Parameters}} + {{end}} + + Think very carefully before calling functions. 
+ If a you choose to call a function ONLY reply in the following format with no prefix or suffix: + + {{`{{"example_name": "example_value"}}`}} + + Reminder: + - If looking for real time information use relevant functions before falling back to searching on internet + - Function calls MUST follow the specified format, start with + - Required parameters MUST be specified + - Only call one function at a time + - Put the entire function call reply on one line + <|eot_id|> + {{.Input }} + <|start_header_id|>assistant<|end_header_id|> + chat: | + {{.Input }} + <|start_header_id|>assistant<|end_header_id|> + + completion: | + {{.Input}} + context_size: 8192 + f16: true + stopwords: + - <|im_end|> + - + - "<|eot_id|>" + - <|end_of_text|> From 6c6cd8bbe0af9c93560b5eb20b8153d53625ac63 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 18:15:51 +0200 Subject: [PATCH 052/122] models(gallery): add llama-3.1-8b-arliai-rpmax-v1.1 (#3619) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index b46967ad..59cab687 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -956,6 +956,20 @@ - filename: Llama-3.1-8B-Stheno-v3.4-Q4_K_M-imat.gguf sha256: 830d4858aa11a654f82f69fa40dee819edf9ecf54213057648304eb84b8dd5eb uri: huggingface://Lewdiculous/Llama-3.1-8B-Stheno-v3.4-GGUF-IQ-Imatrix/Llama-3.1-8B-Stheno-v3.4-Q4_K_M-imat.gguf +- !!merge <<: *llama31 + name: "llama-3.1-8b-arliai-rpmax-v1.1" + urls: + - https://huggingface.co/ArliAI/Llama-3.1-8B-ArliAI-RPMax-v1.1 + - https://huggingface.co/bartowski/Llama-3.1-8B-ArliAI-RPMax-v1.1-GGUF + description: | + RPMax is a series of models that are trained on a diverse set of curated creative writing and RP datasets with a focus on variety and deduplication. This model is designed to be highly creative and non-repetitive by making sure no two entries in the dataset have repeated characters or situations, which makes sure the model does not latch on to a certain personality and be capable of understanding and acting appropriately to any characters or situations. + overrides: + parameters: + model: Llama-3.1-8B-ArliAI-RPMax-v1.1-Q4_K_M.gguf + files: + - filename: Llama-3.1-8B-ArliAI-RPMax-v1.1-Q4_K_M.gguf + sha256: 0a601c7341228d9160332965298d799369a1dc2b7080771fb8051bdeb556b30c + uri: huggingface://bartowski/Llama-3.1-8B-ArliAI-RPMax-v1.1-GGUF/Llama-3.1-8B-ArliAI-RPMax-v1.1-Q4_K_M.gguf - &deepseek ## Deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" From bf8e50a11d2aa2ae3e27c770812a402c5c8cc6eb Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 18:16:01 +0200 Subject: [PATCH 053/122] chore(docs): add Vulkan images links (#3620) Signed-off-by: Ettore Di Giacinto --- docs/content/docs/getting-started/container-images.md | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/docs/content/docs/getting-started/container-images.md b/docs/content/docs/getting-started/container-images.md index 86fe31d1..25385f23 100644 --- a/docs/content/docs/getting-started/container-images.md +++ b/docs/content/docs/getting-started/container-images.md @@ -154,7 +154,7 @@ Images are available with and without python dependencies. Note that images with Images with `core` in the tag are smaller and do not contain any python dependencies. 
-{{< tabs tabTotal="6" >}} +{{< tabs tabTotal="7" >}} {{% tab tabName="Vanilla / CPU Images" %}} | Description | Quay | Docker Hub | @@ -227,6 +227,15 @@ Images with `core` in the tag are smaller and do not contain any python dependen {{% /tab %}} + +{{% tab tabName="Vulkan Images" %}} +| Description | Quay | Docker Hub | +| --- | --- |-------------------------------------------------------------| +| Latest images from the branch (development) | `quay.io/go-skynet/local-ai: master-vulkan-ffmpeg-core ` | `localai/localai: master-vulkan-ffmpeg-core ` | +| Latest tag | `quay.io/go-skynet/local-ai: latest-vulkan-ffmpeg-core ` | `localai/localai: latest-vulkan-ffmpeg-core` | +| Versioned image including FFMpeg, no python | `quay.io/go-skynet/local-ai:{{< version >}}-vulkan-fmpeg-core` | `localai/localai:{{< version >}}-vulkan-fmpeg-core` | +{{% /tab %}} + {{< /tabs >}} ## See Also From cef7f8a0146474e1e30676ea820f8b5047bc73b2 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Fri, 20 Sep 2024 23:41:13 +0200 Subject: [PATCH 054/122] chore: :arrow_up: Update ggerganov/whisper.cpp to `34972dbe221709323714fc8402f2e24041d48213` (#3623) :arrow_up: Update ggerganov/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 53def128..89b0d4aa 100644 --- a/Makefile +++ b/Makefile @@ -16,7 +16,7 @@ RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 # whisper.cpp version WHISPER_REPO?=https://github.com/ggerganov/whisper.cpp -WHISPER_CPP_VERSION?=5b1ce40fa882e9cb8630b48032067a1ed2f1534f +WHISPER_CPP_VERSION?=34972dbe221709323714fc8402f2e24041d48213 # bert.cpp version BERT_REPO?=https://github.com/go-skynet/go-bert.cpp From 54f2657870c73a100c69ad55c862cfc41f9da028 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sat, 21 Sep 2024 10:09:41 +0200 Subject: [PATCH 055/122] chore: :arrow_up: Update ggerganov/llama.cpp to `63351143b2ea5efe9f8b9c61f553af8a51f1deff` (#3622) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 89b0d4aa..83fb1215 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=6026da52d6942b253df835070619775d849d0258 +CPPLLAMA_VERSION?=63351143b2ea5efe9f8b9c61f553af8a51f1deff # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From c22b3187a7179f1dc721d71c4e18742e173275aa Mon Sep 17 00:00:00 2001 From: lnyxaris Date: Sat, 21 Sep 2024 10:10:27 +0200 Subject: [PATCH 056/122] Fix NeuralDaredevil URL (#3621) Signed-off-by: lnyxaris --- gallery/index.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gallery/index.yaml b/gallery/index.yaml index 59cab687..7dab9eb7 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -3972,7 +3972,7 @@ files: - filename: NeuralDaredevil-8B-abliterated.Q4_K_M.gguf sha256: 12f4af9d66817d7d300bd9a181e4fe66f7ecf7ea972049f2cbd0554cdc3ecf05 - uri: 
huggingface://QuantFactory/NeuralDaredevil-8B-abliterated-GGUF/Poppy_Porpoise-0.85-L3-8B-Q4_K_M-imat.gguf + uri: huggingface://QuantFactory/NeuralDaredevil-8B-abliterated-GGUF/NeuralDaredevil-8B-abliterated.Q4_K_M.gguf - !!merge <<: *llama3 name: "llama-3-8b-instruct-mopeymule" urls: From 5c3d1d81e63e823278c8630b4a2a3a93ddf6af0c Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 21 Sep 2024 16:04:04 +0200 Subject: [PATCH 057/122] fix(parler-tts): fix install with sycl (#3624) Signed-off-by: Ettore Di Giacinto --- backend/python/parler-tts/install.sh | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/backend/python/parler-tts/install.sh b/backend/python/parler-tts/install.sh index 002472a2..aae690c4 100755 --- a/backend/python/parler-tts/install.sh +++ b/backend/python/parler-tts/install.sh @@ -15,5 +15,12 @@ installRequirements # https://github.com/descriptinc/audiotools/issues/101 # incompatible protobuf versions. -PYDIR=$(ls ${MY_DIR}/venv/lib) -curl -L https://raw.githubusercontent.com/protocolbuffers/protobuf/main/python/google/protobuf/internal/builder.py -o ${MY_DIR}/venv/lib/${PYDIR}/site-packages/google/protobuf/internal/builder.py +PYDIR=python3.10 +pyenv="${MY_DIR}/venv/lib/${PYDIR}/site-packages/google/protobuf/internal/" + +if [ ! -d ${pyenv} ]; then + echo "(parler-tts/install.sh): Error: ${pyenv} does not exist" + exit 1 +fi + +curl -L https://raw.githubusercontent.com/protocolbuffers/protobuf/main/python/google/protobuf/internal/builder.py -o ${pyenv}/builder.py From 20c0e128c00601edb7e46089c1e32672f353c52e Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 21 Sep 2024 21:52:12 +0200 Subject: [PATCH 058/122] fix(sycl): downgrade pypinyin melotts requires pypinyin 0.50 Signed-off-by: Ettore Di Giacinto --- backend/python/openvoice/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/openvoice/requirements-intel.txt b/backend/python/openvoice/requirements-intel.txt index cea7de0b..a9a4cc20 100644 --- a/backend/python/openvoice/requirements-intel.txt +++ b/backend/python/openvoice/requirements-intel.txt @@ -15,7 +15,7 @@ unidecode==1.3.7 whisper-timestamped==1.15.4 openai python-dotenv -pypinyin==0.53.0 +pypinyin==0.50.0 cn2an==0.5.22 jieba==0.42.1 gradio==4.38.1 From 1f43678d5311e7bdc434768ea74c97e49a6ebc7e Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sun, 22 Sep 2024 00:03:23 +0200 Subject: [PATCH 059/122] chore: :arrow_up: Update ggerganov/llama.cpp to `d09770cae71b416c032ec143dda530f7413c4038` (#3626) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 83fb1215..51755e71 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=63351143b2ea5efe9f8b9c61f553af8a51f1deff +CPPLLAMA_VERSION?=d09770cae71b416c032ec143dda530f7413c4038 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From ee21b00a8d6b652b61d075e3bba1b88c8d52488c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Serta=C3=A7=20=C3=96zercan?= <852750+sozercan@users.noreply.github.com> Date: Sun, 22 Sep 2024 01:03:30 -0700 Subject: [PATCH 060/122] 
feat: auto load into memory on startup (#3627) Signed-off-by: Sertac Ozercan --- core/backend/embeddings.go | 2 +- core/backend/image.go | 2 +- core/backend/llm.go | 2 +- core/backend/options.go | 2 +- core/backend/rerank.go | 2 +- core/backend/soundgeneration.go | 2 +- core/backend/tts.go | 2 +- core/cli/run.go | 2 + core/config/application_config.go | 7 + core/startup/startup.go | 449 ++++++++++++++++-------------- 10 files changed, 259 insertions(+), 213 deletions(-) diff --git a/core/backend/embeddings.go b/core/backend/embeddings.go index 31b10a19..9f0f8be9 100644 --- a/core/backend/embeddings.go +++ b/core/backend/embeddings.go @@ -12,7 +12,7 @@ import ( func ModelEmbedding(s string, tokens []int, loader *model.ModelLoader, backendConfig config.BackendConfig, appConfig *config.ApplicationConfig) (func() ([]float32, error), error) { modelFile := backendConfig.Model - grpcOpts := gRPCModelOpts(backendConfig) + grpcOpts := GRPCModelOpts(backendConfig) var inferenceModel interface{} var err error diff --git a/core/backend/image.go b/core/backend/image.go index 8c3f56b3..5c2a950c 100644 --- a/core/backend/image.go +++ b/core/backend/image.go @@ -12,7 +12,7 @@ func ImageGeneration(height, width, mode, step, seed int, positive_prompt, negat if *threads == 0 && appConfig.Threads != 0 { threads = &appConfig.Threads } - gRPCOpts := gRPCModelOpts(backendConfig) + gRPCOpts := GRPCModelOpts(backendConfig) opts := modelOpts(backendConfig, appConfig, []model.Option{ model.WithBackendString(backendConfig.Backend), model.WithAssetDir(appConfig.AssetsDestination), diff --git a/core/backend/llm.go b/core/backend/llm.go index f74071ba..cac9beba 100644 --- a/core/backend/llm.go +++ b/core/backend/llm.go @@ -37,7 +37,7 @@ func ModelInference(ctx context.Context, s string, messages []schema.Message, im if *threads == 0 && o.Threads != 0 { threads = &o.Threads } - grpcOpts := gRPCModelOpts(c) + grpcOpts := GRPCModelOpts(c) var inferenceModel grpc.Backend var err error diff --git a/core/backend/options.go b/core/backend/options.go index d986b8e6..d431aab6 100644 --- a/core/backend/options.go +++ b/core/backend/options.go @@ -44,7 +44,7 @@ func getSeed(c config.BackendConfig) int32 { return seed } -func gRPCModelOpts(c config.BackendConfig) *pb.ModelOptions { +func GRPCModelOpts(c config.BackendConfig) *pb.ModelOptions { b := 512 if c.Batch != 0 { b = c.Batch diff --git a/core/backend/rerank.go b/core/backend/rerank.go index 1b718be2..a7573ade 100644 --- a/core/backend/rerank.go +++ b/core/backend/rerank.go @@ -15,7 +15,7 @@ func Rerank(backend, modelFile string, request *proto.RerankRequest, loader *mod return nil, fmt.Errorf("backend is required") } - grpcOpts := gRPCModelOpts(backendConfig) + grpcOpts := GRPCModelOpts(backendConfig) opts := modelOpts(config.BackendConfig{}, appConfig, []model.Option{ model.WithBackendString(bb), diff --git a/core/backend/soundgeneration.go b/core/backend/soundgeneration.go index abd5221b..b6a1c827 100644 --- a/core/backend/soundgeneration.go +++ b/core/backend/soundgeneration.go @@ -29,7 +29,7 @@ func SoundGeneration( return "", nil, fmt.Errorf("backend is a required parameter") } - grpcOpts := gRPCModelOpts(backendConfig) + grpcOpts := GRPCModelOpts(backendConfig) opts := modelOpts(config.BackendConfig{}, appConfig, []model.Option{ model.WithBackendString(backend), model.WithModel(modelFile), diff --git a/core/backend/tts.go b/core/backend/tts.go index 258882ae..2401748c 100644 --- a/core/backend/tts.go +++ b/core/backend/tts.go @@ -28,7 +28,7 @@ func ModelTTS( bb = 
model.PiperBackend } - grpcOpts := gRPCModelOpts(backendConfig) + grpcOpts := GRPCModelOpts(backendConfig) opts := modelOpts(config.BackendConfig{}, appConfig, []model.Option{ model.WithBackendString(bb), diff --git a/core/cli/run.go b/core/cli/run.go index afb7204c..a67839a0 100644 --- a/core/cli/run.go +++ b/core/cli/run.go @@ -69,6 +69,7 @@ type RunCMD struct { WatchdogBusyTimeout string `env:"LOCALAI_WATCHDOG_BUSY_TIMEOUT,WATCHDOG_BUSY_TIMEOUT" default:"5m" help:"Threshold beyond which a busy backend should be stopped" group:"backends"` Federated bool `env:"LOCALAI_FEDERATED,FEDERATED" help:"Enable federated instance" group:"federated"` DisableGalleryEndpoint bool `env:"LOCALAI_DISABLE_GALLERY_ENDPOINT,DISABLE_GALLERY_ENDPOINT" help:"Disable the gallery endpoints" group:"api"` + LoadToMemory []string `env:"LOCALAI_LOAD_TO_MEMORY,LOAD_TO_MEMORY" help:"A list of models to load into memory at startup" group:"models"` } func (r *RunCMD) Run(ctx *cliContext.Context) error { @@ -104,6 +105,7 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error { config.WithDisableApiKeyRequirementForHttpGet(r.DisableApiKeyRequirementForHttpGet), config.WithHttpGetExemptedEndpoints(r.HttpGetExemptedEndpoints), config.WithP2PNetworkID(r.Peer2PeerNetworkID), + config.WithLoadToMemory(r.LoadToMemory), } token := "" diff --git a/core/config/application_config.go b/core/config/application_config.go index afbf325f..2af0c7ae 100644 --- a/core/config/application_config.go +++ b/core/config/application_config.go @@ -41,6 +41,7 @@ type ApplicationConfig struct { DisableApiKeyRequirementForHttpGet bool HttpGetExemptedEndpoints []*regexp.Regexp DisableGalleryEndpoint bool + LoadToMemory []string ModelLibraryURL string @@ -331,6 +332,12 @@ func WithOpaqueErrors(opaque bool) AppOption { } } +func WithLoadToMemory(models []string) AppOption { + return func(o *ApplicationConfig) { + o.LoadToMemory = models + } +} + func WithSubtleKeyComparison(subtle bool) AppOption { return func(o *ApplicationConfig) { o.UseSubtleKeyComparison = subtle diff --git a/core/startup/startup.go b/core/startup/startup.go index 3565d196..b7b9ce8f 100644 --- a/core/startup/startup.go +++ b/core/startup/startup.go @@ -1,206 +1,243 @@ -package startup - -import ( - "fmt" - "os" - - "github.com/mudler/LocalAI/core" - "github.com/mudler/LocalAI/core/config" - "github.com/mudler/LocalAI/core/services" - "github.com/mudler/LocalAI/internal" - "github.com/mudler/LocalAI/pkg/assets" - "github.com/mudler/LocalAI/pkg/library" - "github.com/mudler/LocalAI/pkg/model" - pkgStartup "github.com/mudler/LocalAI/pkg/startup" - "github.com/mudler/LocalAI/pkg/xsysinfo" - "github.com/rs/zerolog/log" -) - -func Startup(opts ...config.AppOption) (*config.BackendConfigLoader, *model.ModelLoader, *config.ApplicationConfig, error) { - options := config.NewApplicationConfig(opts...) 
- - log.Info().Msgf("Starting LocalAI using %d threads, with models path: %s", options.Threads, options.ModelPath) - log.Info().Msgf("LocalAI version: %s", internal.PrintableVersion()) - caps, err := xsysinfo.CPUCapabilities() - if err == nil { - log.Debug().Msgf("CPU capabilities: %v", caps) - } - gpus, err := xsysinfo.GPUs() - if err == nil { - log.Debug().Msgf("GPU count: %d", len(gpus)) - for _, gpu := range gpus { - log.Debug().Msgf("GPU: %s", gpu.String()) - } - } - - // Make sure directories exists - if options.ModelPath == "" { - return nil, nil, nil, fmt.Errorf("options.ModelPath cannot be empty") - } - err = os.MkdirAll(options.ModelPath, 0750) - if err != nil { - return nil, nil, nil, fmt.Errorf("unable to create ModelPath: %q", err) - } - if options.ImageDir != "" { - err := os.MkdirAll(options.ImageDir, 0750) - if err != nil { - return nil, nil, nil, fmt.Errorf("unable to create ImageDir: %q", err) - } - } - if options.AudioDir != "" { - err := os.MkdirAll(options.AudioDir, 0750) - if err != nil { - return nil, nil, nil, fmt.Errorf("unable to create AudioDir: %q", err) - } - } - if options.UploadDir != "" { - err := os.MkdirAll(options.UploadDir, 0750) - if err != nil { - return nil, nil, nil, fmt.Errorf("unable to create UploadDir: %q", err) - } - } - - if err := pkgStartup.InstallModels(options.Galleries, options.ModelLibraryURL, options.ModelPath, options.EnforcePredownloadScans, nil, options.ModelsURL...); err != nil { - log.Error().Err(err).Msg("error installing models") - } - - cl := config.NewBackendConfigLoader(options.ModelPath) - ml := model.NewModelLoader(options.ModelPath) - - configLoaderOpts := options.ToConfigLoaderOptions() - - if err := cl.LoadBackendConfigsFromPath(options.ModelPath, configLoaderOpts...); err != nil { - log.Error().Err(err).Msg("error loading config files") - } - - if options.ConfigFile != "" { - if err := cl.LoadMultipleBackendConfigsSingleFile(options.ConfigFile, configLoaderOpts...); err != nil { - log.Error().Err(err).Msg("error loading config file") - } - } - - if err := cl.Preload(options.ModelPath); err != nil { - log.Error().Err(err).Msg("error downloading models") - } - - if options.PreloadJSONModels != "" { - if err := services.ApplyGalleryFromString(options.ModelPath, options.PreloadJSONModels, options.EnforcePredownloadScans, options.Galleries); err != nil { - return nil, nil, nil, err - } - } - - if options.PreloadModelsFromPath != "" { - if err := services.ApplyGalleryFromFile(options.ModelPath, options.PreloadModelsFromPath, options.EnforcePredownloadScans, options.Galleries); err != nil { - return nil, nil, nil, err - } - } - - if options.Debug { - for _, v := range cl.GetAllBackendConfigs() { - log.Debug().Msgf("Model: %s (config: %+v)", v.Name, v) - } - } - - if options.AssetsDestination != "" { - // Extract files from the embedded FS - err := assets.ExtractFiles(options.BackendAssets, options.AssetsDestination) - log.Debug().Msgf("Extracting backend assets files to %s", options.AssetsDestination) - if err != nil { - log.Warn().Msgf("Failed extracting backend assets files: %s (might be required for some backends to work properly)", err) - } - } - - if options.LibPath != "" { - // If there is a lib directory, set LD_LIBRARY_PATH to include it - err := library.LoadExternal(options.LibPath) - if err != nil { - log.Error().Err(err).Str("LibPath", options.LibPath).Msg("Error while loading external libraries") - } - } - - // turn off any process that was started by GRPC if the context is canceled - go func() { - 
<-options.Context.Done() - log.Debug().Msgf("Context canceled, shutting down") - err := ml.StopAllGRPC() - if err != nil { - log.Error().Err(err).Msg("error while stopping all grpc backends") - } - }() - - if options.WatchDog { - wd := model.NewWatchDog( - ml, - options.WatchDogBusyTimeout, - options.WatchDogIdleTimeout, - options.WatchDogBusy, - options.WatchDogIdle) - ml.SetWatchDog(wd) - go wd.Run() - go func() { - <-options.Context.Done() - log.Debug().Msgf("Context canceled, shutting down") - wd.Shutdown() - }() - } - - // Watch the configuration directory - startWatcher(options) - - log.Info().Msg("core/startup process completed!") - return cl, ml, options, nil -} - -func startWatcher(options *config.ApplicationConfig) { - if options.DynamicConfigsDir == "" { - // No need to start the watcher if the directory is not set - return - } - - if _, err := os.Stat(options.DynamicConfigsDir); err != nil { - if os.IsNotExist(err) { - // We try to create the directory if it does not exist and was specified - if err := os.MkdirAll(options.DynamicConfigsDir, 0700); err != nil { - log.Error().Err(err).Msg("failed creating DynamicConfigsDir") - } - } else { - // something else happened, we log the error and don't start the watcher - log.Error().Err(err).Msg("failed to read DynamicConfigsDir, watcher will not be started") - return - } - } - - configHandler := newConfigFileHandler(options) - if err := configHandler.Watch(); err != nil { - log.Error().Err(err).Msg("failed creating watcher") - } -} - -// In Lieu of a proper DI framework, this function wires up the Application manually. -// This is in core/startup rather than core/state.go to keep package references clean! -func createApplication(appConfig *config.ApplicationConfig) *core.Application { - app := &core.Application{ - ApplicationConfig: appConfig, - BackendConfigLoader: config.NewBackendConfigLoader(appConfig.ModelPath), - ModelLoader: model.NewModelLoader(appConfig.ModelPath), - } - - var err error - - // app.EmbeddingsBackendService = backend.NewEmbeddingsBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) - // app.ImageGenerationBackendService = backend.NewImageGenerationBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) - // app.LLMBackendService = backend.NewLLMBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) - // app.TranscriptionBackendService = backend.NewTranscriptionBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) - // app.TextToSpeechBackendService = backend.NewTextToSpeechBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) - - app.BackendMonitorService = services.NewBackendMonitorService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) - app.GalleryService = services.NewGalleryService(app.ApplicationConfig) - // app.OpenAIService = services.NewOpenAIService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig, app.LLMBackendService) - - app.LocalAIMetricsService, err = services.NewLocalAIMetricsService() - if err != nil { - log.Error().Err(err).Msg("encountered an error initializing metrics service, startup will continue but metrics will not be tracked.") - } - - return app -} +package startup + +import ( + "fmt" + "os" + + "github.com/mudler/LocalAI/core" + "github.com/mudler/LocalAI/core/backend" + "github.com/mudler/LocalAI/core/config" + "github.com/mudler/LocalAI/core/services" + "github.com/mudler/LocalAI/internal" + 
"github.com/mudler/LocalAI/pkg/assets" + "github.com/mudler/LocalAI/pkg/library" + "github.com/mudler/LocalAI/pkg/model" + pkgStartup "github.com/mudler/LocalAI/pkg/startup" + "github.com/mudler/LocalAI/pkg/xsysinfo" + "github.com/rs/zerolog/log" +) + +func Startup(opts ...config.AppOption) (*config.BackendConfigLoader, *model.ModelLoader, *config.ApplicationConfig, error) { + options := config.NewApplicationConfig(opts...) + + log.Info().Msgf("Starting LocalAI using %d threads, with models path: %s", options.Threads, options.ModelPath) + log.Info().Msgf("LocalAI version: %s", internal.PrintableVersion()) + caps, err := xsysinfo.CPUCapabilities() + if err == nil { + log.Debug().Msgf("CPU capabilities: %v", caps) + } + gpus, err := xsysinfo.GPUs() + if err == nil { + log.Debug().Msgf("GPU count: %d", len(gpus)) + for _, gpu := range gpus { + log.Debug().Msgf("GPU: %s", gpu.String()) + } + } + + // Make sure directories exists + if options.ModelPath == "" { + return nil, nil, nil, fmt.Errorf("options.ModelPath cannot be empty") + } + err = os.MkdirAll(options.ModelPath, 0750) + if err != nil { + return nil, nil, nil, fmt.Errorf("unable to create ModelPath: %q", err) + } + if options.ImageDir != "" { + err := os.MkdirAll(options.ImageDir, 0750) + if err != nil { + return nil, nil, nil, fmt.Errorf("unable to create ImageDir: %q", err) + } + } + if options.AudioDir != "" { + err := os.MkdirAll(options.AudioDir, 0750) + if err != nil { + return nil, nil, nil, fmt.Errorf("unable to create AudioDir: %q", err) + } + } + if options.UploadDir != "" { + err := os.MkdirAll(options.UploadDir, 0750) + if err != nil { + return nil, nil, nil, fmt.Errorf("unable to create UploadDir: %q", err) + } + } + + if err := pkgStartup.InstallModels(options.Galleries, options.ModelLibraryURL, options.ModelPath, options.EnforcePredownloadScans, nil, options.ModelsURL...); err != nil { + log.Error().Err(err).Msg("error installing models") + } + + cl := config.NewBackendConfigLoader(options.ModelPath) + ml := model.NewModelLoader(options.ModelPath) + + configLoaderOpts := options.ToConfigLoaderOptions() + + if err := cl.LoadBackendConfigsFromPath(options.ModelPath, configLoaderOpts...); err != nil { + log.Error().Err(err).Msg("error loading config files") + } + + if options.ConfigFile != "" { + if err := cl.LoadMultipleBackendConfigsSingleFile(options.ConfigFile, configLoaderOpts...); err != nil { + log.Error().Err(err).Msg("error loading config file") + } + } + + if err := cl.Preload(options.ModelPath); err != nil { + log.Error().Err(err).Msg("error downloading models") + } + + if options.PreloadJSONModels != "" { + if err := services.ApplyGalleryFromString(options.ModelPath, options.PreloadJSONModels, options.EnforcePredownloadScans, options.Galleries); err != nil { + return nil, nil, nil, err + } + } + + if options.PreloadModelsFromPath != "" { + if err := services.ApplyGalleryFromFile(options.ModelPath, options.PreloadModelsFromPath, options.EnforcePredownloadScans, options.Galleries); err != nil { + return nil, nil, nil, err + } + } + + if options.Debug { + for _, v := range cl.GetAllBackendConfigs() { + log.Debug().Msgf("Model: %s (config: %+v)", v.Name, v) + } + } + + if options.AssetsDestination != "" { + // Extract files from the embedded FS + err := assets.ExtractFiles(options.BackendAssets, options.AssetsDestination) + log.Debug().Msgf("Extracting backend assets files to %s", options.AssetsDestination) + if err != nil { + log.Warn().Msgf("Failed extracting backend assets files: %s (might be required for some 
backends to work properly)", err) + } + } + + if options.LibPath != "" { + // If there is a lib directory, set LD_LIBRARY_PATH to include it + err := library.LoadExternal(options.LibPath) + if err != nil { + log.Error().Err(err).Str("LibPath", options.LibPath).Msg("Error while loading external libraries") + } + } + + // turn off any process that was started by GRPC if the context is canceled + go func() { + <-options.Context.Done() + log.Debug().Msgf("Context canceled, shutting down") + err := ml.StopAllGRPC() + if err != nil { + log.Error().Err(err).Msg("error while stopping all grpc backends") + } + }() + + if options.WatchDog { + wd := model.NewWatchDog( + ml, + options.WatchDogBusyTimeout, + options.WatchDogIdleTimeout, + options.WatchDogBusy, + options.WatchDogIdle) + ml.SetWatchDog(wd) + go wd.Run() + go func() { + <-options.Context.Done() + log.Debug().Msgf("Context canceled, shutting down") + wd.Shutdown() + }() + } + + if options.LoadToMemory != nil { + for _, m := range options.LoadToMemory { + cfg, err := cl.LoadBackendConfigFileByName(m, options.ModelPath, + config.LoadOptionDebug(options.Debug), + config.LoadOptionThreads(options.Threads), + config.LoadOptionContextSize(options.ContextSize), + config.LoadOptionF16(options.F16), + config.ModelPath(options.ModelPath), + ) + if err != nil { + return nil, nil, nil, err + } + + log.Debug().Msgf("Auto loading model %s into memory from file: %s", m, cfg.Model) + + grpcOpts := backend.GRPCModelOpts(*cfg) + o := []model.Option{ + model.WithModel(cfg.Model), + model.WithAssetDir(options.AssetsDestination), + model.WithThreads(uint32(options.Threads)), + model.WithLoadGRPCLoadModelOpts(grpcOpts), + } + + var backendErr error + if cfg.Backend != "" { + o = append(o, model.WithBackendString(cfg.Backend)) + _, backendErr = ml.BackendLoader(o...) + } else { + _, backendErr = ml.GreedyLoader(o...) + } + if backendErr != nil { + return nil, nil, nil, err + } + } + } + + // Watch the configuration directory + startWatcher(options) + + log.Info().Msg("core/startup process completed!") + return cl, ml, options, nil +} + +func startWatcher(options *config.ApplicationConfig) { + if options.DynamicConfigsDir == "" { + // No need to start the watcher if the directory is not set + return + } + + if _, err := os.Stat(options.DynamicConfigsDir); err != nil { + if os.IsNotExist(err) { + // We try to create the directory if it does not exist and was specified + if err := os.MkdirAll(options.DynamicConfigsDir, 0700); err != nil { + log.Error().Err(err).Msg("failed creating DynamicConfigsDir") + } + } else { + // something else happened, we log the error and don't start the watcher + log.Error().Err(err).Msg("failed to read DynamicConfigsDir, watcher will not be started") + return + } + } + + configHandler := newConfigFileHandler(options) + if err := configHandler.Watch(); err != nil { + log.Error().Err(err).Msg("failed creating watcher") + } +} + +// In Lieu of a proper DI framework, this function wires up the Application manually. +// This is in core/startup rather than core/state.go to keep package references clean! 
+func createApplication(appConfig *config.ApplicationConfig) *core.Application { + app := &core.Application{ + ApplicationConfig: appConfig, + BackendConfigLoader: config.NewBackendConfigLoader(appConfig.ModelPath), + ModelLoader: model.NewModelLoader(appConfig.ModelPath), + } + + var err error + + // app.EmbeddingsBackendService = backend.NewEmbeddingsBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) + // app.ImageGenerationBackendService = backend.NewImageGenerationBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) + // app.LLMBackendService = backend.NewLLMBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) + // app.TranscriptionBackendService = backend.NewTranscriptionBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) + // app.TextToSpeechBackendService = backend.NewTextToSpeechBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) + + app.BackendMonitorService = services.NewBackendMonitorService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) + app.GalleryService = services.NewGalleryService(app.ApplicationConfig) + // app.OpenAIService = services.NewOpenAIService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig, app.LLMBackendService) + + app.LocalAIMetricsService, err = services.NewLocalAIMetricsService() + if err != nil { + log.Error().Err(err).Msg("encountered an error initializing metrics service, startup will continue but metrics will not be tracked.") + } + + return app +} From 9bd7f3f995c6a9c9d8e4cab49cb1970a70629efc Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 22 Sep 2024 10:04:20 +0200 Subject: [PATCH 061/122] feat(coqui): switch to maintained community fork (#3625) Fixes: https://github.com/mudler/LocalAI/issues/2513 Signed-off-by: Ettore Di Giacinto --- backend/python/coqui/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/coqui/requirements.txt b/backend/python/coqui/requirements.txt index d7708363..2a91f2b9 100644 --- a/backend/python/coqui/requirements.txt +++ b/backend/python/coqui/requirements.txt @@ -1,4 +1,4 @@ -TTS==0.22.0 +coqui-tts grpcio==1.66.1 protobuf certifi \ No newline at end of file From 56f4deb938ee045b2df3b517b7e25c28df252ef5 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 22 Sep 2024 15:19:38 +0200 Subject: [PATCH 062/122] chore(ci): split hipblas jobs Signed-off-by: Ettore Di Giacinto --- .github/workflows/image.yml | 115 ++++++++++++++++++++++-------------- 1 file changed, 72 insertions(+), 43 deletions(-) diff --git a/.github/workflows/image.yml b/.github/workflows/image.yml index 395d7761..f57cf770 100644 --- a/.github/workflows/image.yml +++ b/.github/workflows/image.yml @@ -13,6 +13,78 @@ concurrency: cancel-in-progress: true jobs: + hipblas-jobs: + uses: ./.github/workflows/image_build.yml + with: + tag-latest: ${{ matrix.tag-latest }} + tag-suffix: ${{ matrix.tag-suffix }} + ffmpeg: ${{ matrix.ffmpeg }} + image-type: ${{ matrix.image-type }} + build-type: ${{ matrix.build-type }} + cuda-major-version: ${{ matrix.cuda-major-version }} + cuda-minor-version: ${{ matrix.cuda-minor-version }} + platforms: ${{ matrix.platforms }} + runs-on: ${{ matrix.runs-on }} + base-image: ${{ matrix.base-image }} + grpc-base-image: ${{ matrix.grpc-base-image }} + aio: ${{ matrix.aio }} + makeflags: ${{ matrix.makeflags }} + latest-image: ${{ matrix.latest-image }} + latest-image-aio: ${{ matrix.latest-image-aio }} + 
secrets: + dockerUsername: ${{ secrets.DOCKERHUB_USERNAME }} + dockerPassword: ${{ secrets.DOCKERHUB_PASSWORD }} + quayUsername: ${{ secrets.LOCALAI_REGISTRY_USERNAME }} + quayPassword: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }} + strategy: + # Pushing with all jobs in parallel + # eats the bandwidth of all the nodes + max-parallel: 1 + matrix: + include: + - build-type: 'hipblas' + platforms: 'linux/amd64' + tag-latest: 'auto' + tag-suffix: '-hipblas-ffmpeg' + ffmpeg: 'true' + image-type: 'extras' + aio: "-aio-gpu-hipblas" + base-image: "rocm/dev-ubuntu-22.04:6.1" + grpc-base-image: "ubuntu:22.04" + latest-image: 'latest-gpu-hipblas' + latest-image-aio: 'latest-aio-gpu-hipblas' + runs-on: 'arc-runner-set' + makeflags: "--jobs=3 --output-sync=target" + - build-type: 'hipblas' + platforms: 'linux/amd64' + tag-latest: 'false' + tag-suffix: '-hipblas' + ffmpeg: 'false' + image-type: 'extras' + base-image: "rocm/dev-ubuntu-22.04:6.1" + grpc-base-image: "ubuntu:22.04" + runs-on: 'arc-runner-set' + makeflags: "--jobs=3 --output-sync=target" + - build-type: 'hipblas' + platforms: 'linux/amd64' + tag-latest: 'false' + tag-suffix: '-hipblas-ffmpeg-core' + ffmpeg: 'true' + image-type: 'core' + base-image: "rocm/dev-ubuntu-22.04:6.1" + grpc-base-image: "ubuntu:22.04" + runs-on: 'arc-runner-set' + makeflags: "--jobs=3 --output-sync=target" + - build-type: 'hipblas' + platforms: 'linux/amd64' + tag-latest: 'false' + tag-suffix: '-hipblas-core' + ffmpeg: 'false' + image-type: 'core' + base-image: "rocm/dev-ubuntu-22.04:6.1" + grpc-base-image: "ubuntu:22.04" + runs-on: 'arc-runner-set' + makeflags: "--jobs=3 --output-sync=target" self-hosted-jobs: uses: ./.github/workflows/image_build.yml with: @@ -122,29 +194,6 @@ jobs: base-image: "ubuntu:22.04" runs-on: 'arc-runner-set' makeflags: "--jobs=3 --output-sync=target" - - build-type: 'hipblas' - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-hipblas-ffmpeg' - ffmpeg: 'true' - image-type: 'extras' - aio: "-aio-gpu-hipblas" - base-image: "rocm/dev-ubuntu-22.04:6.1" - grpc-base-image: "ubuntu:22.04" - latest-image: 'latest-gpu-hipblas' - latest-image-aio: 'latest-aio-gpu-hipblas' - runs-on: 'arc-runner-set' - makeflags: "--jobs=3 --output-sync=target" - - build-type: 'hipblas' - platforms: 'linux/amd64' - tag-latest: 'false' - tag-suffix: '-hipblas' - ffmpeg: 'false' - image-type: 'extras' - base-image: "rocm/dev-ubuntu-22.04:6.1" - grpc-base-image: "ubuntu:22.04" - runs-on: 'arc-runner-set' - makeflags: "--jobs=3 --output-sync=target" - build-type: 'sycl_f16' platforms: 'linux/amd64' tag-latest: 'auto' @@ -212,26 +261,6 @@ jobs: image-type: 'core' runs-on: 'arc-runner-set' makeflags: "--jobs=3 --output-sync=target" - - build-type: 'hipblas' - platforms: 'linux/amd64' - tag-latest: 'false' - tag-suffix: '-hipblas-ffmpeg-core' - ffmpeg: 'true' - image-type: 'core' - base-image: "rocm/dev-ubuntu-22.04:6.1" - grpc-base-image: "ubuntu:22.04" - runs-on: 'arc-runner-set' - makeflags: "--jobs=3 --output-sync=target" - - build-type: 'hipblas' - platforms: 'linux/amd64' - tag-latest: 'false' - tag-suffix: '-hipblas-core' - ffmpeg: 'false' - image-type: 'core' - base-image: "rocm/dev-ubuntu-22.04:6.1" - grpc-base-image: "ubuntu:22.04" - runs-on: 'arc-runner-set' - makeflags: "--jobs=3 --output-sync=target" core-image-build: uses: ./.github/workflows/image_build.yml From fd70a22196ffc430e286c14e65497dd22f9d3b63 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 22 Sep 2024 15:21:16 +0200 Subject: [PATCH 063/122] chore(ci): adjust parallel jobs 
Signed-off-by: Ettore Di Giacinto
---
 .github/workflows/image.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/image.yml b/.github/workflows/image.yml
index f57cf770..8709f05c 100644
--- a/.github/workflows/image.yml
+++ b/.github/workflows/image.yml
@@ -111,7 +111,7 @@ jobs:
     strategy:
       # Pushing with all jobs in parallel
       # eats the bandwidth of all the nodes
-      max-parallel: ${{ github.event_name != 'pull_request' && 6 || 10 }}
+      max-parallel: ${{ github.event_name != 'pull_request' && 5 || 8 }}
       matrix:
         include:
           # Extra images

From 4edd8c80b407ea415e4cbede6386f8d17efa8f8f Mon Sep 17 00:00:00 2001
From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com>
Date: Sun, 22 Sep 2024 23:41:34 +0200
Subject: [PATCH 064/122] chore: :arrow_up: Update ggerganov/llama.cpp to
 `c35e586ea57221844442c65a1172498c54971cb0` (#3629)

:arrow_up: Update ggerganov/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 51755e71..fe086645 100644
--- a/Makefile
+++ b/Makefile
@@ -8,7 +8,7 @@ DETECT_LIBS?=true
 # llama.cpp versions
 GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp
 GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be
-CPPLLAMA_VERSION?=d09770cae71b416c032ec143dda530f7413c4038
+CPPLLAMA_VERSION?=c35e586ea57221844442c65a1172498c54971cb0

 # go-rwkv version
 RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp

From 3e8e71f8b68f9ea843f57f5bebb9aad32700e0ac Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Mon, 23 Sep 2024 10:56:10 +0200
Subject: [PATCH 065/122] fix(ci): fixup checksum scanning pipeline (#3631)

Signed-off-by: Ettore Di Giacinto
---
 .github/check_and_update.py | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/.github/check_and_update.py b/.github/check_and_update.py
index dcf1d04a..704b658e 100644
--- a/.github/check_and_update.py
+++ b/.github/check_and_update.py
@@ -29,9 +29,14 @@ def calculate_sha256(file_path):
 def manual_safety_check_hf(repo_id):
     scanResponse = requests.get('https://huggingface.co/api/models/' + repo_id + "/scan")
     scan = scanResponse.json()
-    if scan['hasUnsafeFile']:
-        return scan
-    return None
+    # Check if 'hasUnsafeFile' exists in the response
+    if 'hasUnsafeFile' in scan:
+        if scan['hasUnsafeFile']:
+            return scan
+        else:
+            return None
+    else:
+        return None

 download_type, repo_id_or_url = parse_uri(uri)


From 51cba89682b40ae92737fa47ce6bdbce9ba8cac6 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Mon, 23 Sep 2024 11:49:07 +0200
Subject: [PATCH 066/122] fix(hipblas): do not push all variants to hipblas
 builds (#3630)

Like with CUDA builds, we don't need all the variants when we are
compiling against the accelerated variants - in this way we save space
and we avoid exceeding Go's embed.FS size limits.

Signed-off-by: Ettore Di Giacinto
---
 Dockerfile | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Dockerfile b/Dockerfile
index f08cb9a0..323c3d9a 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -297,10 +297,10 @@ COPY .git .
 RUN make prepare

 ## Build the binary
-## If it's CUDA, we want to skip some of the llama-compat backends to save space
-## We only leave the most CPU-optimized variant and the fallback for the cublas build
-## (both will use CUDA for the actual computation)
-RUN if [ "${BUILD_TYPE}" = "cublas" ]; then \
+## If it's CUDA or hipblas, we want to skip some of the llama-compat backends to save space
+## We only leave the most CPU-optimized variant and the fallback for the cublas/hipblas build
+## (both will use CUDA or hipblas for the actual computation)
+RUN if [ "${BUILD_TYPE}" = "cublas" ] || [ "${BUILD_TYPE}" = "hipblas" ]; then \
         SKIP_GRPC_BACKEND="backend-assets/grpc/llama-cpp-avx backend-assets/grpc/llama-cpp-avx2" make build; \
     else \
         make build; \

From bf8f8671d1b1daae8f1a1f446ab8f6366ddb4396 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Mon, 23 Sep 2024 19:04:36 +0200
Subject: [PATCH 067/122] chore(ci): adjust parallelism

Signed-off-by: Ettore Di Giacinto
---
 .github/workflows/image.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/image.yml b/.github/workflows/image.yml
index 8709f05c..6db8bb07 100644
--- a/.github/workflows/image.yml
+++ b/.github/workflows/image.yml
@@ -39,7 +39,7 @@ jobs:
     strategy:
       # Pushing with all jobs in parallel
       # eats the bandwidth of all the nodes
-      max-parallel: 1
+      max-parallel: 2
       matrix:
         include:
           - build-type: 'hipblas'

From 1da8d8b9db431a62756dd2976d00531b316b0dfa Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Mon, 23 Sep 2024 19:09:51 +0200
Subject: [PATCH 068/122] models(gallery): add nightygurps-14b-v1.1 (#3633)

Signed-off-by: Ettore Di Giacinto
---
 gallery/index.yaml | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/gallery/index.yaml b/gallery/index.yaml
index 7dab9eb7..1b84c403 100644
--- a/gallery/index.yaml
+++ b/gallery/index.yaml
@@ -654,6 +654,22 @@
   - filename: Llama3.1-8B-ShiningValiant2-Q4_K_M.gguf
     sha256: 9369eb97922a9f01e4eae610e3d7aaeca30762d78d9239884179451d60bdbdd2
     uri: huggingface://bartowski/Llama3.1-8B-ShiningValiant2-GGUF/Llama3.1-8B-ShiningValiant2-Q4_K_M.gguf
+- !!merge <<: *llama31
+  name: "nightygurps-14b-v1.1"
+  icon: https://cdn-uploads.huggingface.co/production/uploads/6336c5b3e3ac69e6a90581da/FvfjK7bKqsWdaBkB3eWgP.png
+  urls:
+  - https://huggingface.co/AlexBefest/NightyGurps-14b-v1.1
+  - https://huggingface.co/bartowski/NightyGurps-14b-v1.1-GGUF
+  description: |
+    This model works only in Russian.
+    It is designed to run GURPS roleplaying games, as well as to consult and assist. It was trained on an augmented dataset of the GURPS Basic Set rulebook. Its primary purpose was to serve as a consultant and assistant Game Master for the GURPS roleplaying system, but it can also be used as a GM for running solo games as a player.
+ overrides: + parameters: + model: NightyGurps-14b-v1.1-Q4_K_M.gguf + files: + - filename: NightyGurps-14b-v1.1-Q4_K_M.gguf + sha256: d09d53259ad2c0298150fa8c2db98fe42f11731af89fdc80ad0e255a19adc4b0 + uri: huggingface://bartowski/NightyGurps-14b-v1.1-GGUF/NightyGurps-14b-v1.1-Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "humanish-roleplay-llama-3.1-8b-i1" From 26d99ed1c714652ae118e27768273b5b98e7bbf4 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 23 Sep 2024 19:12:54 +0200 Subject: [PATCH 069/122] models(gallery): add gemma-2-9b-arliai-rpmax-v1.1 (#3634) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 1b84c403..bddd6b16 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -2109,6 +2109,20 @@ - filename: Buddy-2B-v1-Q4_K_M.gguf sha256: 9bd25ed907d1a3c2e07fe09399a9b3aec107d368c29896e2c46facede5b7e3d5 uri: huggingface://bartowski/Buddy-2B-v1-GGUF/Buddy-2B-v1-Q4_K_M.gguf +- !!merge <<: *gemma + name: "gemma-2-9b-arliai-rpmax-v1.1" + urls: + - https://huggingface.co/ArliAI/Gemma-2-9B-ArliAI-RPMax-v1.1 + - https://huggingface.co/bartowski/Gemma-2-9B-ArliAI-RPMax-v1.1-GGUF + description: | + RPMax is a series of models that are trained on a diverse set of curated creative writing and RP datasets with a focus on variety and deduplication. This model is designed to be highly creative and non-repetitive by making sure no two entries in the dataset have repeated characters or situations, which makes sure the model does not latch on to a certain personality and be capable of understanding and acting appropriately to any characters or situations. + overrides: + parameters: + model: Gemma-2-9B-ArliAI-RPMax-v1.1-Q4_K_M.gguf + files: + - filename: Gemma-2-9B-ArliAI-RPMax-v1.1-Q4_K_M.gguf + sha256: 1724aff0ad6f71bf4371d839aca55578f7ec6f030d8d25c0254126088e4c6250 + uri: huggingface://bartowski/Gemma-2-9B-ArliAI-RPMax-v1.1-GGUF/Gemma-2-9B-ArliAI-RPMax-v1.1-Q4_K_M.gguf - &llama3 url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master" icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png From e332ff80660fd3f23ecf67acd2807d22c9cafc85 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 23 Sep 2024 19:16:41 +0200 Subject: [PATCH 070/122] models(gallery): add gemma-2-2b-arliai-rpmax-v1.1 (#3635) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index bddd6b16..f75e448c 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -2123,6 +2123,19 @@ - filename: Gemma-2-9B-ArliAI-RPMax-v1.1-Q4_K_M.gguf sha256: 1724aff0ad6f71bf4371d839aca55578f7ec6f030d8d25c0254126088e4c6250 uri: huggingface://bartowski/Gemma-2-9B-ArliAI-RPMax-v1.1-GGUF/Gemma-2-9B-ArliAI-RPMax-v1.1-Q4_K_M.gguf +- !!merge <<: *gemma + name: "gemma-2-2b-arliai-rpmax-v1.1" + urls: + - https://huggingface.co/bartowski/Gemma-2-2B-ArliAI-RPMax-v1.1-GGUF + description: | + RPMax is a series of models that are trained on a diverse set of curated creative writing and RP datasets with a focus on variety and deduplication. 
This model is designed to be highly creative and non-repetitive by making sure no two entries in the dataset have repeated characters or situations, which makes sure the model does not latch on to a certain personality and be capable of understanding and acting appropriately to any characters or situations. + overrides: + parameters: + model: Gemma-2-2B-ArliAI-RPMax-v1.1-Q4_K_M.gguf + files: + - filename: Gemma-2-2B-ArliAI-RPMax-v1.1-Q4_K_M.gguf + sha256: 89fe35345754d7e9de8d0c0d5bf35b2be9b12a09811b365b712b8b27112f7712 + uri: huggingface://bartowski/Gemma-2-2B-ArliAI-RPMax-v1.1-GGUF/Gemma-2-2B-ArliAI-RPMax-v1.1-Q4_K_M.gguf - &llama3 url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master" icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png From bbdf78615e72a8dfd5e80b9e1db1c804741fb4e5 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 23 Sep 2024 19:24:14 +0200 Subject: [PATCH 071/122] models(gallery): add acolyte-22b-i1 (#3636) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index f75e448c..9b8a0220 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1523,6 +1523,21 @@ - filename: Pantheon-RP-1.6-12b-Nemo-Q4_K_M.gguf sha256: cf3465c183bf4ecbccd1b6b480f687e0160475b04c87e2f1e5ebc8baa0f4c7aa uri: huggingface://bartowski/Pantheon-RP-1.6-12b-Nemo-GGUF/Pantheon-RP-1.6-12b-Nemo-Q4_K_M.gguf +- !!merge <<: *mistral03 + name: "acolyte-22b-i1" + icon: https://cdn-uploads.huggingface.co/production/uploads/6569a4ed2419be6072890cf8/3dcGMcrWK2-2vQh9QBt3o.png + urls: + - https://huggingface.co/rAIfle/Acolyte-22B + - https://huggingface.co/mradermacher/Acolyte-22B-i1-GGUF + description: | + LoRA of a bunch of random datasets on top of Mistral-Small-Instruct-2409, then SLERPed onto base at 0.5. Decent enough for its size. Check the LoRA for dataset info. + overrides: + parameters: + model: Acolyte-22B.i1-Q4_K_M.gguf + files: + - filename: Acolyte-22B.i1-Q4_K_M.gguf + sha256: 5a454405b98b6f886e8e4c695488d8ea098162bb8c46f2a7723fc2553c6e2f6e + uri: huggingface://mradermacher/Acolyte-22B-i1-GGUF/Acolyte-22B.i1-Q4_K_M.gguf - !!merge <<: *mistral03 name: "mn-12b-lyra-v4-iq-imatrix" icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/dVoru83WOpwVjMlgZ_xhA.png From 043cb94436ab44c30f160cc68423aa8915ec800f Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 23 Sep 2024 21:23:21 +0000 Subject: [PATCH 072/122] chore(deps): Bump yarl from 1.11.0 to 1.11.1 in /examples/langchain/langchainpy-localai-example (#3643) chore(deps): Bump yarl Bumps [yarl](https://github.com/aio-libs/yarl) from 1.11.0 to 1.11.1. - [Release notes](https://github.com/aio-libs/yarl/releases) - [Changelog](https://github.com/aio-libs/yarl/blob/master/CHANGES.rst) - [Commits](https://github.com/aio-libs/yarl/compare/v1.11.0...v1.11.1) --- updated-dependencies: - dependency-name: yarl dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 98325db3..3e4133ca 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -30,4 +30,4 @@ tqdm==4.66.5 typing-inspect==0.9.0 typing_extensions==4.12.2 urllib3==2.2.2 -yarl==1.11.0 +yarl==1.11.1 From cc6fac1688e5f700baa1d460106861abc7a1d2f4 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 24 Sep 2024 01:16:39 +0000 Subject: [PATCH 073/122] chore(deps): Bump urllib3 from 2.2.2 to 2.2.3 in /examples/langchain/langchainpy-localai-example (#3646) chore(deps): Bump urllib3 Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.2.2 to 2.2.3. - [Release notes](https://github.com/urllib3/urllib3/releases) - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst) - [Commits](https://github.com/urllib3/urllib3/compare/2.2.2...2.2.3) --- updated-dependencies: - dependency-name: urllib3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 3e4133ca..675429a3 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -29,5 +29,5 @@ tenacity==8.5.0 tqdm==4.66.5 typing-inspect==0.9.0 typing_extensions==4.12.2 -urllib3==2.2.2 +urllib3==2.2.3 yarl==1.11.1 From b8e129f2a6541a23f9c0b595ba12daa7e41a5a18 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 24 Sep 2024 02:53:35 +0000 Subject: [PATCH 074/122] chore(deps): Bump idna from 3.8 to 3.10 in /examples/langchain/langchainpy-localai-example (#3644) chore(deps): Bump idna Bumps [idna](https://github.com/kjd/idna) from 3.8 to 3.10. - [Release notes](https://github.com/kjd/idna/releases) - [Changelog](https://github.com/kjd/idna/blob/master/HISTORY.rst) - [Commits](https://github.com/kjd/idna/compare/v3.8...v3.10) --- updated-dependencies: - dependency-name: idna dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 675429a3..64a43bea 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -9,7 +9,7 @@ dataclasses-json==0.6.7 debugpy==1.8.2 frozenlist==1.4.1 greenlet==3.1.0 -idna==3.8 +idna==3.10 langchain==0.3.0 langchain-community==0.2.16 marshmallow==3.22.0 From c1752cbb831fe9ccb3dd113202884d4f670afbb7 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 24 Sep 2024 04:30:05 +0000 Subject: [PATCH 075/122] chore(deps): Bump sqlalchemy from 2.0.32 to 2.0.35 in /examples/langchain/langchainpy-localai-example (#3649) chore(deps): Bump sqlalchemy Bumps [sqlalchemy](https://github.com/sqlalchemy/sqlalchemy) from 2.0.32 to 2.0.35. - [Release notes](https://github.com/sqlalchemy/sqlalchemy/releases) - [Changelog](https://github.com/sqlalchemy/sqlalchemy/blob/main/CHANGES.rst) - [Commits](https://github.com/sqlalchemy/sqlalchemy/commits) --- updated-dependencies: - dependency-name: sqlalchemy dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 64a43bea..ac147410 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -24,7 +24,7 @@ packaging>=23.2 pydantic==2.8.2 PyYAML==6.0.2 requests==2.32.3 -SQLAlchemy==2.0.32 +SQLAlchemy==2.0.35 tenacity==8.5.0 tqdm==4.66.5 typing-inspect==0.9.0 From 69d2902b0a6e7647e16092118d73f779d80f266e Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Tue, 24 Sep 2024 09:31:28 +0200 Subject: [PATCH 076/122] chore: :arrow_up: Update ggerganov/llama.cpp to `f0c7b5edf82aa200656fd88c11ae3a805d7130bf` (#3653) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index fe086645..578656e5 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=c35e586ea57221844442c65a1172498c54971cb0 +CPPLLAMA_VERSION?=f0c7b5edf82aa200656fd88c11ae3a805d7130bf # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 90cacb9692f3dc374766b0e32f75be8229a47db3 Mon Sep 17 00:00:00 2001 From: Dave Date: Tue, 24 Sep 2024 03:32:48 -0400 Subject: [PATCH 077/122] test: preliminary tests and merge fix for authv2 (#3584) * add api key to existing app tests, add preliminary auth test Signed-off-by: Dave Lee * small fix, run 
test Signed-off-by: Dave Lee * status on non-opaque Signed-off-by: Dave Lee * tweak auth error Signed-off-by: Dave Lee * exp Signed-off-by: Dave Lee * quick fix on real laptop Signed-off-by: Dave Lee * add downloader version that allows providing an auth header Signed-off-by: Dave Lee * stash some devcontainer fixes during testing Signed-off-by: Dave Lee * s2 Signed-off-by: Dave Lee * s Signed-off-by: Dave Lee * done with experiment Signed-off-by: Dave Lee * done with experiment Signed-off-by: Dave Lee * after merge fix Signed-off-by: Dave Lee * rename and fix Signed-off-by: Dave Lee --------- Signed-off-by: Dave Lee Co-authored-by: Ettore Di Giacinto --- .devcontainer-scripts/utils.sh | 2 + Dockerfile | 5 +-- Makefile | 3 ++ core/gallery/gallery.go | 4 +- core/gallery/models.go | 2 +- core/http/app.go | 18 --------- core/http/app_test.go | 69 ++++++++++++++++++++++++++++++---- core/http/middleware/auth.go | 3 +- embedded/embedded.go | 2 +- go.mod | 4 +- pkg/downloader/uri.go | 18 +++++++-- pkg/downloader/uri_test.go | 6 +-- 12 files changed, 95 insertions(+), 41 deletions(-) diff --git a/.devcontainer-scripts/utils.sh b/.devcontainer-scripts/utils.sh index 98ac063c..8416d43d 100644 --- a/.devcontainer-scripts/utils.sh +++ b/.devcontainer-scripts/utils.sh @@ -9,6 +9,7 @@ # Param 2: email # config_user() { + echo "Configuring git for $1 <$2>" local gcn=$(git config --global user.name) if [ -z "${gcn}" ]; then echo "Setting up git user / remote" @@ -24,6 +25,7 @@ config_user() { # Param 2: remote url # config_remote() { + echo "Adding git remote and fetching $2 as $1" local gr=$(git remote -v | grep $1) if [ -z "${gr}" ]; then git remote add $1 $2 diff --git a/Dockerfile b/Dockerfile index 323c3d9a..8c657469 100644 --- a/Dockerfile +++ b/Dockerfile @@ -338,9 +338,8 @@ RUN if [ "${FFMPEG}" = "true" ]; then \ RUN apt-get update && \ apt-get install -y --no-install-recommends \ - ssh less && \ - apt-get clean && \ - rm -rf /var/lib/apt/lists/* + ssh less wget +# For the devcontainer, leave apt functional in case additional devtools are needed at runtime. 
RUN go install github.com/go-delve/delve/cmd/dlv@latest diff --git a/Makefile b/Makefile index 578656e5..7523d5ff 100644 --- a/Makefile +++ b/Makefile @@ -359,6 +359,9 @@ clean-tests: rm -rf test-dir rm -rf core/http/backend-assets +clean-dc: clean + cp -r /build/backend-assets /workspace/backend-assets + ## Build: build: prepare backend-assets grpcs ## Build the project $(info ${GREEN}I local-ai build info:${RESET}) diff --git a/core/gallery/gallery.go b/core/gallery/gallery.go index 6ced6244..3a60e618 100644 --- a/core/gallery/gallery.go +++ b/core/gallery/gallery.go @@ -132,7 +132,7 @@ func AvailableGalleryModels(galleries []config.Gallery, basePath string) ([]*Gal func findGalleryURLFromReferenceURL(url string, basePath string) (string, error) { var refFile string uri := downloader.URI(url) - err := uri.DownloadAndUnmarshal(basePath, func(url string, d []byte) error { + err := uri.DownloadWithCallback(basePath, func(url string, d []byte) error { refFile = string(d) if len(refFile) == 0 { return fmt.Errorf("invalid reference file at url %s: %s", url, d) @@ -156,7 +156,7 @@ func getGalleryModels(gallery config.Gallery, basePath string) ([]*GalleryModel, } uri := downloader.URI(gallery.URL) - err := uri.DownloadAndUnmarshal(basePath, func(url string, d []byte) error { + err := uri.DownloadWithCallback(basePath, func(url string, d []byte) error { return yaml.Unmarshal(d, &models) }) if err != nil { diff --git a/core/gallery/models.go b/core/gallery/models.go index dec6312e..58f1963a 100644 --- a/core/gallery/models.go +++ b/core/gallery/models.go @@ -69,7 +69,7 @@ type PromptTemplate struct { func GetGalleryConfigFromURL(url string, basePath string) (Config, error) { var config Config uri := downloader.URI(url) - err := uri.DownloadAndUnmarshal(basePath, func(url string, d []byte) error { + err := uri.DownloadWithCallback(basePath, func(url string, d []byte) error { return yaml.Unmarshal(d, &config) }) if err != nil { diff --git a/core/http/app.go b/core/http/app.go index fa9cd866..23e97f18 100644 --- a/core/http/app.go +++ b/core/http/app.go @@ -31,24 +31,6 @@ import ( "github.com/rs/zerolog/log" ) -func readAuthHeader(c *fiber.Ctx) string { - authHeader := c.Get("Authorization") - - // elevenlabs - xApiKey := c.Get("xi-api-key") - if xApiKey != "" { - authHeader = "Bearer " + xApiKey - } - - // anthropic - xApiKey = c.Get("x-api-key") - if xApiKey != "" { - authHeader = "Bearer " + xApiKey - } - - return authHeader -} - // Embed a directory // //go:embed static/* diff --git a/core/http/app_test.go b/core/http/app_test.go index 86fe7fdd..bbe52c34 100644 --- a/core/http/app_test.go +++ b/core/http/app_test.go @@ -31,6 +31,9 @@ import ( "github.com/sashabaranov/go-openai/jsonschema" ) +const apiKey = "joshua" +const bearerKey = "Bearer " + apiKey + const testPrompt = `### System: You are an AI assistant that follows instruction extremely well. Help as much as you can. 
@@ -50,11 +53,19 @@ type modelApplyRequest struct { func getModelStatus(url string) (response map[string]interface{}) { // Create the HTTP request - resp, err := http.Get(url) + req, err := http.NewRequest("GET", url, nil) + req.Header.Set("Content-Type", "application/json") + req.Header.Set("Authorization", bearerKey) if err != nil { fmt.Println("Error creating request:", err) return } + client := &http.Client{} + resp, err := client.Do(req) + if err != nil { + fmt.Println("Error sending request:", err) + return + } defer resp.Body.Close() body, err := io.ReadAll(resp.Body) @@ -72,14 +83,15 @@ func getModelStatus(url string) (response map[string]interface{}) { return } -func getModels(url string) (response []gallery.GalleryModel) { +func getModels(url string) ([]gallery.GalleryModel, error) { + response := []gallery.GalleryModel{} uri := downloader.URI(url) // TODO: No tests currently seem to exercise file:// urls. Fix? - uri.DownloadAndUnmarshal("", func(url string, i []byte) error { + err := uri.DownloadWithAuthorizationAndCallback("", bearerKey, func(url string, i []byte) error { // Unmarshal YAML data into a struct return json.Unmarshal(i, &response) }) - return + return response, err } func postModelApplyRequest(url string, request modelApplyRequest) (response map[string]interface{}) { @@ -101,6 +113,7 @@ func postModelApplyRequest(url string, request modelApplyRequest) (response map[ return } req.Header.Set("Content-Type", "application/json") + req.Header.Set("Authorization", bearerKey) // Make the request client := &http.Client{} @@ -140,6 +153,7 @@ func postRequestJSON[B any](url string, bodyJson *B) error { } req.Header.Set("Content-Type", "application/json") + req.Header.Set("Authorization", bearerKey) client := &http.Client{} resp, err := client.Do(req) @@ -175,6 +189,7 @@ func postRequestResponseJSON[B1 any, B2 any](url string, reqJson *B1, respJson * } req.Header.Set("Content-Type", "application/json") + req.Header.Set("Authorization", bearerKey) client := &http.Client{} resp, err := client.Do(req) @@ -195,6 +210,35 @@ func postRequestResponseJSON[B1 any, B2 any](url string, reqJson *B1, respJson * return json.Unmarshal(body, respJson) } +func postInvalidRequest(url string) (error, int) { + + req, err := http.NewRequest("POST", url, bytes.NewBufferString("invalid request")) + if err != nil { + return err, -1 + } + + req.Header.Set("Content-Type", "application/json") + + client := &http.Client{} + resp, err := client.Do(req) + if err != nil { + return err, -1 + } + + defer resp.Body.Close() + + body, err := io.ReadAll(resp.Body) + if err != nil { + return err, -1 + } + + if resp.StatusCode < 200 || resp.StatusCode >= 400 { + return fmt.Errorf("unexpected status code: %d, body: %s", resp.StatusCode, string(body)), resp.StatusCode + } + + return nil, resp.StatusCode +} + //go:embed backend-assets/* var backendAssets embed.FS @@ -260,6 +304,7 @@ var _ = Describe("API test", func() { config.WithContext(c), config.WithGalleries(galleries), config.WithModelPath(modelDir), + config.WithApiKeys([]string{apiKey}), config.WithBackendAssets(backendAssets), config.WithBackendAssetsOutput(backendAssetsDir))...) 
Expect(err).ToNot(HaveOccurred()) @@ -269,7 +314,7 @@ var _ = Describe("API test", func() { go app.Listen("127.0.0.1:9090") - defaultConfig := openai.DefaultConfig("") + defaultConfig := openai.DefaultConfig(apiKey) defaultConfig.BaseURL = "http://127.0.0.1:9090/v1" client2 = openaigo.NewClient("") @@ -295,10 +340,19 @@ var _ = Describe("API test", func() { Expect(err).To(HaveOccurred()) }) + Context("Auth Tests", func() { + It("Should fail if the api key is missing", func() { + err, sc := postInvalidRequest("http://127.0.0.1:9090/models/available") + Expect(err).ToNot(BeNil()) + Expect(sc).To(Equal(403)) + }) + }) + Context("Applying models", func() { It("applies models from a gallery", func() { - models := getModels("http://127.0.0.1:9090/models/available") + models, err := getModels("http://127.0.0.1:9090/models/available") + Expect(err).To(BeNil()) Expect(len(models)).To(Equal(2), fmt.Sprint(models)) Expect(models[0].Installed).To(BeFalse(), fmt.Sprint(models)) Expect(models[1].Installed).To(BeFalse(), fmt.Sprint(models)) @@ -331,7 +385,8 @@ var _ = Describe("API test", func() { Expect(content["backend"]).To(Equal("bert-embeddings")) Expect(content["foo"]).To(Equal("bar")) - models = getModels("http://127.0.0.1:9090/models/available") + models, err = getModels("http://127.0.0.1:9090/models/available") + Expect(err).To(BeNil()) Expect(len(models)).To(Equal(2), fmt.Sprint(models)) Expect(models[0].Name).To(Or(Equal("bert"), Equal("bert2"))) Expect(models[1].Name).To(Or(Equal("bert"), Equal("bert2"))) diff --git a/core/http/middleware/auth.go b/core/http/middleware/auth.go index bc8bcf80..d2152e9b 100644 --- a/core/http/middleware/auth.go +++ b/core/http/middleware/auth.go @@ -38,6 +38,7 @@ func getApiKeyErrorHandler(applicationConfig *config.ApplicationConfig) fiber.Er if applicationConfig.OpaqueErrors { return ctx.SendStatus(403) } + return ctx.Status(403).SendString(err.Error()) } if applicationConfig.OpaqueErrors { return ctx.SendStatus(500) @@ -90,4 +91,4 @@ func getApiKeyRequiredFilterFunction(applicationConfig *config.ApplicationConfig } } return func(c *fiber.Ctx) bool { return false } -} \ No newline at end of file +} diff --git a/embedded/embedded.go b/embedded/embedded.go index 672c32ed..3a4ea262 100644 --- a/embedded/embedded.go +++ b/embedded/embedded.go @@ -39,7 +39,7 @@ func init() { func GetRemoteLibraryShorteners(url string, basePath string) (map[string]string, error) { remoteLibrary := map[string]string{} uri := downloader.URI(url) - err := uri.DownloadAndUnmarshal(basePath, func(_ string, i []byte) error { + err := uri.DownloadWithCallback(basePath, func(_ string, i []byte) error { return yaml.Unmarshal(i, &remoteLibrary) }) if err != nil { diff --git a/go.mod b/go.mod index a3359abf..dd8fce9f 100644 --- a/go.mod +++ b/go.mod @@ -1,8 +1,8 @@ module github.com/mudler/LocalAI -go 1.22.0 +go 1.23 -toolchain go1.22.4 +toolchain go1.23.1 require ( dario.cat/mergo v1.0.0 diff --git a/pkg/downloader/uri.go b/pkg/downloader/uri.go index 7fedd646..9acbb621 100644 --- a/pkg/downloader/uri.go +++ b/pkg/downloader/uri.go @@ -31,7 +31,11 @@ const ( type URI string -func (uri URI) DownloadAndUnmarshal(basePath string, f func(url string, i []byte) error) error { +func (uri URI) DownloadWithCallback(basePath string, f func(url string, i []byte) error) error { + return uri.DownloadWithAuthorizationAndCallback(basePath, "", f) +} + +func (uri URI) DownloadWithAuthorizationAndCallback(basePath string, authorization string, f func(url string, i []byte) error) error { url := 
uri.ResolveURL() if strings.HasPrefix(url, LocalPrefix) { @@ -41,7 +45,6 @@ func (uri URI) DownloadAndUnmarshal(basePath string, f func(url string, i []byte if err != nil { return err } - // ??? resolvedBasePath, err := filepath.EvalSymlinks(basePath) if err != nil { return err @@ -63,7 +66,16 @@ func (uri URI) DownloadAndUnmarshal(basePath string, f func(url string, i []byte } // Send a GET request to the URL - response, err := http.Get(url) + + req, err := http.NewRequest("GET", url, nil) + if err != nil { + return err + } + if authorization != "" { + req.Header.Add("Authorization", authorization) + } + + response, err := http.DefaultClient.Do(req) if err != nil { return err } diff --git a/pkg/downloader/uri_test.go b/pkg/downloader/uri_test.go index 21a093a9..3b7a80b3 100644 --- a/pkg/downloader/uri_test.go +++ b/pkg/downloader/uri_test.go @@ -11,7 +11,7 @@ var _ = Describe("Gallery API tests", func() { It("parses github with a branch", func() { uri := URI("github:go-skynet/model-gallery/gpt4all-j.yaml") Expect( - uri.DownloadAndUnmarshal("", func(url string, i []byte) error { + uri.DownloadWithCallback("", func(url string, i []byte) error { Expect(url).To(Equal("https://raw.githubusercontent.com/go-skynet/model-gallery/main/gpt4all-j.yaml")) return nil }), @@ -21,7 +21,7 @@ var _ = Describe("Gallery API tests", func() { uri := URI("github:go-skynet/model-gallery/gpt4all-j.yaml@main") Expect( - uri.DownloadAndUnmarshal("", func(url string, i []byte) error { + uri.DownloadWithCallback("", func(url string, i []byte) error { Expect(url).To(Equal("https://raw.githubusercontent.com/go-skynet/model-gallery/main/gpt4all-j.yaml")) return nil }), @@ -30,7 +30,7 @@ var _ = Describe("Gallery API tests", func() { It("parses github with urls", func() { uri := URI("https://raw.githubusercontent.com/go-skynet/model-gallery/main/gpt4all-j.yaml") Expect( - uri.DownloadAndUnmarshal("", func(url string, i []byte) error { + uri.DownloadWithCallback("", func(url string, i []byte) error { Expect(url).To(Equal("https://raw.githubusercontent.com/go-skynet/model-gallery/main/gpt4all-j.yaml")) return nil }), From 0893d3cbbebc6f7c5fa1d65e4b17e7d900ae60d4 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 24 Sep 2024 20:25:59 +0200 Subject: [PATCH 078/122] fix(health): do not require auth for /healthz and /readyz (#3656) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * fix(health): do not require auth for /healthz and /readyz Fixes: #3655 Signed-off-by: Ettore Di Giacinto * Comment so I don’t forget Adding a reminder here... 
--------- Signed-off-by: Ettore Di Giacinto Co-authored-by: Dave --- core/http/app.go | 3 +++ core/http/routes/health.go | 13 +++++++++++++ core/http/routes/localai.go | 8 -------- 3 files changed, 16 insertions(+), 8 deletions(-) create mode 100644 core/http/routes/health.go diff --git a/core/http/app.go b/core/http/app.go index 23e97f18..2cf0ad17 100644 --- a/core/http/app.go +++ b/core/http/app.go @@ -121,6 +121,9 @@ func App(cl *config.BackendConfigLoader, ml *model.ModelLoader, appConfig *confi }) } + // Health Checks should always be exempt from auth, so register these first + routes.HealthRoutes(app) + kaConfig, err := middleware.GetKeyAuthConfig(appConfig) if err != nil || kaConfig == nil { return nil, fmt.Errorf("failed to create key auth config: %w", err) diff --git a/core/http/routes/health.go b/core/http/routes/health.go new file mode 100644 index 00000000..f5a08e9b --- /dev/null +++ b/core/http/routes/health.go @@ -0,0 +1,13 @@ +package routes + +import "github.com/gofiber/fiber/v2" + +func HealthRoutes(app *fiber.App) { + // Service health checks + ok := func(c *fiber.Ctx) error { + return c.SendStatus(200) + } + + app.Get("/healthz", ok) + app.Get("/readyz", ok) +} diff --git a/core/http/routes/localai.go b/core/http/routes/localai.go index 247596c0..2f65e779 100644 --- a/core/http/routes/localai.go +++ b/core/http/routes/localai.go @@ -42,14 +42,6 @@ func RegisterLocalAIRoutes(app *fiber.App, app.Post("/stores/get", localai.StoresGetEndpoint(sl, appConfig)) app.Post("/stores/find", localai.StoresFindEndpoint(sl, appConfig)) - // Kubernetes health checks - ok := func(c *fiber.Ctx) error { - return c.SendStatus(200) - } - - app.Get("/healthz", ok) - app.Get("/readyz", ok) - app.Get("/metrics", localai.LocalAIMetricsEndpoint()) // Experimental Backend Statistics Module From 6555994060662db4c3600c0d51b18e10f5cac890 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 24 Sep 2024 21:22:08 +0000 Subject: [PATCH 079/122] chore(deps): Bump sentence-transformers from 3.1.0 to 3.1.1 in /backend/python/sentencetransformers (#3651) chore(deps): Bump sentence-transformers Bumps [sentence-transformers](https://github.com/UKPLab/sentence-transformers) from 3.1.0 to 3.1.1. - [Release notes](https://github.com/UKPLab/sentence-transformers/releases) - [Commits](https://github.com/UKPLab/sentence-transformers/compare/v3.1.0...v3.1.1) --- updated-dependencies: - dependency-name: sentence-transformers dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/sentencetransformers/requirements-cpu.txt | 2 +- backend/python/sentencetransformers/requirements-cublas11.txt | 2 +- backend/python/sentencetransformers/requirements-cublas12.txt | 2 +- backend/python/sentencetransformers/requirements-hipblas.txt | 2 +- backend/python/sentencetransformers/requirements-intel.txt | 2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/backend/python/sentencetransformers/requirements-cpu.txt b/backend/python/sentencetransformers/requirements-cpu.txt index f88de1e4..0fd8f35e 100644 --- a/backend/python/sentencetransformers/requirements-cpu.txt +++ b/backend/python/sentencetransformers/requirements-cpu.txt @@ -2,5 +2,5 @@ torch accelerate transformers bitsandbytes -sentence-transformers==3.1.0 +sentence-transformers==3.1.1 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-cublas11.txt b/backend/python/sentencetransformers/requirements-cublas11.txt index 57caf1a1..92a10b16 100644 --- a/backend/python/sentencetransformers/requirements-cublas11.txt +++ b/backend/python/sentencetransformers/requirements-cublas11.txt @@ -1,5 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/cu118 torch accelerate -sentence-transformers==3.1.0 +sentence-transformers==3.1.1 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-cublas12.txt b/backend/python/sentencetransformers/requirements-cublas12.txt index 834fa6a4..f68bb1b9 100644 --- a/backend/python/sentencetransformers/requirements-cublas12.txt +++ b/backend/python/sentencetransformers/requirements-cublas12.txt @@ -1,4 +1,4 @@ torch accelerate -sentence-transformers==3.1.0 +sentence-transformers==3.1.1 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-hipblas.txt b/backend/python/sentencetransformers/requirements-hipblas.txt index 98a0a41b..920eb855 100644 --- a/backend/python/sentencetransformers/requirements-hipblas.txt +++ b/backend/python/sentencetransformers/requirements-hipblas.txt @@ -1,5 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 torch accelerate -sentence-transformers==3.1.0 +sentence-transformers==3.1.1 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-intel.txt b/backend/python/sentencetransformers/requirements-intel.txt index 5948910d..6ae4bdd4 100644 --- a/backend/python/sentencetransformers/requirements-intel.txt +++ b/backend/python/sentencetransformers/requirements-intel.txt @@ -4,5 +4,5 @@ torch optimum[openvino] setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 accelerate -sentence-transformers==3.1.0 +sentence-transformers==3.1.1 transformers \ No newline at end of file From c54cfd3609489c648859736b3038a322339a8bfd Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 24 Sep 2024 22:59:11 +0000 Subject: [PATCH 080/122] chore(deps): Bump pydantic from 2.8.2 to 2.9.2 in /examples/langchain/langchainpy-localai-example (#3648) chore(deps): Bump pydantic Bumps [pydantic](https://github.com/pydantic/pydantic) from 2.8.2 to 2.9.2. 
- [Release notes](https://github.com/pydantic/pydantic/releases) - [Changelog](https://github.com/pydantic/pydantic/blob/main/HISTORY.md) - [Commits](https://github.com/pydantic/pydantic/compare/v2.8.2...v2.9.2) --- updated-dependencies: - dependency-name: pydantic dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index ac147410..179abc2a 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -21,7 +21,7 @@ numpy==2.1.1 openai==1.45.1 openapi-schema-pydantic==1.2.4 packaging>=23.2 -pydantic==2.8.2 +pydantic==2.9.2 PyYAML==6.0.2 requests==2.32.3 SQLAlchemy==2.0.35 From 0d784f46e55e39fb988c171c32ef664c9ff2801c Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Wed, 25 Sep 2024 01:15:53 +0000 Subject: [PATCH 081/122] chore(deps): Bump openai from 1.45.1 to 1.47.1 in /examples/functions (#3645) Bumps [openai](https://github.com/openai/openai-python) from 1.45.1 to 1.47.1. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.45.1...v1.47.1) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/functions/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/functions/requirements.txt b/examples/functions/requirements.txt index 670090d3..c3ffad01 100644 --- a/examples/functions/requirements.txt +++ b/examples/functions/requirements.txt @@ -1,2 +1,2 @@ langchain==0.3.0 -openai==1.45.1 +openai==1.47.1 From aa87eff28330a65818842884515ca1806165c209 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Wed, 25 Sep 2024 06:51:20 +0200 Subject: [PATCH 082/122] chore: :arrow_up: Update ggerganov/llama.cpp to `70392f1f81470607ba3afef04aa56c9f65587664` (#3659) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Dave --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 7523d5ff..6865f5a1 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=f0c7b5edf82aa200656fd88c11ae3a805d7130bf +CPPLLAMA_VERSION?=70392f1f81470607ba3afef04aa56c9f65587664 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From a370a11115879a9e02410f55136f563391976254 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Wed, 25 Sep 2024 08:47:03 +0200 Subject: [PATCH 083/122] docs: :arrow_up: update docs version mudler/LocalAI (#3657) :arrow_up: Update docs version mudler/LocalAI Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- docs/data/version.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/data/version.json b/docs/data/version.json index dc128c66..0dba0428 100644 --- a/docs/data/version.json +++ b/docs/data/version.json @@ -1,3 +1,3 @@ { - "version": "v2.20.1" + "version": "v2.21.0" } From 1b8a77433a88ce1a56d364b4dc81d9030f4e2830 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Wed, 25 Sep 2024 08:47:33 +0200 Subject: [PATCH 084/122] chore(deps): Bump llama-index from 0.11.7 to 0.11.12 in /examples/langchain-chroma (#3639) chore(deps): Bump llama-index in /examples/langchain-chroma Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.11.7 to 0.11.12. - [Release notes](https://github.com/run-llama/llama_index/releases) - [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md) - [Commits](https://github.com/run-llama/llama_index/compare/v0.11.7...v0.11.12) --- updated-dependencies: - dependency-name: llama-index dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 4884d4aa..3f7bec69 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.3.0 openai==1.45.1 chromadb==0.5.5 -llama-index==0.11.7 \ No newline at end of file +llama-index==0.11.12 \ No newline at end of file From 8002ad27cb7b67f8489a5f3cda66437acf2aac74 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Wed, 25 Sep 2024 08:47:57 +0200 Subject: [PATCH 085/122] chore(deps): Bump openai from 1.45.1 to 1.47.1 in /examples/langchain-chroma (#3641) chore(deps): Bump openai in /examples/langchain-chroma Bumps [openai](https://github.com/openai/openai-python) from 1.45.1 to 1.47.1. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.45.1...v1.47.1) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 3f7bec69..0c77892d 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.3.0 -openai==1.45.1 +openai==1.47.1 chromadb==0.5.5 llama-index==0.11.12 \ No newline at end of file From 8c4f720fb578b3156c333448f298a55845857c58 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Wed, 25 Sep 2024 08:48:13 +0200 Subject: [PATCH 086/122] chore(deps): Bump llama-index from 0.11.9 to 0.11.12 in /examples/chainlit (#3642) chore(deps): Bump llama-index in /examples/chainlit Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.11.9 to 0.11.12. - [Release notes](https://github.com/run-llama/llama_index/releases) - [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md) - [Commits](https://github.com/run-llama/llama_index/compare/v0.11.9...v0.11.12) --- updated-dependencies: - dependency-name: llama-index dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/chainlit/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/chainlit/requirements.txt b/examples/chainlit/requirements.txt index 1fe9356a..92eb113e 100644 --- a/examples/chainlit/requirements.txt +++ b/examples/chainlit/requirements.txt @@ -1,4 +1,4 @@ -llama_index==0.11.9 +llama_index==0.11.12 requests==2.32.3 weaviate_client==4.8.1 transformers From 74408bdc77e9f9d21a56699de09940fcaaf1a4eb Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Wed, 25 Sep 2024 10:54:37 +0200 Subject: [PATCH 087/122] chore: :arrow_up: Update ggerganov/whisper.cpp to `0d2e2aed80109e8696791083bde3b58e190b7812` (#3658) :arrow_up: Update ggerganov/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Dave Co-authored-by: Ettore Di Giacinto --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 6865f5a1..121b8e50 100644 --- a/Makefile +++ b/Makefile @@ -16,7 +16,7 @@ RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 # whisper.cpp version WHISPER_REPO?=https://github.com/ggerganov/whisper.cpp -WHISPER_CPP_VERSION?=34972dbe221709323714fc8402f2e24041d48213 +WHISPER_CPP_VERSION?=0d2e2aed80109e8696791083bde3b58e190b7812 # bert.cpp version BERT_REPO?=https://github.com/go-skynet/go-bert.cpp From 33b2d38dd0198d78dbc26aa020acfb6ff4c4048c Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Wed, 25 Sep 2024 12:44:32 +0200 Subject: [PATCH 088/122] chore(deps): Bump chromadb from 0.5.5 to 0.5.7 in /examples/langchain-chroma (#3640) chore(deps): Bump chromadb in /examples/langchain-chroma Bumps [chromadb](https://github.com/chroma-core/chroma) from 0.5.5 to 0.5.7. - [Release notes](https://github.com/chroma-core/chroma/releases) - [Changelog](https://github.com/chroma-core/chroma/blob/main/RELEASE_PROCESS.md) - [Commits](https://github.com/chroma-core/chroma/compare/0.5.5...0.5.7) --- updated-dependencies: - dependency-name: chromadb dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 0c77892d..19929482 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.3.0 openai==1.47.1 -chromadb==0.5.5 +chromadb==0.5.7 llama-index==0.11.12 \ No newline at end of file From a3d69872e35e152f29f7888fa9c56b0a797e9723 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 25 Sep 2024 18:00:23 +0200 Subject: [PATCH 089/122] feat(api): list loaded models in `/system` (#3661) feat(api): list loaded models in /system Signed-off-by: Ettore Di Giacinto --- core/http/endpoints/localai/system.go | 2 ++ core/schema/localai.go | 4 +++- pkg/model/initializers.go | 7 +++---- pkg/model/loader.go | 6 +++--- pkg/model/loader_test.go | 4 ++-- pkg/model/model.go | 4 +++- 6 files changed, 16 insertions(+), 11 deletions(-) diff --git a/core/http/endpoints/localai/system.go b/core/http/endpoints/localai/system.go index 11704933..23a725e3 100644 --- a/core/http/endpoints/localai/system.go +++ b/core/http/endpoints/localai/system.go @@ -17,12 +17,14 @@ func SystemInformations(ml *model.ModelLoader, appConfig *config.ApplicationConf if err != nil { return err } + loadedModels := ml.ListModels() for b := range appConfig.ExternalGRPCBackends { availableBackends = append(availableBackends, b) } return c.JSON( schema.SystemInformationResponse{ Backends: availableBackends, + Models: loadedModels, }, ) } diff --git a/core/schema/localai.go b/core/schema/localai.go index 9070c2be..75fa40c7 100644 --- a/core/schema/localai.go +++ b/core/schema/localai.go @@ -2,6 +2,7 @@ package schema import ( "github.com/mudler/LocalAI/core/p2p" + "github.com/mudler/LocalAI/pkg/model" gopsutil "github.com/shirou/gopsutil/v3/process" ) @@ -72,5 +73,6 @@ type P2PNodesResponse struct { } type SystemInformationResponse struct { - Backends []string `json:"backends"` + Backends []string `json:"backends"` + Models []model.Model `json:"loaded_models"` } diff --git a/pkg/model/initializers.go b/pkg/model/initializers.go index 7099bf33..80dd10b4 100644 --- a/pkg/model/initializers.go +++ b/pkg/model/initializers.go @@ -311,11 +311,11 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string log.Debug().Msgf("GRPC Service Started") - client = NewModel(serverAddress) + client = NewModel(modelName, serverAddress) } else { log.Debug().Msg("external backend is uri") // address - client = NewModel(uri) + client = NewModel(modelName, uri) } } else { grpcProcess := backendPath(o.assetDir, backend) @@ -352,7 +352,7 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string log.Debug().Msgf("GRPC Service Started") - client = NewModel(serverAddress) + client = NewModel(modelName, serverAddress) } log.Debug().Msgf("Wait for the service to start up") @@ -419,7 +419,6 @@ func (ml *ModelLoader) BackendLoader(opts ...Option) (client grpc.Backend, err e err := ml.StopGRPC(allExcept(o.model)) if err != nil { log.Error().Err(err).Str("keptModel", o.model).Msg("error while shutting down all backends except for the keptModel") - return nil, err } } diff --git a/pkg/model/loader.go b/pkg/model/loader.go index f70d2cea..4f1ec841 100644 --- a/pkg/model/loader.go +++ b/pkg/model/loader.go @@ -105,13 +105,13 @@ FILE: 
return models, nil } -func (ml *ModelLoader) ListModels() []*Model { +func (ml *ModelLoader) ListModels() []Model { ml.mu.Lock() defer ml.mu.Unlock() - models := []*Model{} + models := []Model{} for _, model := range ml.models { - models = append(models, model) + models = append(models, *model) } return models diff --git a/pkg/model/loader_test.go b/pkg/model/loader_test.go index 4621844e..c16a6e50 100644 --- a/pkg/model/loader_test.go +++ b/pkg/model/loader_test.go @@ -63,7 +63,7 @@ var _ = Describe("ModelLoader", func() { Context("LoadModel", func() { It("should load a model and keep it in memory", func() { - mockModel = model.NewModel("test.model") + mockModel = model.NewModel("foo", "test.model") mockLoader := func(modelName, modelFile string) (*model.Model, error) { return mockModel, nil @@ -88,7 +88,7 @@ var _ = Describe("ModelLoader", func() { Context("ShutdownModel", func() { It("should shutdown a loaded model", func() { - mockModel = model.NewModel("test.model") + mockModel = model.NewModel("foo", "test.model") mockLoader := func(modelName, modelFile string) (*model.Model, error) { return mockModel, nil diff --git a/pkg/model/model.go b/pkg/model/model.go index 1927dc0c..6cb81d10 100644 --- a/pkg/model/model.go +++ b/pkg/model/model.go @@ -3,12 +3,14 @@ package model import grpc "github.com/mudler/LocalAI/pkg/grpc" type Model struct { + ID string `json:"id"` address string client grpc.Backend } -func NewModel(address string) *Model { +func NewModel(ID, address string) *Model { return &Model{ + ID: ID, address: address, } } From ef1507d000f2308f395a341d6c497de70427f1a5 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Thu, 26 Sep 2024 10:50:20 +0200 Subject: [PATCH 090/122] docs: :arrow_up: update docs version mudler/LocalAI (#3665) :arrow_up: Update docs version mudler/LocalAI Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- docs/data/version.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/data/version.json b/docs/data/version.json index 0dba0428..470991b8 100644 --- a/docs/data/version.json +++ b/docs/data/version.json @@ -1,3 +1,3 @@ { - "version": "v2.21.0" + "version": "v2.21.1" } From d6522e69ca0f972b2d0d8f617b1cc131ac5026c6 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Thu, 26 Sep 2024 10:57:40 +0200 Subject: [PATCH 091/122] feat(swagger): update swagger (#3664) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- swagger/docs.go | 14 ++++++++++++++ swagger/swagger.json | 14 ++++++++++++++ swagger/swagger.yaml | 9 +++++++++ 3 files changed, 37 insertions(+) diff --git a/swagger/docs.go b/swagger/docs.go index ffb2ba03..c283dcb0 100644 --- a/swagger/docs.go +++ b/swagger/docs.go @@ -972,6 +972,14 @@ const docTemplate = `{ } } }, + "model.Model": { + "type": "object", + "properties": { + "id": { + "type": "string" + } + } + }, "openai.Assistant": { "type": "object", "properties": { @@ -1682,6 +1690,12 @@ const docTemplate = `{ "items": { "type": "string" } + }, + "loaded_models": { + "type": "array", + "items": { + "$ref": "#/definitions/model.Model" + } } } }, diff --git a/swagger/swagger.json b/swagger/swagger.json index e3aebe43..0a3be179 100644 --- a/swagger/swagger.json +++ b/swagger/swagger.json @@ -965,6 +965,14 @@ } } }, 
+ "model.Model": { + "type": "object", + "properties": { + "id": { + "type": "string" + } + } + }, "openai.Assistant": { "type": "object", "properties": { @@ -1675,6 +1683,12 @@ "items": { "type": "string" } + }, + "loaded_models": { + "type": "array", + "items": { + "$ref": "#/definitions/model.Model" + } } } }, diff --git a/swagger/swagger.yaml b/swagger/swagger.yaml index 649b86e4..7b6619b4 100644 --- a/swagger/swagger.yaml +++ b/swagger/swagger.yaml @@ -168,6 +168,11 @@ definitions: type: string type: array type: object + model.Model: + properties: + id: + type: string + type: object openai.Assistant: properties: created: @@ -652,6 +657,10 @@ definitions: items: type: string type: array + loaded_models: + items: + $ref: '#/definitions/model.Model' + type: array type: object schema.TTSRequest: description: TTS request body From 3d12d2037c83f9d5d3ae832e97311b29547532e1 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 26 Sep 2024 11:19:26 +0200 Subject: [PATCH 092/122] models(gallery): add llama-3.2 3B and 1B (#3671) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 60 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 60 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 9b8a0220..de38c3d5 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1,4 +1,64 @@ --- +## llama3.2 +- &llama32 + url: "github:mudler/LocalAI/gallery/llama3.1-instruct.yaml@master" + icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png + license: llama3.2 + description: | + The Meta Llama 3.2 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. They outperform many of the available open source and closed chat models on common industry benchmarks. + + Model Developer: Meta + + Model Architecture: Llama 3.2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. 
+ tags:
+ - llm
+ - gguf
+ - gpu
+ - cpu
+ - llama3.2
+ name: "llama-3.2-1b-instruct:q4_k_m"
+ urls:
+ - https://huggingface.co/hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF
+ overrides:
+ parameters:
+ model: llama-3.2-1b-instruct-q4_k_m.gguf
+ files:
+ - filename: llama-3.2-1b-instruct-q4_k_m.gguf
+ sha256: 1d0e9419ec4e12aef73ccf4ffd122703e94c48344a96bc7c5f0f2772c2152ce3
+ uri: huggingface://hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF/llama-3.2-1b-instruct-q4_k_m.gguf
+- !!merge <<: *llama32
+ name: "llama-3.2-3b-instruct:q4_k_m"
+ urls:
+ - https://huggingface.co/hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF
+ overrides:
+ parameters:
+ model: llama-3.2-3b-instruct-q4_k_m.gguf
+ files:
+ - filename: llama-3.2-3b-instruct-q4_k_m.gguf
+ sha256: c55a83bfb6396799337853ca69918a0b9bbb2917621078c34570bc17d20fd7a1
+ uri: huggingface://hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF/llama-3.2-3b-instruct-q4_k_m.gguf
+- !!merge <<: *llama32
+ name: "llama-3.2-3b-instruct:q8_0"
+ urls:
+ - https://huggingface.co/hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF
+ overrides:
+ parameters:
+ model: llama-3.2-3b-instruct-q8_0.gguf
+ files:
+ - filename: llama-3.2-3b-instruct-q8_0.gguf
+ sha256: 51725f77f997a5080c3d8dd66e073da22ddf48ab5264f21f05ded9b202c3680e
+ uri: huggingface://hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF/llama-3.2-3b-instruct-q8_0.gguf
+- !!merge <<: *llama32
+ name: "llama-3.2-1b-instruct:q8_0"
+ urls:
+ - https://huggingface.co/hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF
+ overrides:
+ parameters:
+ model: llama-3.2-1b-instruct-q8_0.gguf
+ files:
+ - filename: llama-3.2-1b-instruct-q8_0.gguf
+ sha256: ba345c83bf5cc679c653b853c46517eea5a34f03ed2205449db77184d9ae62a9
+ uri: huggingface://hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF/llama-3.2-1b-instruct-q8_0.gguf
 ## Qwen2.5
 - &qwen25
   name: "qwen2.5-14b-instruct"

From fa5c98549aae32df63a9c3e34574701e45287d29 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Thu, 26 Sep 2024 12:44:55 +0200
Subject: [PATCH 093/122] chore(refactor): track grpcProcess in the model structure (#3663)

* chore(refactor): track grpcProcess in the model structure

This avoids having to handle the data for the same model in two separate
places. It makes the model easier to track and to guard with a mutex.
This also fixes race conditions when accessing the model.
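For illustration, a condensed Go sketch of the ownership model this refactor
introduces (a simplified sketch, not the exact upstream code: names and fields
are trimmed from the real pkg/model types shown in the diff below, and the
logging, watchdog, and template handling are omitted):

package model

import (
	"fmt"
	"sync"

	process "github.com/mudler/go-processmanager"
)

// Model now carries its own gRPC process, so the process can no longer
// drift out of sync with the model entry kept in a separate map.
type Model struct {
	ID      string `json:"id"`
	address string
	process *process.Process
	sync.Mutex // in the real code, guards lazy gRPC client creation
}

// ModelLoader keeps a single map; one mutex protects both the model
// and its process, instead of two independently mutated maps.
type ModelLoader struct {
	mu     sync.Mutex
	models map[string]*Model
}

func (ml *ModelLoader) deleteProcess(name string) error {
	ml.mu.Lock()
	defer ml.mu.Unlock()
	m, ok := ml.models[name]
	if !ok {
		return fmt.Errorf("model %s not found", name)
	}
	// Stopping the process and dropping the map entry happen under the
	// same lock, which is what removes the race the commit describes.
	if p := m.process; p != nil {
		if err := p.Stop(); err != nil {
			return err
		}
	}
	delete(ml.models, name)
	return nil
}

Because startProcess now returns the *process.Process instead of storing it in
a loader-level map, the handle is attached to the Model at construction time
and every consumer reaches it through the same struct.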
Signed-off-by: Ettore Di Giacinto * chore(tests): run protogen-go before starting aio tests Signed-off-by: Ettore Di Giacinto * chore(tests): install protoc in aio tests Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: Ettore Di Giacinto --- .github/workflows/test.yml | 11 ++++++++++- Makefile | 2 +- pkg/model/initializers.go | 15 ++++++++++----- pkg/model/loader.go | 32 ++++++++++++++------------------ pkg/model/loader_test.go | 4 ++-- pkg/model/model.go | 18 ++++++++++++++++-- pkg/model/process.go | 33 ++++++++++++++++++--------------- 7 files changed, 71 insertions(+), 44 deletions(-) diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index 2af3fd00..b62f86ef 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -178,13 +178,22 @@ jobs: uses: actions/checkout@v4 with: submodules: true + - name: Dependencies + run: | + # Install protoc + curl -L -s https://github.com/protocolbuffers/protobuf/releases/download/v26.1/protoc-26.1-linux-x86_64.zip -o protoc.zip && \ + unzip -j -d /usr/local/bin protoc.zip bin/protoc && \ + rm protoc.zip + go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.2 + go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@1958fcbe2ca8bd93af633f11e97d44e567e945af + PATH="$PATH:$HOME/go/bin" make protogen-go - name: Build images run: | docker build --build-arg FFMPEG=true --build-arg IMAGE_TYPE=extras --build-arg EXTRA_BACKENDS=rerankers --build-arg MAKEFLAGS="--jobs=5 --output-sync=target" -t local-ai:tests -f Dockerfile . BASE_IMAGE=local-ai:tests DOCKER_AIO_IMAGE=local-ai-aio:test make docker-aio - name: Test run: | - LOCALAI_MODELS_DIR=$PWD/models LOCALAI_IMAGE_TAG=test LOCALAI_IMAGE=local-ai-aio \ + PATH="$PATH:$HOME/go/bin" LOCALAI_MODELS_DIR=$PWD/models LOCALAI_IMAGE_TAG=test LOCALAI_IMAGE=local-ai-aio \ make run-e2e-aio - name: Setup tmate session if tests fail if: ${{ failure() }} diff --git a/Makefile b/Makefile index 121b8e50..4efee986 100644 --- a/Makefile +++ b/Makefile @@ -468,7 +468,7 @@ run-e2e-image: ls -liah $(abspath ./tests/e2e-fixtures) docker run -p 5390:8080 -e MODELS_PATH=/models -e THREADS=1 -e DEBUG=true -d --rm -v $(TEST_DIR):/models --gpus all --name e2e-tests-$(RANDOM) localai-tests -run-e2e-aio: +run-e2e-aio: protogen-go @echo 'Running e2e AIO tests' $(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --flake-attempts 5 -v -r ./tests/e2e-aio diff --git a/pkg/model/initializers.go b/pkg/model/initializers.go index 80dd10b4..d0f47373 100644 --- a/pkg/model/initializers.go +++ b/pkg/model/initializers.go @@ -304,18 +304,19 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string return nil, fmt.Errorf("failed allocating free ports: %s", err.Error()) } // Make sure the process is executable - if err := ml.startProcess(uri, o.model, serverAddress); err != nil { + process, err := ml.startProcess(uri, o.model, serverAddress) + if err != nil { log.Error().Err(err).Str("path", uri).Msg("failed to launch ") return nil, err } log.Debug().Msgf("GRPC Service Started") - client = NewModel(modelName, serverAddress) + client = NewModel(modelName, serverAddress, process) } else { log.Debug().Msg("external backend is uri") // address - client = NewModel(modelName, uri) + client = NewModel(modelName, uri, nil) } } else { grpcProcess := backendPath(o.assetDir, backend) @@ -346,13 +347,14 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string args, grpcProcess = library.LoadLDSO(o.assetDir, args, grpcProcess) // Make sure the process is executable 
in any circumstance - if err := ml.startProcess(grpcProcess, o.model, serverAddress, args...); err != nil { + process, err := ml.startProcess(grpcProcess, o.model, serverAddress, args...) + if err != nil { return nil, err } log.Debug().Msgf("GRPC Service Started") - client = NewModel(modelName, serverAddress) + client = NewModel(modelName, serverAddress, process) } log.Debug().Msgf("Wait for the service to start up") @@ -374,6 +376,7 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string if !ready { log.Debug().Msgf("GRPC Service NOT ready") + ml.deleteProcess(o.model) return nil, fmt.Errorf("grpc service not ready") } @@ -385,9 +388,11 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string res, err := client.GRPC(o.parallelRequests, ml.wd).LoadModel(o.context, &options) if err != nil { + ml.deleteProcess(o.model) return nil, fmt.Errorf("could not load model: %w", err) } if !res.Success { + ml.deleteProcess(o.model) return nil, fmt.Errorf("could not load model (no success): %s", res.Message) } diff --git a/pkg/model/loader.go b/pkg/model/loader.go index 4f1ec841..68ac1a31 100644 --- a/pkg/model/loader.go +++ b/pkg/model/loader.go @@ -13,7 +13,6 @@ import ( "github.com/mudler/LocalAI/pkg/utils" - process "github.com/mudler/go-processmanager" "github.com/rs/zerolog/log" ) @@ -21,20 +20,18 @@ import ( // TODO: Split ModelLoader and TemplateLoader? Just to keep things more organized. Left together to share a mutex until I look into that. Would split if we seperate directories for .bin/.yaml and .tmpl type ModelLoader struct { - ModelPath string - mu sync.Mutex - models map[string]*Model - grpcProcesses map[string]*process.Process - templates *templates.TemplateCache - wd *WatchDog + ModelPath string + mu sync.Mutex + models map[string]*Model + templates *templates.TemplateCache + wd *WatchDog } func NewModelLoader(modelPath string) *ModelLoader { nml := &ModelLoader{ - ModelPath: modelPath, - models: make(map[string]*Model), - templates: templates.NewTemplateCache(modelPath), - grpcProcesses: make(map[string]*process.Process), + ModelPath: modelPath, + models: make(map[string]*Model), + templates: templates.NewTemplateCache(modelPath), } return nml @@ -127,6 +124,8 @@ func (ml *ModelLoader) LoadModel(modelName string, loader func(string, string) ( modelFile := filepath.Join(ml.ModelPath, modelName) log.Debug().Msgf("Loading model in memory from file: %s", modelFile) + ml.mu.Lock() + defer ml.mu.Unlock() model, err := loader(modelName, modelFile) if err != nil { return nil, err @@ -136,8 +135,6 @@ func (ml *ModelLoader) LoadModel(modelName string, loader func(string, string) ( return nil, fmt.Errorf("loader didn't return a model") } - ml.mu.Lock() - defer ml.mu.Unlock() ml.models[modelName] = model return model, nil @@ -146,14 +143,13 @@ func (ml *ModelLoader) LoadModel(modelName string, loader func(string, string) ( func (ml *ModelLoader) ShutdownModel(modelName string) error { ml.mu.Lock() defer ml.mu.Unlock() - - _, ok := ml.models[modelName] + model, ok := ml.models[modelName] if !ok { return fmt.Errorf("model %s not found", modelName) } retries := 1 - for ml.models[modelName].GRPC(false, ml.wd).IsBusy() { + for model.GRPC(false, ml.wd).IsBusy() { log.Debug().Msgf("%s busy. 
Waiting.", modelName) dur := time.Duration(retries*2) * time.Second if dur > retryTimeout { @@ -185,8 +181,8 @@ func (ml *ModelLoader) CheckIsLoaded(s string) *Model { if !alive { log.Warn().Msgf("GRPC Model not responding: %s", err.Error()) log.Warn().Msgf("Deleting the process in order to recreate it") - process, exists := ml.grpcProcesses[s] - if !exists { + process := m.Process() + if process == nil { log.Error().Msgf("Process not found for '%s' and the model is not responding anymore !", s) return m } diff --git a/pkg/model/loader_test.go b/pkg/model/loader_test.go index c16a6e50..d0ad4e0c 100644 --- a/pkg/model/loader_test.go +++ b/pkg/model/loader_test.go @@ -63,7 +63,7 @@ var _ = Describe("ModelLoader", func() { Context("LoadModel", func() { It("should load a model and keep it in memory", func() { - mockModel = model.NewModel("foo", "test.model") + mockModel = model.NewModel("foo", "test.model", nil) mockLoader := func(modelName, modelFile string) (*model.Model, error) { return mockModel, nil @@ -88,7 +88,7 @@ var _ = Describe("ModelLoader", func() { Context("ShutdownModel", func() { It("should shutdown a loaded model", func() { - mockModel = model.NewModel("foo", "test.model") + mockModel = model.NewModel("foo", "test.model", nil) mockLoader := func(modelName, modelFile string) (*model.Model, error) { return mockModel, nil diff --git a/pkg/model/model.go b/pkg/model/model.go index 6cb81d10..6e4fd316 100644 --- a/pkg/model/model.go +++ b/pkg/model/model.go @@ -1,20 +1,32 @@ package model -import grpc "github.com/mudler/LocalAI/pkg/grpc" +import ( + "sync" + + grpc "github.com/mudler/LocalAI/pkg/grpc" + process "github.com/mudler/go-processmanager" +) type Model struct { ID string `json:"id"` address string client grpc.Backend + process *process.Process + sync.Mutex } -func NewModel(ID, address string) *Model { +func NewModel(ID, address string, process *process.Process) *Model { return &Model{ ID: ID, address: address, + process: process, } } +func (m *Model) Process() *process.Process { + return m.process +} + func (m *Model) GRPC(parallel bool, wd *WatchDog) grpc.Backend { if m.client != nil { return m.client @@ -25,6 +37,8 @@ func (m *Model) GRPC(parallel bool, wd *WatchDog) grpc.Backend { enableWD = true } + m.Lock() + defer m.Unlock() m.client = grpc.NewClient(m.address, parallel, wd, enableWD) return m.client } diff --git a/pkg/model/process.go b/pkg/model/process.go index bcd1fccb..48631d79 100644 --- a/pkg/model/process.go +++ b/pkg/model/process.go @@ -16,20 +16,22 @@ import ( ) func (ml *ModelLoader) deleteProcess(s string) error { - if _, exists := ml.grpcProcesses[s]; exists { - if err := ml.grpcProcesses[s].Stop(); err != nil { - log.Error().Err(err).Msgf("(deleteProcess) error while deleting grpc process %s", s) + if m, exists := ml.models[s]; exists { + process := m.Process() + if process != nil { + if err := process.Stop(); err != nil { + log.Error().Err(err).Msgf("(deleteProcess) error while deleting process %s", s) + } } } - delete(ml.grpcProcesses, s) delete(ml.models, s) return nil } func (ml *ModelLoader) StopGRPC(filter GRPCProcessFilter) error { var err error = nil - for k, p := range ml.grpcProcesses { - if filter(k, p) { + for k, m := range ml.models { + if filter(k, m.Process()) { e := ml.ShutdownModel(k) err = errors.Join(err, e) } @@ -44,17 +46,20 @@ func (ml *ModelLoader) StopAllGRPC() error { func (ml *ModelLoader) GetGRPCPID(id string) (int, error) { ml.mu.Lock() defer ml.mu.Unlock() - p, exists := ml.grpcProcesses[id] + p, exists := ml.models[id] if 
!exists { return -1, fmt.Errorf("no grpc backend found for %s", id) } - return strconv.Atoi(p.PID) + if p.Process() == nil { + return -1, fmt.Errorf("no grpc backend found for %s", id) + } + return strconv.Atoi(p.Process().PID) } -func (ml *ModelLoader) startProcess(grpcProcess, id string, serverAddress string, args ...string) error { +func (ml *ModelLoader) startProcess(grpcProcess, id string, serverAddress string, args ...string) (*process.Process, error) { // Make sure the process is executable if err := os.Chmod(grpcProcess, 0700); err != nil { - return err + return nil, err } log.Debug().Msgf("Loading GRPC Process: %s", grpcProcess) @@ -63,7 +68,7 @@ func (ml *ModelLoader) startProcess(grpcProcess, id string, serverAddress string workDir, err := filepath.Abs(filepath.Dir(grpcProcess)) if err != nil { - return err + return nil, err } grpcControlProcess := process.New( @@ -79,10 +84,8 @@ func (ml *ModelLoader) startProcess(grpcProcess, id string, serverAddress string ml.wd.AddAddressModelMap(serverAddress, id) } - ml.grpcProcesses[id] = grpcControlProcess - if err := grpcControlProcess.Run(); err != nil { - return err + return grpcControlProcess, err } log.Debug().Msgf("GRPC Service state dir: %s", grpcControlProcess.StateDir()) @@ -116,5 +119,5 @@ func (ml *ModelLoader) startProcess(grpcProcess, id string, serverAddress string } }() - return nil + return grpcControlProcess, nil } From b0f4556c0f4277fc4056c396e4c639f7b41ea952 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Thu, 26 Sep 2024 14:52:26 +0200 Subject: [PATCH 094/122] chore: :arrow_up: Update ggerganov/llama.cpp to `ea9c32be71b91b42ecc538bd902e93cbb5fb36cb` (#3667) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 4efee986..3a90463b 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=70392f1f81470607ba3afef04aa56c9f65587664 +CPPLLAMA_VERSION?=ea9c32be71b91b42ecc538bd902e93cbb5fb36cb # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 8c4196faf34a123f018471890873403bec33b702 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Thu, 26 Sep 2024 15:58:17 +0200 Subject: [PATCH 095/122] chore: :arrow_up: Update ggerganov/whisper.cpp to `69339af2d104802f3f201fd419163defba52890e` (#3666) :arrow_up: Update ggerganov/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 3a90463b..07fd6ee3 100644 --- a/Makefile +++ b/Makefile @@ -16,7 +16,7 @@ RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 # whisper.cpp version WHISPER_REPO?=https://github.com/ggerganov/whisper.cpp -WHISPER_CPP_VERSION?=0d2e2aed80109e8696791083bde3b58e190b7812 +WHISPER_CPP_VERSION?=69339af2d104802f3f201fd419163defba52890e # bert.cpp version BERT_REPO?=https://github.com/go-skynet/go-bert.cpp From f2ba1cfb01d738d61dd443589d2878d4643e4fe2 Mon 
Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Thu, 26 Sep 2024 23:41:45 +0200 Subject: [PATCH 096/122] chore: :arrow_up: Update ggerganov/llama.cpp to `95bc82fbc0df6d48cf66c857a4dda3d044f45ca2` (#3674) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 07fd6ee3..ab7532d3 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=ea9c32be71b91b42ecc538bd902e93cbb5fb36cb +CPPLLAMA_VERSION?=95bc82fbc0df6d48cf66c857a4dda3d044f45ca2 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 4550abbfcece4f1ae4e2162431e6cd772d7a92d4 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Fri, 27 Sep 2024 08:54:36 +0200 Subject: [PATCH 097/122] chore(model-gallery): :arrow_up: update checksum (#3675) :arrow_up: Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- gallery/index.yaml | 34 +++++++++++++++++----------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/gallery/index.yaml b/gallery/index.yaml index de38c3d5..4b668061 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -59,8 +59,8 @@ - filename: llama-3.2-1b-instruct-q8_0.gguf sha256: ba345c83bf5cc679c653b853c46517eea5a34f03ed2205449db77184d9ae62a9 uri: huggingface://hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF/llama-3.2-1b-instruct-q8_0.gguf -## Qwen2.5 - &qwen25 + ## Qwen2.5 name: "qwen2.5-14b-instruct" url: "github:mudler/LocalAI/gallery/chatml.yaml@master" license: apache-2.0 @@ -89,11 +89,11 @@ - https://huggingface.co/bartowski/Qwen2.5-Math-7B-Instruct-GGUF - https://huggingface.co/Qwen/Qwen2.5-Math-7B-Instruct description: | - In August 2024, we released the first series of mathematical LLMs - Qwen2-Math - of our Qwen family. A month later, we have upgraded it and open-sourced Qwen2.5-Math series, including base models Qwen2.5-Math-1.5B/7B/72B, instruction-tuned models Qwen2.5-Math-1.5B/7B/72B-Instruct, and mathematical reward model Qwen2.5-Math-RM-72B. + In August 2024, we released the first series of mathematical LLMs - Qwen2-Math - of our Qwen family. A month later, we have upgraded it and open-sourced Qwen2.5-Math series, including base models Qwen2.5-Math-1.5B/7B/72B, instruction-tuned models Qwen2.5-Math-1.5B/7B/72B-Instruct, and mathematical reward model Qwen2.5-Math-RM-72B. - Unlike Qwen2-Math series which only supports using Chain-of-Thught (CoT) to solve English math problems, Qwen2.5-Math series is expanded to support using both CoT and Tool-integrated Reasoning (TIR) to solve math problems in both Chinese and English. The Qwen2.5-Math series models have achieved significant performance improvements compared to the Qwen2-Math series models on the Chinese and English mathematics benchmarks with CoT. 
+ Unlike Qwen2-Math series which only supports using Chain-of-Thught (CoT) to solve English math problems, Qwen2.5-Math series is expanded to support using both CoT and Tool-integrated Reasoning (TIR) to solve math problems in both Chinese and English. The Qwen2.5-Math series models have achieved significant performance improvements compared to the Qwen2-Math series models on the Chinese and English mathematics benchmarks with CoT. - The base models of Qwen2-Math are initialized with Qwen2-1.5B/7B/72B, and then pretrained on a meticulously designed Mathematics-specific Corpus. This corpus contains large-scale high-quality mathematical web texts, books, codes, exam questions, and mathematical pre-training data synthesized by Qwen2. + The base models of Qwen2-Math are initialized with Qwen2-1.5B/7B/72B, and then pretrained on a meticulously designed Mathematics-specific Corpus. This corpus contains large-scale high-quality mathematical web texts, books, codes, exam questions, and mathematical pre-training data synthesized by Qwen2. overrides: parameters: model: Qwen2.5-Math-7B-Instruct-Q4_K_M.gguf @@ -195,8 +195,8 @@ model: Qwen2.5-32B.Q4_K_M.gguf files: - filename: Qwen2.5-32B.Q4_K_M.gguf - sha256: 02703e27c8b964db445444581a6937ad7538f0c32a100b26b49fa0e8ff527155 uri: huggingface://mradermacher/Qwen2.5-32B-GGUF/Qwen2.5-32B.Q4_K_M.gguf + sha256: fa42a4067e3630929202b6bb1ef5cebc43c1898494aedfd567b7d53c7a9d84a6 - !!merge <<: *qwen25 name: "qwen2.5-32b-instruct" urls: @@ -221,8 +221,8 @@ - filename: Qwen2.5-72B-Instruct-Q4_K_M.gguf sha256: e4c8fad16946be8cf0bbf67eb8f4e18fc7415a5a6d2854b4cda453edb4082545 uri: huggingface://bartowski/Qwen2.5-72B-Instruct-GGUF/Qwen2.5-72B-Instruct-Q4_K_M.gguf -## SmolLM - &smollm + ## SmolLM url: "github:mudler/LocalAI/gallery/chatml.yaml@master" name: "smollm-1.7b-instruct" icon: https://huggingface.co/datasets/HuggingFaceTB/images/resolve/main/banner_smol.png @@ -651,9 +651,9 @@ - https://huggingface.co/leafspark/Reflection-Llama-3.1-70B-bf16 - https://huggingface.co/senseable/Reflection-Llama-3.1-70B-gguf description: | - Reflection Llama-3.1 70B is (currently) the world's top open-source LLM, trained with a new technique called Reflection-Tuning that teaches a LLM to detect mistakes in its reasoning and correct course. + Reflection Llama-3.1 70B is (currently) the world's top open-source LLM, trained with a new technique called Reflection-Tuning that teaches a LLM to detect mistakes in its reasoning and correct course. - The model was trained on synthetic data generated by Glaive. If you're training a model, Glaive is incredible — use them. + The model was trained on synthetic data generated by Glaive. If you're training a model, Glaive is incredible — use them. overrides: parameters: model: Reflection-Llama-3.1-70B-q4_k_m.gguf @@ -973,15 +973,15 @@ - https://huggingface.co/Sao10K/L3.1-8B-Niitama-v1.1 - https://huggingface.co/Lewdiculous/L3.1-8B-Niitama-v1.1-GGUF-IQ-Imatrix description: | - GGUF-IQ-Imatrix quants for Sao10K/L3.1-8B-Niitama-v1.1 - Here's the subjectively superior L3 version: L3-8B-Niitama-v1 - An experimental model using experimental methods. + GGUF-IQ-Imatrix quants for Sao10K/L3.1-8B-Niitama-v1.1 + Here's the subjectively superior L3 version: L3-8B-Niitama-v1 + An experimental model using experimental methods. - More detail on it: + More detail on it: - Tamamo and Niitama are made from the same data. Literally. The only thing that's changed is how theyre shuffled and formatted. Yet, I get wildly different results. 
+ Tamamo and Niitama are made from the same data. Literally. The only thing that's changed is how theyre shuffled and formatted. Yet, I get wildly different results. - Interesting, eh? Feels kinda not as good compared to the l3 version, but it's aight. + Interesting, eh? Feels kinda not as good compared to the l3 version, but it's aight. overrides: parameters: model: L3.1-8B-Niitama-v1.1-Q4_K_M-imat.gguf @@ -1606,8 +1606,8 @@ urls: - https://huggingface.co/Lewdiculous/MN-12B-Lyra-v4-GGUF-IQ-Imatrix description: | - A finetune of Mistral Nemo by Sao10K. - Uses the ChatML prompt format. + A finetune of Mistral Nemo by Sao10K. + Uses the ChatML prompt format. overrides: parameters: model: MN-12B-Lyra-v4-Q4_K_M-imat.gguf @@ -2134,7 +2134,7 @@ - https://huggingface.co/EpistemeAI/Athena-codegemma-2-2b-it - https://huggingface.co/mradermacher/Athena-codegemma-2-2b-it-GGUF description: | - Supervised fine tuned (sft unsloth) for coding with EpistemeAI coding dataset. + Supervised fine tuned (sft unsloth) for coding with EpistemeAI coding dataset. overrides: parameters: model: Athena-codegemma-2-2b-it.Q4_K_M.gguf From 453c45d022c7f211279f3d30cf519520636dd7be Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 27 Sep 2024 12:21:04 +0200 Subject: [PATCH 098/122] models(gallery): add magnusintellectus-12b-v1-i1 (#3678) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 4b668061..1a1828f6 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1615,6 +1615,27 @@ - filename: MN-12B-Lyra-v4-Q4_K_M-imat.gguf sha256: 1989123481ca1936c8a2cbe278ff5d1d2b0ae63dbdc838bb36a6d7547b8087b3 uri: huggingface://Lewdiculous/MN-12B-Lyra-v4-GGUF-IQ-Imatrix/MN-12B-Lyra-v4-Q4_K_M-imat.gguf +- !!merge <<: *mistral03 + name: "magnusintellectus-12b-v1-i1" + url: "github:mudler/LocalAI/gallery/chatml.yaml@master" + icon: https://cdn-uploads.huggingface.co/production/uploads/66b564058d9afb7a9d5607d5/hUVJI1Qa4tCMrZWMgYkoD.png + urls: + - https://huggingface.co/GalrionSoftworks/MagnusIntellectus-12B-v1 + - https://huggingface.co/mradermacher/MagnusIntellectus-12B-v1-i1-GGUF + description: | + How pleasant, the rocks appear to have made a decent conglomerate. A-. 
+ + MagnusIntellectus is a merge of the following models using LazyMergekit: + + UsernameJustAnother/Nemo-12B-Marlin-v5 + anthracite-org/magnum-12b-v2 + overrides: + parameters: + model: MagnusIntellectus-12B-v1.i1-Q4_K_M.gguf + files: + - filename: MagnusIntellectus-12B-v1.i1-Q4_K_M.gguf + sha256: c97107983b4edc5b6f2a592d227ca2dd4196e2af3d3bc0fe6b7a8954a1fb5870 + uri: huggingface://mradermacher/MagnusIntellectus-12B-v1-i1-GGUF/MagnusIntellectus-12B-v1.i1-Q4_K_M.gguf - &mudler ### START mudler's LocalAI specific-models url: "github:mudler/LocalAI/gallery/mudler.yaml@master" From 2a8cbad12222f59295911078e9acc3788e666f36 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 27 Sep 2024 13:03:41 +0200 Subject: [PATCH 099/122] models(gallery): add bigqwen2.5-52b-instruct (#3679) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 1a1828f6..847e004c 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -221,6 +221,22 @@ - filename: Qwen2.5-72B-Instruct-Q4_K_M.gguf sha256: e4c8fad16946be8cf0bbf67eb8f4e18fc7415a5a6d2854b4cda453edb4082545 uri: huggingface://bartowski/Qwen2.5-72B-Instruct-GGUF/Qwen2.5-72B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "bigqwen2.5-52b-instruct" + icon: https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/98GiKtmH1AtHHbIbOUH4Y.jpeg + urls: + - https://huggingface.co/mlabonne/BigQwen2.5-52B-Instruct + - https://huggingface.co/bartowski/BigQwen2.5-52B-Instruct-GGUF + description: | + BigQwen2.5-52B-Instruct is a Qwen/Qwen2-32B-Instruct self-merge made with MergeKit. + It applies the mlabonne/Meta-Llama-3-120B-Instruct recipe. + overrides: + parameters: + model: BigQwen2.5-52B-Instruct-Q4_K_M.gguf + files: + - filename: BigQwen2.5-52B-Instruct-Q4_K_M.gguf + sha256: 9c939f08e366b51b07096eb2ecb5cc2a82894ac7baf639e446237ad39889c896 + uri: huggingface://bartowski/BigQwen2.5-52B-Instruct-GGUF/BigQwen2.5-52B-Instruct-Q4_K_M.gguf - &smollm ## SmolLM url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From 4e0f3cc9802e56fae2a52715298257932e3c0f5e Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sat, 28 Sep 2024 00:42:59 +0200 Subject: [PATCH 100/122] chore: :arrow_up: Update ggerganov/llama.cpp to `b5de3b74a595cbfefab7eeb5a567425c6a9690cf` (#3681) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index ab7532d3..2c7310d8 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=95bc82fbc0df6d48cf66c857a4dda3d044f45ca2 +CPPLLAMA_VERSION?=b5de3b74a595cbfefab7eeb5a567425c6a9690cf # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From e94a50e9db24aa03ce0d53a5200099aadb52b3aa Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sat, 28 Sep 2024 10:02:19 +0200 Subject: [PATCH 101/122] chore: :arrow_up: Update ggerganov/whisper.cpp to `8feb375fbdf0277ad36958c218c6bf48fa0ba75a` (#3680) :arrow_up: Update ggerganov/whisper.cpp Signed-off-by: github-actions[bot] 
<41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 2c7310d8..aa926f4c 100644 --- a/Makefile +++ b/Makefile @@ -16,7 +16,7 @@ RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 # whisper.cpp version WHISPER_REPO?=https://github.com/ggerganov/whisper.cpp -WHISPER_CPP_VERSION?=69339af2d104802f3f201fd419163defba52890e +WHISPER_CPP_VERSION?=8feb375fbdf0277ad36958c218c6bf48fa0ba75a # bert.cpp version BERT_REPO?=https://github.com/go-skynet/go-bert.cpp From 50a3b54e3474fd552b352222a90f70c8ab624ceb Mon Sep 17 00:00:00 2001 From: siddimore Date: Sat, 28 Sep 2024 08:23:56 -0700 Subject: [PATCH 102/122] feat(api): add correlationID to Track Chat requests (#3668) * Add CorrelationID to chat request Signed-off-by: Siddharth More * remove get_token_metrics Signed-off-by: Siddharth More * Add CorrelationID to proto Signed-off-by: Siddharth More * fix correlation method name Signed-off-by: Siddharth More * Update core/http/endpoints/openai/chat.go Co-authored-by: Ettore Di Giacinto Signed-off-by: Siddharth More * Update core/http/endpoints/openai/chat.go Signed-off-by: Ettore Di Giacinto Signed-off-by: Siddharth More --------- Signed-off-by: Siddharth More Signed-off-by: Ettore Di Giacinto Co-authored-by: Ettore Di Giacinto --- backend/backend.proto | 1 + backend/cpp/llama/grpc-server.cpp | 14 ++++++++++++++ core/http/endpoints/openai/chat.go | 7 +++++++ core/http/endpoints/openai/completion.go | 2 ++ core/http/endpoints/openai/request.go | 13 ++++++++++++- 5 files changed, 36 insertions(+), 1 deletion(-) diff --git a/backend/backend.proto b/backend/backend.proto index 31bd63e5..b2d4518e 100644 --- a/backend/backend.proto +++ b/backend/backend.proto @@ -136,6 +136,7 @@ message PredictOptions { repeated Message Messages = 44; repeated string Videos = 45; repeated string Audios = 46; + string CorrelationId = 47; } // The response message containing the result diff --git a/backend/cpp/llama/grpc-server.cpp b/backend/cpp/llama/grpc-server.cpp index 56d59d21..791612db 100644 --- a/backend/cpp/llama/grpc-server.cpp +++ b/backend/cpp/llama/grpc-server.cpp @@ -2106,6 +2106,9 @@ json parse_options(bool streaming, const backend::PredictOptions* predict, llama data["ignore_eos"] = predict->ignoreeos(); data["embeddings"] = predict->embeddings(); + // Add the correlationid to json data + data["correlation_id"] = predict->correlationid(); + // for each image in the request, add the image data // for (int i = 0; i < predict->images_size(); i++) { @@ -2344,6 +2347,11 @@ public: int32_t tokens_evaluated = result.result_json.value("tokens_evaluated", 0); reply.set_prompt_tokens(tokens_evaluated); + // Log Request Correlation Id + LOG_VERBOSE("correlation:", { + { "id", data["correlation_id"] } + }); + // Send the reply writer->Write(reply); @@ -2367,6 +2375,12 @@ public: std::string completion_text; task_result result = llama.queue_results.recv(task_id); if (!result.error && result.stop) { + + // Log Request Correlation Id + LOG_VERBOSE("correlation:", { + { "id", data["correlation_id"] } + }); + completion_text = result.result_json.value("content", ""); int32_t tokens_predicted = result.result_json.value("tokens_predicted", 0); int32_t tokens_evaluated = result.result_json.value("tokens_evaluated", 0); diff --git a/core/http/endpoints/openai/chat.go b/core/http/endpoints/openai/chat.go index b937120a..1ac1387e 
100644 --- a/core/http/endpoints/openai/chat.go +++ b/core/http/endpoints/openai/chat.go @@ -161,6 +161,12 @@ func ChatEndpoint(cl *config.BackendConfigLoader, ml *model.ModelLoader, startup textContentToReturn = "" id = uuid.New().String() created = int(time.Now().Unix()) + // Set CorrelationID + correlationID := c.Get("X-Correlation-ID") + if len(strings.TrimSpace(correlationID)) == 0 { + correlationID = id + } + c.Set("X-Correlation-ID", correlationID) modelFile, input, err := readRequest(c, cl, ml, startupOptions, true) if err != nil { @@ -444,6 +450,7 @@ func ChatEndpoint(cl *config.BackendConfigLoader, ml *model.ModelLoader, startup c.Set("Cache-Control", "no-cache") c.Set("Connection", "keep-alive") c.Set("Transfer-Encoding", "chunked") + c.Set("X-Correlation-ID", id) responses := make(chan schema.OpenAIResponse) diff --git a/core/http/endpoints/openai/completion.go b/core/http/endpoints/openai/completion.go index b087cc5f..e5de1b3f 100644 --- a/core/http/endpoints/openai/completion.go +++ b/core/http/endpoints/openai/completion.go @@ -57,6 +57,8 @@ func CompletionEndpoint(cl *config.BackendConfigLoader, ml *model.ModelLoader, a } return func(c *fiber.Ctx) error { + // Add Correlation + c.Set("X-Correlation-ID", id) modelFile, input, err := readRequest(c, cl, ml, appConfig, true) if err != nil { return fmt.Errorf("failed reading parameters from request:%w", err) diff --git a/core/http/endpoints/openai/request.go b/core/http/endpoints/openai/request.go index e24dd28f..d6182a39 100644 --- a/core/http/endpoints/openai/request.go +++ b/core/http/endpoints/openai/request.go @@ -6,6 +6,7 @@ import ( "fmt" "github.com/gofiber/fiber/v2" + "github.com/google/uuid" "github.com/mudler/LocalAI/core/config" fiberContext "github.com/mudler/LocalAI/core/http/ctx" "github.com/mudler/LocalAI/core/schema" @@ -15,6 +16,11 @@ import ( "github.com/rs/zerolog/log" ) +type correlationIDKeyType string + +// CorrelationIDKey to track request across process boundary +const CorrelationIDKey correlationIDKeyType = "correlationID" + func readRequest(c *fiber.Ctx, cl *config.BackendConfigLoader, ml *model.ModelLoader, o *config.ApplicationConfig, firstModel bool) (string, *schema.OpenAIRequest, error) { input := new(schema.OpenAIRequest) @@ -24,9 +30,14 @@ func readRequest(c *fiber.Ctx, cl *config.BackendConfigLoader, ml *model.ModelLo } received, _ := json.Marshal(input) + // Extract or generate the correlation ID + correlationID := c.Get("X-Correlation-ID", uuid.New().String()) ctx, cancel := context.WithCancel(o.Context) - input.Context = ctx + // Add the correlation ID to the new context + ctxWithCorrelationID := context.WithValue(ctx, CorrelationIDKey, correlationID) + + input.Context = ctxWithCorrelationID input.Cancel = cancel log.Debug().Msgf("Request received: %s", string(received)) From 1689740269ef97e9778d75e78ae4d844520a113c Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 29 Sep 2024 20:39:39 +0200 Subject: [PATCH 103/122] models(gallery): add replete-llm-v2.5-qwen-14b (#3688) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 847e004c..7701efd5 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -237,6 +237,23 @@ - filename: BigQwen2.5-52B-Instruct-Q4_K_M.gguf sha256: 9c939f08e366b51b07096eb2ecb5cc2a82894ac7baf639e446237ad39889c896 uri: huggingface://bartowski/BigQwen2.5-52B-Instruct-GGUF/BigQwen2.5-52B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: 
"replete-llm-v2.5-qwen-14b" + icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/ihnWXDEgV-ZKN_B036U1J.png + urls: + - https://huggingface.co/Replete-AI/Replete-LLM-V2.5-Qwen-14b + - https://huggingface.co/bartowski/Replete-LLM-V2.5-Qwen-14b-GGUF + description: | + Replete-LLM-V2.5-Qwen-14b is a continues finetuned version of Qwen2.5-14B. I noticed recently that the Qwen team did not learn from my methods of continuous finetuning, the great benefits, and no downsides of it. So I took it upon myself to merge the instruct model with the base model myself using the Ties merge method + + This version of the model shows higher performance than the original instruct and base models. + overrides: + parameters: + model: Replete-LLM-V2.5-Qwen-14b-Q4_K_M.gguf + files: + - filename: Replete-LLM-V2.5-Qwen-14b-Q4_K_M.gguf + sha256: 17d0792ff5e3062aecb965629f66e679ceb407e4542e8045993dcfe9e7e14d9d + uri: huggingface://bartowski/Replete-LLM-V2.5-Qwen-14b-GGUF/Replete-LLM-V2.5-Qwen-14b-Q4_K_M.gguf - &smollm ## SmolLM url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From ad62156d548adaade746e9d702301da2c793d0b9 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 29 Sep 2024 22:47:26 +0200 Subject: [PATCH 104/122] models(gallery): add replete-llm-v2.5-qwen-7b (#3689) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 7701efd5..2ffbd05b 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -254,6 +254,24 @@ - filename: Replete-LLM-V2.5-Qwen-14b-Q4_K_M.gguf sha256: 17d0792ff5e3062aecb965629f66e679ceb407e4542e8045993dcfe9e7e14d9d uri: huggingface://bartowski/Replete-LLM-V2.5-Qwen-14b-GGUF/Replete-LLM-V2.5-Qwen-14b-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "replete-llm-v2.5-qwen-7b" + icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/ihnWXDEgV-ZKN_B036U1J.png + urls: + - https://huggingface.co/Replete-AI/Replete-LLM-V2.5-Qwen-7b + - https://huggingface.co/bartowski/Replete-LLM-V2.5-Qwen-7b-GGUF + description: | + Replete-LLM-V2.5-Qwen-7b is a continues finetuned version of Qwen2.5-14B. I noticed recently that the Qwen team did not learn from my methods of continuous finetuning, the great benefits, and no downsides of it. So I took it upon myself to merge the instruct model with the base model myself using the Ties merge method + + This version of the model shows higher performance than the original instruct and base models. 
+ overrides: + parameters: + model: Replete-LLM-V2.5-Qwen-7b-Q4_K_M.gguf + files: + - filename: Replete-LLM-V2.5-Qwen-7b-Q4_K_M.gguf + sha256: 054d54972259c0398b4e0af3f408f608e1166837b1d7535d08fc440d1daf8639 + uri: huggingface://bartowski/Replete-LLM-V2.5-Qwen-7b-GGUF/Replete-LLM-V2.5-Qwen-7b-Q4_K_M.gguf + - &smollm ## SmolLM url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From 6dfee995754fb3853e02d69c370c670d636f4294 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Mon, 30 Sep 2024 09:09:18 +0200 Subject: [PATCH 105/122] chore: :arrow_up: Update ggerganov/llama.cpp to `c919d5db39c8a7fcb64737f008e4b105ee0acd20` (#3686) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index aa926f4c..8617363c 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=b5de3b74a595cbfefab7eeb5a567425c6a9690cf +CPPLLAMA_VERSION?=c919d5db39c8a7fcb64737f008e4b105ee0acd20 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 078942fc9f741a35a189f295d1b4fb4ed1e26400 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 30 Sep 2024 09:09:51 +0200 Subject: [PATCH 106/122] chore(deps): bump grpcio to 1.66.2 (#3690) Signed-off-by: Ettore Di Giacinto --- backend/python/autogptq/requirements.txt | 2 +- backend/python/bark/requirements.txt | 2 +- backend/python/common/template/requirements.txt | 2 +- backend/python/coqui/requirements.txt | 2 +- backend/python/diffusers/requirements.txt | 2 +- backend/python/exllama2/requirements.txt | 2 +- backend/python/mamba/requirements.txt | 2 +- backend/python/openvoice/requirements-intel.txt | 2 +- backend/python/openvoice/requirements.txt | 2 +- backend/python/parler-tts/requirements.txt | 2 +- backend/python/rerankers/requirements.txt | 2 +- backend/python/sentencetransformers/requirements.txt | 2 +- backend/python/transformers-musicgen/requirements.txt | 2 +- backend/python/transformers/requirements.txt | 2 +- backend/python/vall-e-x/requirements.txt | 2 +- backend/python/vllm/requirements.txt | 2 +- 16 files changed, 16 insertions(+), 16 deletions(-) diff --git a/backend/python/autogptq/requirements.txt b/backend/python/autogptq/requirements.txt index 150fcc1b..9cb6ce94 100644 --- a/backend/python/autogptq/requirements.txt +++ b/backend/python/autogptq/requirements.txt @@ -1,6 +1,6 @@ accelerate auto-gptq==0.7.1 -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi transformers \ No newline at end of file diff --git a/backend/python/bark/requirements.txt b/backend/python/bark/requirements.txt index 6404b98e..6e46924a 100644 --- a/backend/python/bark/requirements.txt +++ b/backend/python/bark/requirements.txt @@ -1,4 +1,4 @@ bark==0.1.5 -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi \ No newline at end of file diff --git a/backend/python/common/template/requirements.txt b/backend/python/common/template/requirements.txt index 21610c1c..540c0eb5 100644 --- a/backend/python/common/template/requirements.txt +++ b/backend/python/common/template/requirements.txt @@ -1,2 +1,2 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf \ No newline at end of file diff --git 
a/backend/python/coqui/requirements.txt b/backend/python/coqui/requirements.txt index 2a91f2b9..29484f7d 100644 --- a/backend/python/coqui/requirements.txt +++ b/backend/python/coqui/requirements.txt @@ -1,4 +1,4 @@ coqui-tts -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi \ No newline at end of file diff --git a/backend/python/diffusers/requirements.txt b/backend/python/diffusers/requirements.txt index 043c7aba..730e316f 100644 --- a/backend/python/diffusers/requirements.txt +++ b/backend/python/diffusers/requirements.txt @@ -1,5 +1,5 @@ setuptools -grpcio==1.66.1 +grpcio==1.66.2 pillow protobuf certifi diff --git a/backend/python/exllama2/requirements.txt b/backend/python/exllama2/requirements.txt index 6fb018a0..e3db2b2f 100644 --- a/backend/python/exllama2/requirements.txt +++ b/backend/python/exllama2/requirements.txt @@ -1,4 +1,4 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi wheel diff --git a/backend/python/mamba/requirements.txt b/backend/python/mamba/requirements.txt index 8e1b0195..83ae4279 100644 --- a/backend/python/mamba/requirements.txt +++ b/backend/python/mamba/requirements.txt @@ -1,3 +1,3 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi \ No newline at end of file diff --git a/backend/python/openvoice/requirements-intel.txt b/backend/python/openvoice/requirements-intel.txt index a9a4cc20..c568dab1 100644 --- a/backend/python/openvoice/requirements-intel.txt +++ b/backend/python/openvoice/requirements-intel.txt @@ -2,7 +2,7 @@ intel-extension-for-pytorch torch optimum[openvino] -grpcio==1.66.1 +grpcio==1.66.2 protobuf librosa==0.9.1 faster-whisper==1.0.3 diff --git a/backend/python/openvoice/requirements.txt b/backend/python/openvoice/requirements.txt index b38805be..6ee29ce4 100644 --- a/backend/python/openvoice/requirements.txt +++ b/backend/python/openvoice/requirements.txt @@ -1,4 +1,4 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf librosa faster-whisper diff --git a/backend/python/parler-tts/requirements.txt b/backend/python/parler-tts/requirements.txt index 0da3da13..d7f36feb 100644 --- a/backend/python/parler-tts/requirements.txt +++ b/backend/python/parler-tts/requirements.txt @@ -1,4 +1,4 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi llvmlite==0.43.0 \ No newline at end of file diff --git a/backend/python/rerankers/requirements.txt b/backend/python/rerankers/requirements.txt index 8e1b0195..83ae4279 100644 --- a/backend/python/rerankers/requirements.txt +++ b/backend/python/rerankers/requirements.txt @@ -1,3 +1,3 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements.txt b/backend/python/sentencetransformers/requirements.txt index b9cb6061..40a387f1 100644 --- a/backend/python/sentencetransformers/requirements.txt +++ b/backend/python/sentencetransformers/requirements.txt @@ -1,4 +1,4 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi datasets diff --git a/backend/python/transformers-musicgen/requirements.txt b/backend/python/transformers-musicgen/requirements.txt index fb1119a9..a3f66651 100644 --- a/backend/python/transformers-musicgen/requirements.txt +++ b/backend/python/transformers-musicgen/requirements.txt @@ -1,4 +1,4 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf scipy==1.14.0 certifi \ No newline at end of file diff --git a/backend/python/transformers/requirements.txt b/backend/python/transformers/requirements.txt index b19c59c0..084cc034 100644 --- a/backend/python/transformers/requirements.txt +++ b/backend/python/transformers/requirements.txt @@ -1,4 +1,4 
@@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file diff --git a/backend/python/vall-e-x/requirements.txt b/backend/python/vall-e-x/requirements.txt index 8e1b0195..83ae4279 100644 --- a/backend/python/vall-e-x/requirements.txt +++ b/backend/python/vall-e-x/requirements.txt @@ -1,3 +1,3 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi \ No newline at end of file diff --git a/backend/python/vllm/requirements.txt b/backend/python/vllm/requirements.txt index b9c192d5..8fb8a418 100644 --- a/backend/python/vllm/requirements.txt +++ b/backend/python/vllm/requirements.txt @@ -1,4 +1,4 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi setuptools \ No newline at end of file From 58662db48eaecd2d39d65a0c229a47032a6833d6 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 30 Sep 2024 17:11:54 +0200 Subject: [PATCH 107/122] models(gallery): add calme-2.2-qwen2.5-72b-i1 (#3691) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 2ffbd05b..0924e5cf 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -271,7 +271,30 @@ - filename: Replete-LLM-V2.5-Qwen-7b-Q4_K_M.gguf sha256: 054d54972259c0398b4e0af3f408f608e1166837b1d7535d08fc440d1daf8639 uri: huggingface://bartowski/Replete-LLM-V2.5-Qwen-7b-GGUF/Replete-LLM-V2.5-Qwen-7b-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "calme-2.2-qwen2.5-72b-i1" + icon: https://huggingface.co/MaziyarPanahi/calme-2.2-qwen2.5-72b/resolve/main/calme-2.webp + urls: + - https://huggingface.co/MaziyarPanahi/calme-2.2-qwen2.5-72b + - https://huggingface.co/mradermacher/calme-2.2-qwen2.5-72b-i1-GGUF + description: | + This model is a fine-tuned version of the powerful Qwen/Qwen2.5-72B-Instruct, pushing the boundaries of natural language understanding and generation even further. My goal was to create a versatile and robust model that excels across a wide range of benchmarks and real-world applications. + Use Cases + This model is suitable for a wide range of applications, including but not limited to: + + Advanced question-answering systems + Intelligent chatbots and virtual assistants + Content generation and summarization + Code generation and analysis + Complex problem-solving and decision support + overrides: + parameters: + model: calme-2.2-qwen2.5-72b.i1-Q4_K_M.gguf + files: + - filename: calme-2.2-qwen2.5-72b.i1-Q4_K_M.gguf + sha256: 5fdfa599724d7c78502c477ced1d294e92781b91d3265bd0748fbf15a6fefde6 + uri: huggingface://mradermacher/calme-2.2-qwen2.5-72b-i1-GGUF/calme-2.2-qwen2.5-72b.i1-Q4_K_M.gguf - &smollm ## SmolLM url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From d747f2c89bc71cf4ca57539d68472d7b9e3bf0f7 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 30 Sep 2024 21:08:16 +0000 Subject: [PATCH 108/122] chore(deps): Bump openai from 1.47.1 to 1.50.2 in /examples/langchain-chroma (#3697) chore(deps): Bump openai in /examples/langchain-chroma Bumps [openai](https://github.com/openai/openai-python) from 1.47.1 to 1.50.2. 
- [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.47.1...v1.50.2) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 19929482..b6404437 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.3.0 -openai==1.47.1 +openai==1.50.2 chromadb==0.5.7 llama-index==0.11.12 \ No newline at end of file From 164a9e972fed51dae394e01ffd59a9a04b6ee44a Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 1 Oct 2024 01:37:30 +0000 Subject: [PATCH 109/122] chore(deps): Bump chromadb from 0.5.7 to 0.5.11 in /examples/langchain-chroma (#3696) chore(deps): Bump chromadb in /examples/langchain-chroma Bumps [chromadb](https://github.com/chroma-core/chroma) from 0.5.7 to 0.5.11. - [Release notes](https://github.com/chroma-core/chroma/releases) - [Changelog](https://github.com/chroma-core/chroma/blob/main/RELEASE_PROCESS.md) - [Commits](https://github.com/chroma-core/chroma/compare/0.5.7...0.5.11) --- updated-dependencies: - dependency-name: chromadb dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index b6404437..756a6bf3 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.3.0 openai==1.50.2 -chromadb==0.5.7 +chromadb==0.5.11 llama-index==0.11.12 \ No newline at end of file From 32de75c68326758eac7f714fc522eb65c36fde18 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 1 Oct 2024 03:13:37 +0000 Subject: [PATCH 110/122] chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/langchain-chroma (#3694) chore(deps): Bump langchain in /examples/langchain-chroma Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.3.0 to 0.3.1. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.3.0...langchain==0.3.1) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 756a6bf3..fda5f9d8 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ -langchain==0.3.0 +langchain==0.3.1 openai==1.50.2 chromadb==0.5.11 llama-index==0.11.12 \ No newline at end of file From f19277b8e2bc148193650a26927f183bc106c50a Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Tue, 1 Oct 2024 08:47:48 +0200 Subject: [PATCH 111/122] chore: :arrow_up: Update ggerganov/llama.cpp to `6f1d9d71f4c568778a7637ff6582e6f6ba5fb9d3` (#3708) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 8617363c..6c6dbd21 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=c919d5db39c8a7fcb64737f008e4b105ee0acd20 +CPPLLAMA_VERSION?=6f1d9d71f4c568778a7637ff6582e6f6ba5fb9d3 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 2908ff3f6b7a63fcd89e0cf5571c0409257209ac Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 1 Oct 2024 08:50:40 +0200 Subject: [PATCH 112/122] chore(deps): Bump securego/gosec from 2.21.0 to 2.21.4 (#3698) Bumps [securego/gosec](https://github.com/securego/gosec) from 2.21.0 to 2.21.4. - [Release notes](https://github.com/securego/gosec/releases) - [Changelog](https://github.com/securego/gosec/blob/master/.goreleaser.yml) - [Commits](https://github.com/securego/gosec/compare/v2.21.0...v2.21.4) --- updated-dependencies: - dependency-name: securego/gosec dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- .github/workflows/secscan.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/secscan.yaml b/.github/workflows/secscan.yaml index db9db586..3fd808e1 100644 --- a/.github/workflows/secscan.yaml +++ b/.github/workflows/secscan.yaml @@ -18,7 +18,7 @@ jobs: if: ${{ github.actor != 'dependabot[bot]' }} - name: Run Gosec Security Scanner if: ${{ github.actor != 'dependabot[bot]' }} - uses: securego/gosec@v2.21.0 + uses: securego/gosec@v2.21.4 with: # we let the report trigger content trigger a failure using the GitHub Security features. args: '-no-fail -fmt sarif -out results.sarif ./...' From 6bd6e2bdeb74e52a291052c4c8b808178ed40d90 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 1 Oct 2024 08:51:07 +0200 Subject: [PATCH 113/122] chore(deps): Bump openai from 1.47.1 to 1.50.2 in /examples/functions (#3699) Bumps [openai](https://github.com/openai/openai-python) from 1.47.1 to 1.50.2. 
- [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.47.1...v1.50.2) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/functions/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/functions/requirements.txt b/examples/functions/requirements.txt index c3ffad01..9ad014fd 100644 --- a/examples/functions/requirements.txt +++ b/examples/functions/requirements.txt @@ -1,2 +1,2 @@ langchain==0.3.0 -openai==1.47.1 +openai==1.50.2 From 44bdacac61a319992f3bf3a32f756a65862617ed Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 1 Oct 2024 08:51:29 +0200 Subject: [PATCH 114/122] chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/langchain/langchainpy-localai-example (#3704) chore(deps): Bump langchain Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.3.0 to 0.3.1. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.3.0...langchain==0.3.1) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 179abc2a..daa467c7 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -10,7 +10,7 @@ debugpy==1.8.2 frozenlist==1.4.1 greenlet==3.1.0 idna==3.10 -langchain==0.3.0 +langchain==0.3.1 langchain-community==0.2.16 marshmallow==3.22.0 marshmallow-enum==1.5.1 From 7d306c6431ddba153704e5513e716288c7d73d09 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 1 Oct 2024 10:39:55 +0200 Subject: [PATCH 115/122] chore(deps): Bump greenlet from 3.1.0 to 3.1.1 in /examples/langchain/langchainpy-localai-example (#3703) chore(deps): Bump greenlet Bumps [greenlet](https://github.com/python-greenlet/greenlet) from 3.1.0 to 3.1.1. - [Changelog](https://github.com/python-greenlet/greenlet/blob/master/CHANGES.rst) - [Commits](https://github.com/python-greenlet/greenlet/compare/3.1.0...3.1.1) --- updated-dependencies: - dependency-name: greenlet dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 examples/langchain/langchainpy-localai-example/requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt
index daa467c7..205c726c 100644
--- a/examples/langchain/langchainpy-localai-example/requirements.txt
+++ b/examples/langchain/langchainpy-localai-example/requirements.txt
@@ -8,7 +8,7 @@ colorama==0.4.6
 dataclasses-json==0.6.7
 debugpy==1.8.2
 frozenlist==1.4.1
-greenlet==3.1.0
+greenlet==3.1.1
 idna==3.10
 langchain==0.3.1
 langchain-community==0.2.16

From d4d2a76f8f4b1379c2c554c911ba64ec1bbda389 Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue, 1 Oct 2024 10:40:08 +0200
Subject: [PATCH 116/122] chore(deps): Bump langchain from 0.3.0 to 0.3.1 in
 /examples/functions (#3700)

Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.3.0 to 0.3.1.
- [Release notes](https://github.com/langchain-ai/langchain/releases)
- [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.3.0...langchain==0.3.1)

---
updated-dependencies:
- dependency-name: langchain
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 examples/functions/requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/functions/requirements.txt b/examples/functions/requirements.txt
index 9ad014fd..952f9d62 100644
--- a/examples/functions/requirements.txt
+++ b/examples/functions/requirements.txt
@@ -1,2 +1,2 @@
-langchain==0.3.0
+langchain==0.3.1
 openai==1.50.2

From 76d4e88e0c21b245e74e8ba6e15a5d937d1fdfb0 Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue, 1 Oct 2024 10:40:21 +0200
Subject: [PATCH 117/122] chore(deps): Bump langchain-community from 0.2.16 to
 0.3.1 in /examples/langchain/langchainpy-localai-example (#3702)

chore(deps): Bump langchain-community

Bumps [langchain-community](https://github.com/langchain-ai/langchain) from 0.2.16 to 0.3.1.
- [Release notes](https://github.com/langchain-ai/langchain/releases)
- [Commits](https://github.com/langchain-ai/langchain/compare/langchain-community==0.2.16...langchain-community==0.3.1)

---
updated-dependencies:
- dependency-name: langchain-community
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 examples/langchain/langchainpy-localai-example/requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt
index 205c726c..b5f3960e 100644
--- a/examples/langchain/langchainpy-localai-example/requirements.txt
+++ b/examples/langchain/langchainpy-localai-example/requirements.txt
@@ -11,7 +11,7 @@ frozenlist==1.4.1
 greenlet==3.1.1
 idna==3.10
 langchain==0.3.1
-langchain-community==0.2.16
+langchain-community==0.3.1
 marshmallow==3.22.0
 marshmallow-enum==1.5.1
 multidict==6.0.5

From 0a8f627cce98be2c4469309b5a54b45b97930b63 Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue, 1 Oct 2024 10:40:36 +0200
Subject: [PATCH 118/122] chore(deps): Bump gradio from 4.38.1 to 4.44.1 in
 /backend/python/openvoice (#3701)

chore(deps): Bump gradio in /backend/python/openvoice

Bumps [gradio](https://github.com/gradio-app/gradio) from 4.38.1 to 4.44.1.
- [Release notes](https://github.com/gradio-app/gradio/releases)
- [Changelog](https://github.com/gradio-app/gradio/blob/main/CHANGELOG.md)
- [Commits](https://github.com/gradio-app/gradio/compare/gradio@4.38.1...gradio@4.44.1)

---
updated-dependencies:
- dependency-name: gradio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 backend/python/openvoice/requirements-intel.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/backend/python/openvoice/requirements-intel.txt b/backend/python/openvoice/requirements-intel.txt
index c568dab1..687efe78 100644
--- a/backend/python/openvoice/requirements-intel.txt
+++ b/backend/python/openvoice/requirements-intel.txt
@@ -18,6 +18,6 @@ python-dotenv
 pypinyin==0.50.0
 cn2an==0.5.22
 jieba==0.42.1
-gradio==4.38.1
+gradio==4.44.1
 langid==1.1.6
 git+https://github.com/myshell-ai/MeloTTS.git

From 2649407f44cf7c1c822fb671c6501ec899d1fc6f Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue, 1 Oct 2024 10:40:49 +0200
Subject: [PATCH 119/122] chore(deps): Bump llama-index from 0.11.12 to
 0.11.14 in /examples/langchain-chroma (#3695)

chore(deps): Bump llama-index in /examples/langchain-chroma

Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.11.12 to 0.11.14.
- [Release notes](https://github.com/run-llama/llama_index/releases)
- [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md)
- [Commits](https://github.com/run-llama/llama_index/compare/v0.11.12...v0.11.14)

---
updated-dependencies:
- dependency-name: llama-index
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 examples/langchain-chroma/requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt
index fda5f9d8..d84311b3 100644
--- a/examples/langchain-chroma/requirements.txt
+++ b/examples/langchain-chroma/requirements.txt
@@ -1,4 +1,4 @@
 langchain==0.3.1
 openai==1.50.2
 chromadb==0.5.11
-llama-index==0.11.12
\ No newline at end of file
+llama-index==0.11.14
\ No newline at end of file

From 53f406dc35485df76450f65ba11ba548cb86f196 Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue, 1 Oct 2024 10:41:04 +0200
Subject: [PATCH 120/122] chore(deps): Bump aiohttp from 3.10.3 to 3.10.8 in
 /examples/langchain/langchainpy-localai-example (#3705)

chore(deps): Bump aiohttp

Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.10.3 to 3.10.8.
- [Release notes](https://github.com/aio-libs/aiohttp/releases)
- [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst)
- [Commits](https://github.com/aio-libs/aiohttp/compare/v3.10.3...v3.10.8)

---
updated-dependencies:
- dependency-name: aiohttp
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 examples/langchain/langchainpy-localai-example/requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt
index b5f3960e..53812966 100644
--- a/examples/langchain/langchainpy-localai-example/requirements.txt
+++ b/examples/langchain/langchainpy-localai-example/requirements.txt
@@ -1,4 +1,4 @@
-aiohttp==3.10.3
+aiohttp==3.10.8
 aiosignal==1.3.1
 async-timeout==4.0.3
 attrs==24.2.0

From a30058b80f1b23407188a689ec514385ccfa63f9 Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue, 1 Oct 2024 10:41:16 +0200
Subject: [PATCH 121/122] chore(deps): Bump yarl from 1.11.1 to 1.13.1 in
 /examples/langchain/langchainpy-localai-example (#3706)

chore(deps): Bump yarl

Bumps [yarl](https://github.com/aio-libs/yarl) from 1.11.1 to 1.13.1.
- [Release notes](https://github.com/aio-libs/yarl/releases)
- [Changelog](https://github.com/aio-libs/yarl/blob/master/CHANGES.rst)
- [Commits](https://github.com/aio-libs/yarl/compare/v1.11.1...v1.13.1)

---
updated-dependencies:
- dependency-name: yarl
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 examples/langchain/langchainpy-localai-example/requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt
index 53812966..1d48dee8 100644
--- a/examples/langchain/langchainpy-localai-example/requirements.txt
+++ b/examples/langchain/langchainpy-localai-example/requirements.txt
@@ -30,4 +30,4 @@ tqdm==4.66.5
 typing-inspect==0.9.0
 typing_extensions==4.12.2
 urllib3==2.2.3
-yarl==1.11.1
+yarl==1.13.1

From 139209353f74100e495471dfbc41f9900a2212fd Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue, 1 Oct 2024 10:41:30 +0200
Subject: [PATCH 122/122] chore(deps): Bump llama-index from 0.11.12 to
 0.11.14 in /examples/chainlit (#3707)

chore(deps): Bump llama-index in /examples/chainlit

Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.11.12 to 0.11.14.
- [Release notes](https://github.com/run-llama/llama_index/releases)
- [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md)
- [Commits](https://github.com/run-llama/llama_index/compare/v0.11.12...v0.11.14)

---
updated-dependencies:
- dependency-name: llama-index
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 examples/chainlit/requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/chainlit/requirements.txt b/examples/chainlit/requirements.txt
index 92eb113e..ee6c63ac 100644
--- a/examples/chainlit/requirements.txt
+++ b/examples/chainlit/requirements.txt
@@ -1,4 +1,4 @@
-llama_index==0.11.12
+llama_index==0.11.14
 requests==2.32.3
 weaviate_client==4.8.1
 transformers