From fec01d9e6955a7347cd05c317bd24fd047da0cc6 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 21:00:35 +0000 Subject: [PATCH 001/122] chore(deps): Bump docs/themes/hugo-theme-relearn from `f696f60` to `d5a0ee0` (#3558) chore(deps): Bump docs/themes/hugo-theme-relearn Bumps [docs/themes/hugo-theme-relearn](https://github.com/McShelby/hugo-theme-relearn) from `f696f60` to `d5a0ee0`. - [Release notes](https://github.com/McShelby/hugo-theme-relearn/releases) - [Commits](https://github.com/McShelby/hugo-theme-relearn/compare/f696f60f4e44e18a34512b895a7b65a72c801bd8...d5a0ee04ad986394d6d2f1e1a57f2334d24bf317) --- updated-dependencies: - dependency-name: docs/themes/hugo-theme-relearn dependency-type: direct:production ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- docs/themes/hugo-theme-relearn | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/themes/hugo-theme-relearn b/docs/themes/hugo-theme-relearn index f696f60f..d5a0ee04 160000 --- a/docs/themes/hugo-theme-relearn +++ b/docs/themes/hugo-theme-relearn @@ -1 +1 @@ -Subproject commit f696f60f4e44e18a34512b895a7b65a72c801bd8 +Subproject commit d5a0ee04ad986394d6d2f1e1a57f2334d24bf317 From 2edc732c3398599b1d86a8930286ccc9fd3762e3 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 21:23:06 +0000 Subject: [PATCH 002/122] chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/coqui (#3554) chore(deps): Bump setuptools in /backend/python/coqui Bumps [setuptools](https://github.com/pypa/setuptools) from 72.1.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v72.1.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/coqui/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/coqui/requirements-intel.txt b/backend/python/coqui/requirements-intel.txt index 002a55c3..c0e4dcaa 100644 --- a/backend/python/coqui/requirements-intel.txt +++ b/backend/python/coqui/requirements-intel.txt @@ -3,6 +3,6 @@ intel-extension-for-pytorch torch torchaudio optimum[openvino] -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 transformers accelerate \ No newline at end of file From a5ce987bdbd98b6c8659a92dfbcc9d99bbf52f5f Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 21:35:10 +0000 Subject: [PATCH 003/122] chore(deps): Bump langchain from 0.2.16 to 0.3.0 in /examples/functions (#3559) Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.2.16 to 0.3.0. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.2.16...langchain==0.3.0) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/functions/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/functions/requirements.txt b/examples/functions/requirements.txt index 8258885a..9dd6818f 100644 --- a/examples/functions/requirements.txt +++ b/examples/functions/requirements.txt @@ -1,2 +1,2 @@ -langchain==0.2.16 +langchain==0.3.0 openai==1.44.0 From 149cc1eb13d3bd9af76ed13d72bff02cc685e601 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 21:44:34 +0000 Subject: [PATCH 004/122] chore(deps): Bump openai from 1.44.1 to 1.45.1 in /examples/langchain-chroma (#3556) chore(deps): Bump openai in /examples/langchain-chroma Bumps [openai](https://github.com/openai/openai-python) from 1.44.1 to 1.45.1. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.44.1...v1.45.1) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index c9bce6e9..3edb570c 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.2.16 -openai==1.44.1 +openai==1.45.1 chromadb==0.5.5 llama-index==0.11.7 \ No newline at end of file From 09c7d8d4587f9e09cd6e50534b2677abc85db1ee Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 21:46:26 +0000 Subject: [PATCH 005/122] chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/autogptq (#3553) chore(deps): Bump setuptools in /backend/python/autogptq Bumps [setuptools](https://github.com/pypa/setuptools) from 72.1.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v72.1.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/autogptq/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/autogptq/requirements-intel.txt b/backend/python/autogptq/requirements-intel.txt index 755e19d8..d5e0173e 100644 --- a/backend/python/autogptq/requirements-intel.txt +++ b/backend/python/autogptq/requirements-intel.txt @@ -2,4 +2,4 @@ intel-extension-for-pytorch torch optimum[openvino] -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From 12a8d0e46fbd03f8d550dc41ea6325d07d66cd00 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 21:57:16 +0000 Subject: [PATCH 006/122] chore(deps): Bump securego/gosec from 2.21.0 to 2.21.2 (#3561) Bumps [securego/gosec](https://github.com/securego/gosec) from 2.21.0 to 2.21.2. - [Release notes](https://github.com/securego/gosec/releases) - [Changelog](https://github.com/securego/gosec/blob/master/.goreleaser.yml) - [Commits](https://github.com/securego/gosec/compare/v2.21.0...v2.21.2) --- updated-dependencies: - dependency-name: securego/gosec dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- .github/workflows/secscan.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/secscan.yaml b/.github/workflows/secscan.yaml index db9db586..08d7dfc6 100644 --- a/.github/workflows/secscan.yaml +++ b/.github/workflows/secscan.yaml @@ -18,7 +18,7 @@ jobs: if: ${{ github.actor != 'dependabot[bot]' }} - name: Run Gosec Security Scanner if: ${{ github.actor != 'dependabot[bot]' }} - uses: securego/gosec@v2.21.0 + uses: securego/gosec@v2.21.2 with: # we let the report trigger content trigger a failure using the GitHub Security features. args: '-no-fail -fmt sarif -out results.sarif ./...' From afb5bbc1b88f71454a8b6081f8f8d46ad0eb9b35 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 23:03:06 +0000 Subject: [PATCH 007/122] chore(deps): Bump setuptools from 69.5.1 to 75.1.0 in /backend/python/transformers-musicgen (#3564) chore(deps): Bump setuptools in /backend/python/transformers-musicgen Bumps [setuptools](https://github.com/pypa/setuptools) from 69.5.1 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v69.5.1...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/transformers-musicgen/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/transformers-musicgen/requirements-intel.txt b/backend/python/transformers-musicgen/requirements-intel.txt index 89bfa6a2..608d6939 100644 --- a/backend/python/transformers-musicgen/requirements-intel.txt +++ b/backend/python/transformers-musicgen/requirements-intel.txt @@ -4,4 +4,4 @@ transformers accelerate torch optimum[openvino] -setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From 30fe16310035d3942368745f17d1673c889a4ddc Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 23:13:09 +0000 Subject: [PATCH 008/122] chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/parler-tts (#3565) chore(deps): Bump setuptools in /backend/python/parler-tts Bumps [setuptools](https://github.com/pypa/setuptools) from 72.1.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v72.1.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/parler-tts/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/parler-tts/requirements-intel.txt b/backend/python/parler-tts/requirements-intel.txt index 002a55c3..c0e4dcaa 100644 --- a/backend/python/parler-tts/requirements-intel.txt +++ b/backend/python/parler-tts/requirements-intel.txt @@ -3,6 +3,6 @@ intel-extension-for-pytorch torch torchaudio optimum[openvino] -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 transformers accelerate \ No newline at end of file From 5356b81b7f112c57dcc8a215b1f14c86e7ee3f40 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 16 Sep 2024 23:40:39 +0000 Subject: [PATCH 009/122] chore(deps): Bump sentence-transformers from 3.0.1 to 3.1.0 in /backend/python/sentencetransformers (#3566) chore(deps): Bump sentence-transformers Bumps [sentence-transformers](https://github.com/UKPLab/sentence-transformers) from 3.0.1 to 3.1.0. - [Release notes](https://github.com/UKPLab/sentence-transformers/releases) - [Commits](https://github.com/UKPLab/sentence-transformers/compare/v3.0.1...v3.1.0) --- updated-dependencies: - dependency-name: sentence-transformers dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/sentencetransformers/requirements-cpu.txt | 2 +- backend/python/sentencetransformers/requirements-cublas11.txt | 2 +- backend/python/sentencetransformers/requirements-cublas12.txt | 2 +- backend/python/sentencetransformers/requirements-hipblas.txt | 2 +- backend/python/sentencetransformers/requirements-intel.txt | 2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/backend/python/sentencetransformers/requirements-cpu.txt b/backend/python/sentencetransformers/requirements-cpu.txt index cd9924ef..f88de1e4 100644 --- a/backend/python/sentencetransformers/requirements-cpu.txt +++ b/backend/python/sentencetransformers/requirements-cpu.txt @@ -2,5 +2,5 @@ torch accelerate transformers bitsandbytes -sentence-transformers==3.0.1 +sentence-transformers==3.1.0 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-cublas11.txt b/backend/python/sentencetransformers/requirements-cublas11.txt index 1131f066..57caf1a1 100644 --- a/backend/python/sentencetransformers/requirements-cublas11.txt +++ b/backend/python/sentencetransformers/requirements-cublas11.txt @@ -1,5 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/cu118 torch accelerate -sentence-transformers==3.0.1 +sentence-transformers==3.1.0 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-cublas12.txt b/backend/python/sentencetransformers/requirements-cublas12.txt index 2936e17b..834fa6a4 100644 --- a/backend/python/sentencetransformers/requirements-cublas12.txt +++ b/backend/python/sentencetransformers/requirements-cublas12.txt @@ -1,4 +1,4 @@ torch accelerate -sentence-transformers==3.0.1 +sentence-transformers==3.1.0 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-hipblas.txt b/backend/python/sentencetransformers/requirements-hipblas.txt index 3b187c68..98a0a41b 100644 --- a/backend/python/sentencetransformers/requirements-hipblas.txt +++ b/backend/python/sentencetransformers/requirements-hipblas.txt @@ -1,5 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 torch accelerate -sentence-transformers==3.0.1 +sentence-transformers==3.1.0 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-intel.txt b/backend/python/sentencetransformers/requirements-intel.txt index 806e3d47..5948910d 100644 --- a/backend/python/sentencetransformers/requirements-intel.txt +++ b/backend/python/sentencetransformers/requirements-intel.txt @@ -4,5 +4,5 @@ torch optimum[openvino] setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 accelerate -sentence-transformers==3.0.1 +sentence-transformers==3.1.0 transformers \ No newline at end of file From c866b77586f25340d98a9fbb2ad16e22d5e4d577 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 00:02:42 +0000 Subject: [PATCH 010/122] chore(deps): Bump llama-index from 0.11.7 to 0.11.9 in /examples/chainlit (#3567) chore(deps): Bump llama-index in /examples/chainlit Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.11.7 to 0.11.9. 
- [Release notes](https://github.com/run-llama/llama_index/releases) - [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md) - [Commits](https://github.com/run-llama/llama_index/compare/v0.11.7...v0.11.9) --- updated-dependencies: - dependency-name: llama-index dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/chainlit/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/chainlit/requirements.txt b/examples/chainlit/requirements.txt index 69212e28..df8bea7f 100644 --- a/examples/chainlit/requirements.txt +++ b/examples/chainlit/requirements.txt @@ -1,4 +1,4 @@ -llama_index==0.11.7 +llama_index==0.11.9 requests==2.32.3 weaviate_client==4.6.7 transformers From 42d6b9e0ccc75fd3ecfb6275b0fe50236fdfc9f1 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 00:11:15 +0000 Subject: [PATCH 011/122] chore(deps): Bump weaviate-client from 4.6.7 to 4.8.1 in /examples/chainlit (#3568) chore(deps): Bump weaviate-client in /examples/chainlit Bumps [weaviate-client](https://github.com/weaviate/weaviate-python-client) from 4.6.7 to 4.8.1. - [Release notes](https://github.com/weaviate/weaviate-python-client/releases) - [Changelog](https://github.com/weaviate/weaviate-python-client/blob/main/docs/changelog.rst) - [Commits](https://github.com/weaviate/weaviate-python-client/compare/v4.6.7...v4.8.1) --- updated-dependencies: - dependency-name: weaviate-client dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/chainlit/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/chainlit/requirements.txt b/examples/chainlit/requirements.txt index df8bea7f..1fe9356a 100644 --- a/examples/chainlit/requirements.txt +++ b/examples/chainlit/requirements.txt @@ -1,6 +1,6 @@ llama_index==0.11.9 requests==2.32.3 -weaviate_client==4.6.7 +weaviate_client==4.8.1 transformers torch chainlit From abc27e0dc49dfba0ef5436c08acf5c5959f354ea Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 00:51:55 +0000 Subject: [PATCH 012/122] chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/vall-e-x (#3570) chore(deps): Bump setuptools in /backend/python/vall-e-x Bumps [setuptools](https://github.com/pypa/setuptools) from 72.1.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v72.1.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/vall-e-x/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/vall-e-x/requirements-intel.txt b/backend/python/vall-e-x/requirements-intel.txt index 6185314f..adbabeac 100644 --- a/backend/python/vall-e-x/requirements-intel.txt +++ b/backend/python/vall-e-x/requirements-intel.txt @@ -4,4 +4,4 @@ accelerate torch torchaudio optimum[openvino] -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From 36e19928eb2ad4f4976454873f101112d131b564 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 01:14:39 +0000 Subject: [PATCH 013/122] chore(deps): Bump greenlet from 3.0.3 to 3.1.0 in /examples/langchain/langchainpy-localai-example (#3571) chore(deps): Bump greenlet Bumps [greenlet](https://github.com/python-greenlet/greenlet) from 3.0.3 to 3.1.0. - [Changelog](https://github.com/python-greenlet/greenlet/blob/master/CHANGES.rst) - [Commits](https://github.com/python-greenlet/greenlet/compare/3.0.3...3.1.0) --- updated-dependencies: - dependency-name: greenlet dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 75323005..1bd6b841 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -8,7 +8,7 @@ colorama==0.4.6 dataclasses-json==0.6.7 debugpy==1.8.2 frozenlist==1.4.1 -greenlet==3.0.3 +greenlet==3.1.0 idna==3.8 langchain==0.2.16 langchain-community==0.2.16 From 2394f7833fab174663231b722e5de964446d2cbf Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 02:28:05 +0000 Subject: [PATCH 014/122] chore(deps): Bump setuptools from 70.3.0 to 75.1.0 in /backend/python/diffusers (#3575) chore(deps): Bump setuptools in /backend/python/diffusers Bumps [setuptools](https://github.com/pypa/setuptools) from 70.3.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v70.3.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/diffusers/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/diffusers/requirements-intel.txt b/backend/python/diffusers/requirements-intel.txt index 1cc2e2a2..566278a8 100644 --- a/backend/python/diffusers/requirements-intel.txt +++ b/backend/python/diffusers/requirements-intel.txt @@ -3,7 +3,7 @@ intel-extension-for-pytorch torch torchvision optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 diffusers opencv-python transformers From 06c83398624549fba12e5ec975c2c25a0e7e649a Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 02:32:33 +0000 Subject: [PATCH 015/122] chore(deps): Bump setuptools from 70.3.0 to 75.1.0 in /backend/python/bark (#3574) chore(deps): Bump setuptools in /backend/python/bark Bumps [setuptools](https://github.com/pypa/setuptools) from 70.3.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v70.3.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/bark/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/bark/requirements-intel.txt b/backend/python/bark/requirements-intel.txt index 9feb6eef..c0e4dcaa 100644 --- a/backend/python/bark/requirements-intel.txt +++ b/backend/python/bark/requirements-intel.txt @@ -3,6 +3,6 @@ intel-extension-for-pytorch torch torchaudio optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 transformers accelerate \ No newline at end of file From a9a3a07c3bf22b2a3741471f6122876c65d8909a Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 03:24:30 +0000 Subject: [PATCH 016/122] chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/rerankers (#3578) chore(deps): Bump setuptools in /backend/python/rerankers Bumps [setuptools](https://github.com/pypa/setuptools) from 72.1.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v72.1.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/rerankers/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/rerankers/requirements-intel.txt b/backend/python/rerankers/requirements-intel.txt index 1a39cf4f..e6bb4cc7 100644 --- a/backend/python/rerankers/requirements-intel.txt +++ b/backend/python/rerankers/requirements-intel.txt @@ -5,4 +5,4 @@ accelerate torch rerankers[transformers] optimum[openvino] -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From db1159b6511e8fa09e594f9db0fec6ab4e142468 Mon Sep 17 00:00:00 2001 From: Dave Date: Mon, 16 Sep 2024 23:29:07 -0400 Subject: [PATCH 017/122] feat: auth v2 - supersedes #2894 (#3476) feat: auth v2 - supersedes #2894, metrics to follow later Signed-off-by: Dave Lee --- core/cli/run.go | 56 ++++++++++--------- core/config/application_config.go | 40 +++++++++++-- core/http/app.go | 49 +++++----------- core/http/middleware/auth.go | 93 +++++++++++++++++++++++++++++++ core/http/routes/elevenlabs.go | 7 +-- core/http/routes/jina.go | 3 +- core/http/routes/localai.go | 41 +++++++------- core/http/routes/openai.go | 89 +++++++++++++++-------- core/http/routes/ui.go | 41 +++++++------- go.mod | 1 + go.sum | 2 + 11 files changed, 264 insertions(+), 158 deletions(-) create mode 100644 core/http/middleware/auth.go diff --git a/core/cli/run.go b/core/cli/run.go index 55ae0fd5..afb7204c 100644 --- a/core/cli/run.go +++ b/core/cli/run.go @@ -41,31 +41,34 @@ type RunCMD struct { Threads int `env:"LOCALAI_THREADS,THREADS" short:"t" help:"Number of threads used for parallel computation. Usage of the number of physical cores in the system is suggested" group:"performance"` ContextSize int `env:"LOCALAI_CONTEXT_SIZE,CONTEXT_SIZE" default:"512" help:"Default context size for models" group:"performance"` Address string `env:"LOCALAI_ADDRESS,ADDRESS" default:":8080" help:"Bind address for the API server" group:"api"` CORS bool `env:"LOCALAI_CORS,CORS" help:"" group:"api"` CORSAllowOrigins string `env:"LOCALAI_CORS_ALLOW_ORIGINS,CORS_ALLOW_ORIGINS" group:"api"` LibraryPath string `env:"LOCALAI_LIBRARY_PATH,LIBRARY_PATH" help:"Path to the library directory (for e.g. external libraries used by backends)" default:"/usr/share/local-ai/libs" group:"backends"` CSRF bool `env:"LOCALAI_CSRF" help:"Enables fiber CSRF middleware" group:"api"` UploadLimit int `env:"LOCALAI_UPLOAD_LIMIT,UPLOAD_LIMIT" default:"15" help:"Default upload-limit in MB" group:"api"` APIKeys []string `env:"LOCALAI_API_KEY,API_KEY" help:"List of API Keys to enable API authentication. When this is set, all the requests must be authenticated with one of these API keys" group:"api"` DisableWebUI bool `env:"LOCALAI_DISABLE_WEBUI,DISABLE_WEBUI" default:"false" help:"Disable webui" group:"api"` DisablePredownloadScan bool `env:"LOCALAI_DISABLE_PREDOWNLOAD_SCAN" help:"If true, disables the best-effort security scanner before downloading any files." group:"hardening" default:"false"` OpaqueErrors bool `env:"LOCALAI_OPAQUE_ERRORS" default:"false" help:"If true, all error responses are replaced with blank 500 errors. This is intended only for hardening against information leaks and is normally not recommended."
group:"hardening"` - Peer2Peer bool `env:"LOCALAI_P2P,P2P" name:"p2p" default:"false" help:"Enable P2P mode" group:"p2p"` - Peer2PeerDHTInterval int `env:"LOCALAI_P2P_DHT_INTERVAL,P2P_DHT_INTERVAL" default:"360" name:"p2p-dht-interval" help:"Interval for DHT refresh (used during token generation)" group:"p2p"` - Peer2PeerOTPInterval int `env:"LOCALAI_P2P_OTP_INTERVAL,P2P_OTP_INTERVAL" default:"9000" name:"p2p-otp-interval" help:"Interval for OTP refresh (used during token generation)" group:"p2p"` - Peer2PeerToken string `env:"LOCALAI_P2P_TOKEN,P2P_TOKEN,TOKEN" name:"p2ptoken" help:"Token for P2P mode (optional)" group:"p2p"` - Peer2PeerNetworkID string `env:"LOCALAI_P2P_NETWORK_ID,P2P_NETWORK_ID" help:"Network ID for P2P mode, can be set arbitrarly by the user for grouping a set of instances" group:"p2p"` - ParallelRequests bool `env:"LOCALAI_PARALLEL_REQUESTS,PARALLEL_REQUESTS" help:"Enable backends to handle multiple requests in parallel if they support it (e.g.: llama.cpp or vllm)" group:"backends"` - SingleActiveBackend bool `env:"LOCALAI_SINGLE_ACTIVE_BACKEND,SINGLE_ACTIVE_BACKEND" help:"Allow only one backend to be run at a time" group:"backends"` - PreloadBackendOnly bool `env:"LOCALAI_PRELOAD_BACKEND_ONLY,PRELOAD_BACKEND_ONLY" default:"false" help:"Do not launch the API services, only the preloaded models / backends are started (useful for multi-node setups)" group:"backends"` - ExternalGRPCBackends []string `env:"LOCALAI_EXTERNAL_GRPC_BACKENDS,EXTERNAL_GRPC_BACKENDS" help:"A list of external grpc backends" group:"backends"` - EnableWatchdogIdle bool `env:"LOCALAI_WATCHDOG_IDLE,WATCHDOG_IDLE" default:"false" help:"Enable watchdog for stopping backends that are idle longer than the watchdog-idle-timeout" group:"backends"` - WatchdogIdleTimeout string `env:"LOCALAI_WATCHDOG_IDLE_TIMEOUT,WATCHDOG_IDLE_TIMEOUT" default:"15m" help:"Threshold beyond which an idle backend should be stopped" group:"backends"` - EnableWatchdogBusy bool `env:"LOCALAI_WATCHDOG_BUSY,WATCHDOG_BUSY" default:"false" help:"Enable watchdog for stopping backends that are busy longer than the watchdog-busy-timeout" group:"backends"` - WatchdogBusyTimeout string `env:"LOCALAI_WATCHDOG_BUSY_TIMEOUT,WATCHDOG_BUSY_TIMEOUT" default:"5m" help:"Threshold beyond which a busy backend should be stopped" group:"backends"` - Federated bool `env:"LOCALAI_FEDERATED,FEDERATED" help:"Enable federated instance" group:"federated"` - DisableGalleryEndpoint bool `env:"LOCALAI_DISABLE_GALLERY_ENDPOINT,DISABLE_GALLERY_ENDPOINT" help:"Disable the gallery endpoints" group:"api"` + Address string `env:"LOCALAI_ADDRESS,ADDRESS" default:":8080" help:"Bind address for the API server" group:"api"` + CORS bool `env:"LOCALAI_CORS,CORS" help:"" group:"api"` + CORSAllowOrigins string `env:"LOCALAI_CORS_ALLOW_ORIGINS,CORS_ALLOW_ORIGINS" group:"api"` + LibraryPath string `env:"LOCALAI_LIBRARY_PATH,LIBRARY_PATH" help:"Path to the library directory (for e.g. external libraries used by backends)" default:"/usr/share/local-ai/libs" group:"backends"` + CSRF bool `env:"LOCALAI_CSRF" help:"Enables fiber CSRF middleware" group:"api"` + UploadLimit int `env:"LOCALAI_UPLOAD_LIMIT,UPLOAD_LIMIT" default:"15" help:"Default upload-limit in MB" group:"api"` + APIKeys []string `env:"LOCALAI_API_KEY,API_KEY" help:"List of API Keys to enable API authentication. 
When this is set, all the requests must be authenticated with one of these API keys" group:"api"` + DisableWebUI bool `env:"LOCALAI_DISABLE_WEBUI,DISABLE_WEBUI" default:"false" help:"Disable webui" group:"api"` + DisablePredownloadScan bool `env:"LOCALAI_DISABLE_PREDOWNLOAD_SCAN" help:"If true, disables the best-effort security scanner before downloading any files." group:"hardening" default:"false"` + OpaqueErrors bool `env:"LOCALAI_OPAQUE_ERRORS" default:"false" help:"If true, all error responses are replaced with blank 500 errors. This is intended only for hardening against information leaks and is normally not recommended." group:"hardening"` + UseSubtleKeyComparison bool `env:"LOCALAI_SUBTLE_KEY_COMPARISON" default:"false" help:"If true, API Key validation comparisons will be performed using constant-time comparisons rather than simple equality. This trades off performance on each request for resiliency against timing attacks." group:"hardening"` + DisableApiKeyRequirementForHttpGet bool `env:"LOCALAI_DISABLE_API_KEY_REQUIREMENT_FOR_HTTP_GET" default:"false" help:"If true, a valid API key is not required to issue GET requests to portions of the web ui. This should only be enabled in secure testing environments" group:"hardening"` + HttpGetExemptedEndpoints []string `env:"LOCALAI_HTTP_GET_EXEMPTED_ENDPOINTS" default:"^/$,^/browse/?$,^/talk/?$,^/p2p/?$,^/chat/?$,^/text2image/?$,^/tts/?$,^/static/.*$,^/swagger.*$" help:"If LOCALAI_DISABLE_API_KEY_REQUIREMENT_FOR_HTTP_GET is overridden to true, this is the list of endpoints to exempt. Only adjust this in case of a security incident or as a result of a personal security posture review" group:"hardening"` + Peer2Peer bool `env:"LOCALAI_P2P,P2P" name:"p2p" default:"false" help:"Enable P2P mode" group:"p2p"` + Peer2PeerDHTInterval int `env:"LOCALAI_P2P_DHT_INTERVAL,P2P_DHT_INTERVAL" default:"360" name:"p2p-dht-interval" help:"Interval for DHT refresh (used during token generation)" group:"p2p"` + Peer2PeerOTPInterval int `env:"LOCALAI_P2P_OTP_INTERVAL,P2P_OTP_INTERVAL" default:"9000" name:"p2p-otp-interval" help:"Interval for OTP refresh (used during token generation)" group:"p2p"` + Peer2PeerToken string `env:"LOCALAI_P2P_TOKEN,P2P_TOKEN,TOKEN" name:"p2ptoken" help:"Token for P2P mode (optional)" group:"p2p"` + Peer2PeerNetworkID string `env:"LOCALAI_P2P_NETWORK_ID,P2P_NETWORK_ID" help:"Network ID for P2P mode, can be set arbitrarily by the user for grouping a set of instances" group:"p2p"` + ParallelRequests bool `env:"LOCALAI_PARALLEL_REQUESTS,PARALLEL_REQUESTS" help:"Enable backends to handle multiple requests in parallel if they support it (e.g.: llama.cpp or vllm)" group:"backends"` + SingleActiveBackend bool `env:"LOCALAI_SINGLE_ACTIVE_BACKEND,SINGLE_ACTIVE_BACKEND" help:"Allow only one backend to be run at a time" group:"backends"` + PreloadBackendOnly bool `env:"LOCALAI_PRELOAD_BACKEND_ONLY,PRELOAD_BACKEND_ONLY" default:"false" help:"Do not launch the API services, only the preloaded models / backends are started (useful for multi-node setups)" group:"backends"` + ExternalGRPCBackends []string `env:"LOCALAI_EXTERNAL_GRPC_BACKENDS,EXTERNAL_GRPC_BACKENDS" help:"A list of external grpc backends" group:"backends"` + EnableWatchdogIdle bool `env:"LOCALAI_WATCHDOG_IDLE,WATCHDOG_IDLE" default:"false" help:"Enable watchdog for stopping backends that are idle longer than the watchdog-idle-timeout" group:"backends"` + WatchdogIdleTimeout string `env:"LOCALAI_WATCHDOG_IDLE_TIMEOUT,WATCHDOG_IDLE_TIMEOUT" default:"15m" help:"Threshold beyond which
an idle backend should be stopped" group:"backends"` + EnableWatchdogBusy bool `env:"LOCALAI_WATCHDOG_BUSY,WATCHDOG_BUSY" default:"false" help:"Enable watchdog for stopping backends that are busy longer than the watchdog-busy-timeout" group:"backends"` + WatchdogBusyTimeout string `env:"LOCALAI_WATCHDOG_BUSY_TIMEOUT,WATCHDOG_BUSY_TIMEOUT" default:"5m" help:"Threshold beyond which a busy backend should be stopped" group:"backends"` + Federated bool `env:"LOCALAI_FEDERATED,FEDERATED" help:"Enable federated instance" group:"federated"` + DisableGalleryEndpoint bool `env:"LOCALAI_DISABLE_GALLERY_ENDPOINT,DISABLE_GALLERY_ENDPOINT" help:"Disable the gallery endpoints" group:"api"` } func (r *RunCMD) Run(ctx *cliContext.Context) error { @@ -97,6 +100,9 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error { config.WithModelsURL(append(r.Models, r.ModelArgs...)...), config.WithOpaqueErrors(r.OpaqueErrors), config.WithEnforcedPredownloadScans(!r.DisablePredownloadScan), + config.WithSubtleKeyComparison(r.UseSubtleKeyComparison), + config.WithDisableApiKeyRequirementForHttpGet(r.DisableApiKeyRequirementForHttpGet), + config.WithHttpGetExemptedEndpoints(r.HttpGetExemptedEndpoints), config.WithP2PNetworkID(r.Peer2PeerNetworkID), } diff --git a/core/config/application_config.go b/core/config/application_config.go index 947c4f13..afbf325f 100644 --- a/core/config/application_config.go +++ b/core/config/application_config.go @@ -4,6 +4,7 @@ import ( "context" "embed" "encoding/json" + "regexp" "time" "github.com/mudler/LocalAI/pkg/xsysinfo" @@ -16,7 +17,6 @@ type ApplicationConfig struct { ModelPath string LibPath string UploadLimitMB, Threads, ContextSize int - DisableWebUI bool F16 bool Debug bool ImageDir string @@ -31,11 +31,17 @@ type ApplicationConfig struct { PreloadModelsFromPath string CORSAllowOrigins string ApiKeys []string - EnforcePredownloadScans bool - OpaqueErrors bool P2PToken string P2PNetworkID string + DisableWebUI bool + EnforcePredownloadScans bool + OpaqueErrors bool + UseSubtleKeyComparison bool + DisableApiKeyRequirementForHttpGet bool + HttpGetExemptedEndpoints []*regexp.Regexp + DisableGalleryEndpoint bool + ModelLibraryURL string Galleries []Gallery @@ -57,8 +63,6 @@ type ApplicationConfig struct { ModelsURL []string WatchDogBusyTimeout, WatchDogIdleTimeout time.Duration - - DisableGalleryEndpoint bool } type AppOption func(*ApplicationConfig) @@ -327,6 +331,32 @@ func WithOpaqueErrors(opaque bool) AppOption { } } +func WithSubtleKeyComparison(subtle bool) AppOption { + return func(o *ApplicationConfig) { + o.UseSubtleKeyComparison = subtle + } +} + +func WithDisableApiKeyRequirementForHttpGet(required bool) AppOption { + return func(o *ApplicationConfig) { + o.DisableApiKeyRequirementForHttpGet = required + } +} + +func WithHttpGetExemptedEndpoints(endpoints []string) AppOption { + return func(o *ApplicationConfig) { + o.HttpGetExemptedEndpoints = []*regexp.Regexp{} + for _, epr := range endpoints { + r, err := regexp.Compile(epr) + if err == nil && r != nil { + o.HttpGetExemptedEndpoints = append(o.HttpGetExemptedEndpoints, r) + } else { + log.Warn().Err(err).Str("regex", epr).Msg("Error while compiling HTTP Get Exemption regex, skipping this entry.") + } + } + } +} + // ToConfigLoaderOptions returns a slice of ConfigLoader Option. // Some options defined at the application level are going to be passed as defaults for // all the configuration for the models. 
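As context for the application_config.go hunk above: a minimal standalone Go sketch (not part of the patch) of how the default LOCALAI_HTTP_GET_EXEMPTED_ENDPOINTS patterns behave once WithHttpGetExemptedEndpoints has compiled them and the middleware's GET filter consults them. The sample request paths are illustrative only.

package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Default exemption patterns from the new RunCMD flag; the real option
	// skips invalid entries with a warning instead of failing.
	patterns := []string{
		"^/$", "^/browse/?$", "^/talk/?$", "^/p2p/?$", "^/chat/?$",
		"^/text2image/?$", "^/tts/?$", "^/static/.*$", "^/swagger.*$",
	}
	var exempt []*regexp.Regexp
	for _, p := range patterns {
		if r, err := regexp.Compile(p); err == nil {
			exempt = append(exempt, r)
		}
	}
	// Mirror of the filter logic: a GET request bypasses key auth only when
	// its path matches one of the compiled patterns.
	for _, path := range []string{"/", "/browse", "/v1/models", "/static/app.js"} {
		bypass := false
		for _, rx := range exempt {
			if rx.MatchString(path) {
				bypass = true
				break
			}
		}
		fmt.Printf("GET %s -> exempt from API key: %v\n", path, bypass)
	}
}

Note that the filter only ever fires for GET requests, so every other HTTP method still requires a key whenever keys are configured.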
diff --git a/core/http/app.go b/core/http/app.go index 6eb9c956..fa9cd866 100644 --- a/core/http/app.go +++ b/core/http/app.go @@ -3,13 +3,15 @@ package http import ( "embed" "errors" + "fmt" "net/http" - "strings" + "github.com/dave-gray101/v2keyauth" "github.com/mudler/LocalAI/pkg/utils" "github.com/mudler/LocalAI/core/http/endpoints/localai" "github.com/mudler/LocalAI/core/http/endpoints/openai" + "github.com/mudler/LocalAI/core/http/middleware" "github.com/mudler/LocalAI/core/http/routes" "github.com/mudler/LocalAI/core/config" @@ -137,37 +139,14 @@ func App(cl *config.BackendConfigLoader, ml *model.ModelLoader, appConfig *confi }) } - // Auth middleware checking if API key is valid. If no API key is set, no auth is required. - auth := func(c *fiber.Ctx) error { - if len(appConfig.ApiKeys) == 0 { - return c.Next() - } - - if len(appConfig.ApiKeys) == 0 { - return c.Next() - } - - authHeader := readAuthHeader(c) - if authHeader == "" { - return c.Status(fiber.StatusUnauthorized).JSON(fiber.Map{"message": "Authorization header missing"}) - } - - // If it's a bearer token - authHeaderParts := strings.Split(authHeader, " ") - if len(authHeaderParts) != 2 || authHeaderParts[0] != "Bearer" { - return c.Status(fiber.StatusUnauthorized).JSON(fiber.Map{"message": "Invalid Authorization header format"}) - } - - apiKey := authHeaderParts[1] - for _, key := range appConfig.ApiKeys { - if apiKey == key { - return c.Next() - } - } - - return c.Status(fiber.StatusUnauthorized).JSON(fiber.Map{"message": "Invalid API key"}) + kaConfig, err := middleware.GetKeyAuthConfig(appConfig) + if err != nil || kaConfig == nil { + return nil, fmt.Errorf("failed to create key auth config: %w", err) } + // Auth is applied to _all_ endpoints. No exceptions. Filtering out endpoints to bypass is the role of the Filter property of the KeyAuth Configuration + app.Use(v2keyauth.New(*kaConfig)) + if appConfig.CORS { var c func(ctx *fiber.Ctx) error if appConfig.CORSAllowOrigins == "" { @@ -192,13 +171,13 @@ func App(cl *config.BackendConfigLoader, ml *model.ModelLoader, appConfig *confi galleryService := services.NewGalleryService(appConfig) galleryService.Start(appConfig.Context, cl) - routes.RegisterElevenLabsRoutes(app, cl, ml, appConfig, auth) - routes.RegisterLocalAIRoutes(app, cl, ml, appConfig, galleryService, auth) - routes.RegisterOpenAIRoutes(app, cl, ml, appConfig, auth) + routes.RegisterElevenLabsRoutes(app, cl, ml, appConfig) + routes.RegisterLocalAIRoutes(app, cl, ml, appConfig, galleryService) + routes.RegisterOpenAIRoutes(app, cl, ml, appConfig) if !appConfig.DisableWebUI { - routes.RegisterUIRoutes(app, cl, ml, appConfig, galleryService, auth) + routes.RegisterUIRoutes(app, cl, ml, appConfig, galleryService) } - routes.RegisterJINARoutes(app, cl, ml, appConfig, auth) + routes.RegisterJINARoutes(app, cl, ml, appConfig) httpFS := http.FS(embedDirStatic) diff --git a/core/http/middleware/auth.go b/core/http/middleware/auth.go new file mode 100644 index 00000000..bc8bcf80 --- /dev/null +++ b/core/http/middleware/auth.go @@ -0,0 +1,93 @@ +package middleware + +import ( + "crypto/subtle" + "errors" + + "github.com/dave-gray101/v2keyauth" + "github.com/gofiber/fiber/v2" + "github.com/gofiber/fiber/v2/middleware/keyauth" + "github.com/mudler/LocalAI/core/config" +) + +// This file contains the configuration generators and handler functions that are used along with the fiber/keyauth middleware +// Currently this requires an upstream patch - and feature patches are no longer accepted to v2 +// Therefore 
`dave-gray101/v2keyauth` contains the v2 backport of the middleware until v3 stabilizes and we migrate. + +func GetKeyAuthConfig(applicationConfig *config.ApplicationConfig) (*v2keyauth.Config, error) { + customLookup, err := v2keyauth.MultipleKeySourceLookup([]string{"header:Authorization", "header:x-api-key", "header:xi-api-key"}, keyauth.ConfigDefault.AuthScheme) + if err != nil { + return nil, err + } + + return &v2keyauth.Config{ + CustomKeyLookup: customLookup, + Next: getApiKeyRequiredFilterFunction(applicationConfig), + Validator: getApiKeyValidationFunction(applicationConfig), + ErrorHandler: getApiKeyErrorHandler(applicationConfig), + AuthScheme: "Bearer", + }, nil +} + +func getApiKeyErrorHandler(applicationConfig *config.ApplicationConfig) fiber.ErrorHandler { + return func(ctx *fiber.Ctx, err error) error { + if errors.Is(err, v2keyauth.ErrMissingOrMalformedAPIKey) { + if len(applicationConfig.ApiKeys) == 0 { + return ctx.Next() // if no keys are set up, any error we get here is not an error. + } + if applicationConfig.OpaqueErrors { + return ctx.SendStatus(403) + } + } + if applicationConfig.OpaqueErrors { + return ctx.SendStatus(500) + } + return err + } +} + +func getApiKeyValidationFunction(applicationConfig *config.ApplicationConfig) func(*fiber.Ctx, string) (bool, error) { + + if applicationConfig.UseSubtleKeyComparison { + return func(ctx *fiber.Ctx, apiKey string) (bool, error) { + if len(applicationConfig.ApiKeys) == 0 { + return true, nil // If no keys are setup, accept everything + } + for _, validKey := range applicationConfig.ApiKeys { + if subtle.ConstantTimeCompare([]byte(apiKey), []byte(validKey)) == 1 { + return true, nil + } + } + return false, v2keyauth.ErrMissingOrMalformedAPIKey + } + } + + return func(ctx *fiber.Ctx, apiKey string) (bool, error) { + if len(applicationConfig.ApiKeys) == 0 { + return true, nil // If no keys are setup, accept everything + } + for _, validKey := range applicationConfig.ApiKeys { + if apiKey == validKey { + return true, nil + } + } + return false, v2keyauth.ErrMissingOrMalformedAPIKey + } +} + +func getApiKeyRequiredFilterFunction(applicationConfig *config.ApplicationConfig) func(*fiber.Ctx) bool { + if applicationConfig.DisableApiKeyRequirementForHttpGet { + return func(c *fiber.Ctx) bool { + if c.Method() != "GET" { + return false + } + for _, rx := range applicationConfig.HttpGetExemptedEndpoints { + if rx.MatchString(c.Path()) { + return true + } + } + return false + } + } + return func(c *fiber.Ctx) bool { return false } +} \ No newline at end of file diff --git a/core/http/routes/elevenlabs.go b/core/http/routes/elevenlabs.go index b20dec75..73387c7b 100644 --- a/core/http/routes/elevenlabs.go +++ b/core/http/routes/elevenlabs.go @@ -10,12 +10,11 @@ import ( func RegisterElevenLabsRoutes(app *fiber.App, cl *config.BackendConfigLoader, ml *model.ModelLoader, - appConfig *config.ApplicationConfig, - auth func(*fiber.Ctx) error) { + appConfig *config.ApplicationConfig) { // Elevenlabs - app.Post("/v1/text-to-speech/:voice-id", auth, elevenlabs.TTSEndpoint(cl, ml, appConfig)) + app.Post("/v1/text-to-speech/:voice-id", elevenlabs.TTSEndpoint(cl, ml, appConfig)) - app.Post("/v1/sound-generation", auth, elevenlabs.SoundGenerationEndpoint(cl, ml, appConfig)) + app.Post("/v1/sound-generation", elevenlabs.SoundGenerationEndpoint(cl, ml, appConfig)) } diff --git a/core/http/routes/jina.go b/core/http/routes/jina.go index 92f29224..93125e6c 100644 --- a/core/http/routes/jina.go +++ b/core/http/routes/jina.go @@ -11,8 +11,7 @@ import 
( func RegisterJINARoutes(app *fiber.App, cl *config.BackendConfigLoader, ml *model.ModelLoader, - appConfig *config.ApplicationConfig, - auth func(*fiber.Ctx) error) { + appConfig *config.ApplicationConfig) { // POST endpoint to mimic the reranking app.Post("/v1/rerank", jina.JINARerankEndpoint(cl, ml, appConfig)) diff --git a/core/http/routes/localai.go b/core/http/routes/localai.go index f85fa807..29fef378 100644 --- a/core/http/routes/localai.go +++ b/core/http/routes/localai.go @@ -15,33 +15,32 @@ func RegisterLocalAIRoutes(app *fiber.App, cl *config.BackendConfigLoader, ml *model.ModelLoader, appConfig *config.ApplicationConfig, - galleryService *services.GalleryService, - auth func(*fiber.Ctx) error) { + galleryService *services.GalleryService) { app.Get("/swagger/*", swagger.HandlerDefault) // default // LocalAI API endpoints if !appConfig.DisableGalleryEndpoint { modelGalleryEndpointService := localai.CreateModelGalleryEndpointService(appConfig.Galleries, appConfig.ModelPath, galleryService) - app.Post("/models/apply", auth, modelGalleryEndpointService.ApplyModelGalleryEndpoint()) - app.Post("/models/delete/:name", auth, modelGalleryEndpointService.DeleteModelGalleryEndpoint()) + app.Post("/models/apply", modelGalleryEndpointService.ApplyModelGalleryEndpoint()) + app.Post("/models/delete/:name", modelGalleryEndpointService.DeleteModelGalleryEndpoint()) - app.Get("/models/available", auth, modelGalleryEndpointService.ListModelFromGalleryEndpoint()) - app.Get("/models/galleries", auth, modelGalleryEndpointService.ListModelGalleriesEndpoint()) - app.Post("/models/galleries", auth, modelGalleryEndpointService.AddModelGalleryEndpoint()) - app.Delete("/models/galleries", auth, modelGalleryEndpointService.RemoveModelGalleryEndpoint()) - app.Get("/models/jobs/:uuid", auth, modelGalleryEndpointService.GetOpStatusEndpoint()) - app.Get("/models/jobs", auth, modelGalleryEndpointService.GetAllStatusEndpoint()) + app.Get("/models/available", modelGalleryEndpointService.ListModelFromGalleryEndpoint()) + app.Get("/models/galleries", modelGalleryEndpointService.ListModelGalleriesEndpoint()) + app.Post("/models/galleries", modelGalleryEndpointService.AddModelGalleryEndpoint()) + app.Delete("/models/galleries", modelGalleryEndpointService.RemoveModelGalleryEndpoint()) + app.Get("/models/jobs/:uuid", modelGalleryEndpointService.GetOpStatusEndpoint()) + app.Get("/models/jobs", modelGalleryEndpointService.GetAllStatusEndpoint()) } - app.Post("/tts", auth, localai.TTSEndpoint(cl, ml, appConfig)) + app.Post("/tts", localai.TTSEndpoint(cl, ml, appConfig)) // Stores sl := model.NewModelLoader("") - app.Post("/stores/set", auth, localai.StoresSetEndpoint(sl, appConfig)) - app.Post("/stores/delete", auth, localai.StoresDeleteEndpoint(sl, appConfig)) - app.Post("/stores/get", auth, localai.StoresGetEndpoint(sl, appConfig)) - app.Post("/stores/find", auth, localai.StoresFindEndpoint(sl, appConfig)) + app.Post("/stores/set", localai.StoresSetEndpoint(sl, appConfig)) + app.Post("/stores/delete", localai.StoresDeleteEndpoint(sl, appConfig)) + app.Post("/stores/get", localai.StoresGetEndpoint(sl, appConfig)) + app.Post("/stores/find", localai.StoresFindEndpoint(sl, appConfig)) // Kubernetes health checks ok := func(c *fiber.Ctx) error { @@ -51,20 +50,20 @@ func RegisterLocalAIRoutes(app *fiber.App, app.Get("/healthz", ok) app.Get("/readyz", ok) - app.Get("/metrics", auth, localai.LocalAIMetricsEndpoint()) + app.Get("/metrics", localai.LocalAIMetricsEndpoint()) // Experimental Backend Statistics Module 
backendMonitorService := services.NewBackendMonitorService(ml, cl, appConfig) // Split out for now - app.Get("/backend/monitor", auth, localai.BackendMonitorEndpoint(backendMonitorService)) - app.Post("/backend/shutdown", auth, localai.BackendShutdownEndpoint(backendMonitorService)) + app.Get("/backend/monitor", localai.BackendMonitorEndpoint(backendMonitorService)) + app.Post("/backend/shutdown", localai.BackendShutdownEndpoint(backendMonitorService)) // p2p if p2p.IsP2PEnabled() { - app.Get("/api/p2p", auth, localai.ShowP2PNodes(appConfig)) - app.Get("/api/p2p/token", auth, localai.ShowP2PToken(appConfig)) + app.Get("/api/p2p", localai.ShowP2PNodes(appConfig)) + app.Get("/api/p2p/token", localai.ShowP2PToken(appConfig)) } - app.Get("/version", auth, func(c *fiber.Ctx) error { + app.Get("/version", func(c *fiber.Ctx) error { return c.JSON(struct { Version string `json:"version"` }{Version: internal.PrintableVersion()}) diff --git a/core/http/routes/openai.go b/core/http/routes/openai.go index e190bc6d..081daf70 100644 --- a/core/http/routes/openai.go +++ b/core/http/routes/openai.go @@ -11,66 +11,65 @@ import ( func RegisterOpenAIRoutes(app *fiber.App, cl *config.BackendConfigLoader, ml *model.ModelLoader, - appConfig *config.ApplicationConfig, - auth func(*fiber.Ctx) error) { + appConfig *config.ApplicationConfig) { // openAI compatible API endpoint // chat - app.Post("/v1/chat/completions", auth, openai.ChatEndpoint(cl, ml, appConfig)) - app.Post("/chat/completions", auth, openai.ChatEndpoint(cl, ml, appConfig)) + app.Post("/v1/chat/completions", openai.ChatEndpoint(cl, ml, appConfig)) + app.Post("/chat/completions", openai.ChatEndpoint(cl, ml, appConfig)) // edit - app.Post("/v1/edits", auth, openai.EditEndpoint(cl, ml, appConfig)) - app.Post("/edits", auth, openai.EditEndpoint(cl, ml, appConfig)) + app.Post("/v1/edits", openai.EditEndpoint(cl, ml, appConfig)) + app.Post("/edits", openai.EditEndpoint(cl, ml, appConfig)) // assistant - app.Get("/v1/assistants", auth, openai.ListAssistantsEndpoint(cl, ml, appConfig)) - app.Get("/assistants", auth, openai.ListAssistantsEndpoint(cl, ml, appConfig)) - app.Post("/v1/assistants", auth, openai.CreateAssistantEndpoint(cl, ml, appConfig)) - app.Post("/assistants", auth, openai.CreateAssistantEndpoint(cl, ml, appConfig)) - app.Delete("/v1/assistants/:assistant_id", auth, openai.DeleteAssistantEndpoint(cl, ml, appConfig)) - app.Delete("/assistants/:assistant_id", auth, openai.DeleteAssistantEndpoint(cl, ml, appConfig)) - app.Get("/v1/assistants/:assistant_id", auth, openai.GetAssistantEndpoint(cl, ml, appConfig)) - app.Get("/assistants/:assistant_id", auth, openai.GetAssistantEndpoint(cl, ml, appConfig)) - app.Post("/v1/assistants/:assistant_id", auth, openai.ModifyAssistantEndpoint(cl, ml, appConfig)) - app.Post("/assistants/:assistant_id", auth, openai.ModifyAssistantEndpoint(cl, ml, appConfig)) - app.Get("/v1/assistants/:assistant_id/files", auth, openai.ListAssistantFilesEndpoint(cl, ml, appConfig)) - app.Get("/assistants/:assistant_id/files", auth, openai.ListAssistantFilesEndpoint(cl, ml, appConfig)) - app.Post("/v1/assistants/:assistant_id/files", auth, openai.CreateAssistantFileEndpoint(cl, ml, appConfig)) - app.Post("/assistants/:assistant_id/files", auth, openai.CreateAssistantFileEndpoint(cl, ml, appConfig)) - app.Delete("/v1/assistants/:assistant_id/files/:file_id", auth, openai.DeleteAssistantFileEndpoint(cl, ml, appConfig)) - app.Delete("/assistants/:assistant_id/files/:file_id", auth, openai.DeleteAssistantFileEndpoint(cl, ml, 
appConfig)) - app.Get("/v1/assistants/:assistant_id/files/:file_id", auth, openai.GetAssistantFileEndpoint(cl, ml, appConfig)) - app.Get("/assistants/:assistant_id/files/:file_id", auth, openai.GetAssistantFileEndpoint(cl, ml, appConfig)) + app.Get("/v1/assistants", openai.ListAssistantsEndpoint(cl, ml, appConfig)) + app.Get("/assistants", openai.ListAssistantsEndpoint(cl, ml, appConfig)) + app.Post("/v1/assistants", openai.CreateAssistantEndpoint(cl, ml, appConfig)) + app.Post("/assistants", openai.CreateAssistantEndpoint(cl, ml, appConfig)) + app.Delete("/v1/assistants/:assistant_id", openai.DeleteAssistantEndpoint(cl, ml, appConfig)) + app.Delete("/assistants/:assistant_id", openai.DeleteAssistantEndpoint(cl, ml, appConfig)) + app.Get("/v1/assistants/:assistant_id", openai.GetAssistantEndpoint(cl, ml, appConfig)) + app.Get("/assistants/:assistant_id", openai.GetAssistantEndpoint(cl, ml, appConfig)) + app.Post("/v1/assistants/:assistant_id", openai.ModifyAssistantEndpoint(cl, ml, appConfig)) + app.Post("/assistants/:assistant_id", openai.ModifyAssistantEndpoint(cl, ml, appConfig)) + app.Get("/v1/assistants/:assistant_id/files", openai.ListAssistantFilesEndpoint(cl, ml, appConfig)) + app.Get("/assistants/:assistant_id/files", openai.ListAssistantFilesEndpoint(cl, ml, appConfig)) + app.Post("/v1/assistants/:assistant_id/files", openai.CreateAssistantFileEndpoint(cl, ml, appConfig)) + app.Post("/assistants/:assistant_id/files", openai.CreateAssistantFileEndpoint(cl, ml, appConfig)) + app.Delete("/v1/assistants/:assistant_id/files/:file_id", openai.DeleteAssistantFileEndpoint(cl, ml, appConfig)) + app.Delete("/assistants/:assistant_id/files/:file_id", openai.DeleteAssistantFileEndpoint(cl, ml, appConfig)) + app.Get("/v1/assistants/:assistant_id/files/:file_id", openai.GetAssistantFileEndpoint(cl, ml, appConfig)) + app.Get("/assistants/:assistant_id/files/:file_id", openai.GetAssistantFileEndpoint(cl, ml, appConfig)) // files - app.Post("/v1/files", auth, openai.UploadFilesEndpoint(cl, appConfig)) - app.Post("/files", auth, openai.UploadFilesEndpoint(cl, appConfig)) - app.Get("/v1/files", auth, openai.ListFilesEndpoint(cl, appConfig)) - app.Get("/files", auth, openai.ListFilesEndpoint(cl, appConfig)) - app.Get("/v1/files/:file_id", auth, openai.GetFilesEndpoint(cl, appConfig)) - app.Get("/files/:file_id", auth, openai.GetFilesEndpoint(cl, appConfig)) - app.Delete("/v1/files/:file_id", auth, openai.DeleteFilesEndpoint(cl, appConfig)) - app.Delete("/files/:file_id", auth, openai.DeleteFilesEndpoint(cl, appConfig)) - app.Get("/v1/files/:file_id/content", auth, openai.GetFilesContentsEndpoint(cl, appConfig)) - app.Get("/files/:file_id/content", auth, openai.GetFilesContentsEndpoint(cl, appConfig)) + app.Post("/v1/files", openai.UploadFilesEndpoint(cl, appConfig)) + app.Post("/files", openai.UploadFilesEndpoint(cl, appConfig)) + app.Get("/v1/files", openai.ListFilesEndpoint(cl, appConfig)) + app.Get("/files", openai.ListFilesEndpoint(cl, appConfig)) + app.Get("/v1/files/:file_id", openai.GetFilesEndpoint(cl, appConfig)) + app.Get("/files/:file_id", openai.GetFilesEndpoint(cl, appConfig)) + app.Delete("/v1/files/:file_id", openai.DeleteFilesEndpoint(cl, appConfig)) + app.Delete("/files/:file_id", openai.DeleteFilesEndpoint(cl, appConfig)) + app.Get("/v1/files/:file_id/content", openai.GetFilesContentsEndpoint(cl, appConfig)) + app.Get("/files/:file_id/content", openai.GetFilesContentsEndpoint(cl, appConfig)) // completion - app.Post("/v1/completions", auth, openai.CompletionEndpoint(cl, ml, 
appConfig)) - app.Post("/completions", auth, openai.CompletionEndpoint(cl, ml, appConfig)) - app.Post("/v1/engines/:model/completions", auth, openai.CompletionEndpoint(cl, ml, appConfig)) + app.Post("/v1/completions", openai.CompletionEndpoint(cl, ml, appConfig)) + app.Post("/completions", openai.CompletionEndpoint(cl, ml, appConfig)) + app.Post("/v1/engines/:model/completions", openai.CompletionEndpoint(cl, ml, appConfig)) // embeddings - app.Post("/v1/embeddings", auth, openai.EmbeddingsEndpoint(cl, ml, appConfig)) - app.Post("/embeddings", auth, openai.EmbeddingsEndpoint(cl, ml, appConfig)) - app.Post("/v1/engines/:model/embeddings", auth, openai.EmbeddingsEndpoint(cl, ml, appConfig)) + app.Post("/v1/embeddings", openai.EmbeddingsEndpoint(cl, ml, appConfig)) + app.Post("/embeddings", openai.EmbeddingsEndpoint(cl, ml, appConfig)) + app.Post("/v1/engines/:model/embeddings", openai.EmbeddingsEndpoint(cl, ml, appConfig)) // audio - app.Post("/v1/audio/transcriptions", auth, openai.TranscriptEndpoint(cl, ml, appConfig)) - app.Post("/v1/audio/speech", auth, localai.TTSEndpoint(cl, ml, appConfig)) + app.Post("/v1/audio/transcriptions", openai.TranscriptEndpoint(cl, ml, appConfig)) + app.Post("/v1/audio/speech", localai.TTSEndpoint(cl, ml, appConfig)) // images - app.Post("/v1/images/generations", auth, openai.ImageEndpoint(cl, ml, appConfig)) + app.Post("/v1/images/generations", openai.ImageEndpoint(cl, ml, appConfig)) if appConfig.ImageDir != "" { app.Static("/generated-images", appConfig.ImageDir) @@ -81,6 +80,6 @@ func RegisterOpenAIRoutes(app *fiber.App, } // List models - app.Get("/v1/models", auth, openai.ListModelsEndpoint(cl, ml)) - app.Get("/models", auth, openai.ListModelsEndpoint(cl, ml)) + app.Get("/v1/models", openai.ListModelsEndpoint(cl, ml)) + app.Get("/models", openai.ListModelsEndpoint(cl, ml)) } diff --git a/core/http/routes/ui.go b/core/http/routes/ui.go index 6dfb3f43..7b2c6ae7 100644 --- a/core/http/routes/ui.go +++ b/core/http/routes/ui.go @@ -59,8 +59,7 @@ func RegisterUIRoutes(app *fiber.App, cl *config.BackendConfigLoader, ml *model.ModelLoader, appConfig *config.ApplicationConfig, - galleryService *services.GalleryService, - auth func(*fiber.Ctx) error) { + galleryService *services.GalleryService) { // keeps the state of models that are being installed from the UI var processingModels = NewModelOpCache() @@ -85,10 +84,10 @@ func RegisterUIRoutes(app *fiber.App, return processingModelsData, taskTypes } - app.Get("/", auth, localai.WelcomeEndpoint(appConfig, cl, ml, modelStatus)) + app.Get("/", localai.WelcomeEndpoint(appConfig, cl, ml, modelStatus)) if p2p.IsP2PEnabled() { - app.Get("/p2p", auth, func(c *fiber.Ctx) error { + app.Get("/p2p", func(c *fiber.Ctx) error { summary := fiber.Map{ "Title": "LocalAI - P2P dashboard", "Version": internal.PrintableVersion(), @@ -104,17 +103,17 @@ func RegisterUIRoutes(app *fiber.App, }) /* show nodes live! 
*/ - app.Get("/p2p/ui/workers", auth, func(c *fiber.Ctx) error { + app.Get("/p2p/ui/workers", func(c *fiber.Ctx) error { return c.SendString(elements.P2PNodeBoxes(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.WorkerID)))) }) - app.Get("/p2p/ui/workers-federation", auth, func(c *fiber.Ctx) error { + app.Get("/p2p/ui/workers-federation", func(c *fiber.Ctx) error { return c.SendString(elements.P2PNodeBoxes(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.FederatedID)))) }) - app.Get("/p2p/ui/workers-stats", auth, func(c *fiber.Ctx) error { + app.Get("/p2p/ui/workers-stats", func(c *fiber.Ctx) error { return c.SendString(elements.P2PNodeStats(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.WorkerID)))) }) - app.Get("/p2p/ui/workers-federation-stats", auth, func(c *fiber.Ctx) error { + app.Get("/p2p/ui/workers-federation-stats", func(c *fiber.Ctx) error { return c.SendString(elements.P2PNodeStats(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.FederatedID)))) }) } @@ -122,7 +121,7 @@ func RegisterUIRoutes(app *fiber.App, if !appConfig.DisableGalleryEndpoint { // Show the Models page (all models) - app.Get("/browse", auth, func(c *fiber.Ctx) error { + app.Get("/browse", func(c *fiber.Ctx) error { term := c.Query("term") models, _ := gallery.AvailableGalleryModels(appConfig.Galleries, appConfig.ModelPath) @@ -167,7 +166,7 @@ func RegisterUIRoutes(app *fiber.App, // Show the models, filtered from the user input // https://htmx.org/examples/active-search/ - app.Post("/browse/search/models", auth, func(c *fiber.Ctx) error { + app.Post("/browse/search/models", func(c *fiber.Ctx) error { form := struct { Search string `form:"search"` }{} @@ -188,7 +187,7 @@ func RegisterUIRoutes(app *fiber.App, // This route is used when the "Install" button is pressed, we submit here a new job to the gallery service // https://htmx.org/examples/progress-bar/ - app.Post("/browse/install/model/:id", auth, func(c *fiber.Ctx) error { + app.Post("/browse/install/model/:id", func(c *fiber.Ctx) error { galleryID := strings.Clone(c.Params("id")) // note: strings.Clone is required for multiple requests! log.Debug().Msgf("UI job submitted to install : %+v\n", galleryID) @@ -215,7 +214,7 @@ func RegisterUIRoutes(app *fiber.App, // This route is used when the "Install" button is pressed, we submit here a new job to the gallery service // https://htmx.org/examples/progress-bar/ - app.Post("/browse/delete/model/:id", auth, func(c *fiber.Ctx) error { + app.Post("/browse/delete/model/:id", func(c *fiber.Ctx) error { galleryID := strings.Clone(c.Params("id")) // note: strings.Clone is required for multiple requests! log.Debug().Msgf("UI job submitted to delete : %+v\n", galleryID) var galleryName = galleryID @@ -255,7 +254,7 @@ func RegisterUIRoutes(app *fiber.App, // Display the job current progress status // If the job is done, we trigger the /browse/job/:uid route // https://htmx.org/examples/progress-bar/ - app.Get("/browse/job/progress/:uid", auth, func(c *fiber.Ctx) error { + app.Get("/browse/job/progress/:uid", func(c *fiber.Ctx) error { jobUID := strings.Clone(c.Params("uid")) // note: strings.Clone is required for multiple requests! 
status := galleryService.GetStatus(jobUID) @@ -279,7 +278,7 @@ func RegisterUIRoutes(app *fiber.App, // this route is hit when the job is done, and we display the // final state (for now just displays "Installation completed") - app.Get("/browse/job/:uid", auth, func(c *fiber.Ctx) error { + app.Get("/browse/job/:uid", func(c *fiber.Ctx) error { jobUID := strings.Clone(c.Params("uid")) // note: strings.Clone is required for multiple requests! status := galleryService.GetStatus(jobUID) @@ -303,7 +302,7 @@ func RegisterUIRoutes(app *fiber.App, } // Show the Chat page - app.Get("/chat/:model", auth, func(c *fiber.Ctx) error { + app.Get("/chat/:model", func(c *fiber.Ctx) error { backendConfigs, _ := services.ListModels(cl, ml, "", true) summary := fiber.Map{ @@ -318,7 +317,7 @@ func RegisterUIRoutes(app *fiber.App, return c.Render("views/chat", summary) }) - app.Get("/talk/", auth, func(c *fiber.Ctx) error { + app.Get("/talk/", func(c *fiber.Ctx) error { backendConfigs, _ := services.ListModels(cl, ml, "", true) if len(backendConfigs) == 0 { @@ -338,7 +337,7 @@ func RegisterUIRoutes(app *fiber.App, return c.Render("views/talk", summary) }) - app.Get("/chat/", auth, func(c *fiber.Ctx) error { + app.Get("/chat/", func(c *fiber.Ctx) error { backendConfigs, _ := services.ListModels(cl, ml, "", true) @@ -359,7 +358,7 @@ func RegisterUIRoutes(app *fiber.App, return c.Render("views/chat", summary) }) - app.Get("/text2image/:model", auth, func(c *fiber.Ctx) error { + app.Get("/text2image/:model", func(c *fiber.Ctx) error { backendConfigs := cl.GetAllBackendConfigs() summary := fiber.Map{ @@ -374,7 +373,7 @@ func RegisterUIRoutes(app *fiber.App, return c.Render("views/text2image", summary) }) - app.Get("/text2image/", auth, func(c *fiber.Ctx) error { + app.Get("/text2image/", func(c *fiber.Ctx) error { backendConfigs := cl.GetAllBackendConfigs() @@ -395,7 +394,7 @@ func RegisterUIRoutes(app *fiber.App, return c.Render("views/text2image", summary) }) - app.Get("/tts/:model", auth, func(c *fiber.Ctx) error { + app.Get("/tts/:model", func(c *fiber.Ctx) error { backendConfigs := cl.GetAllBackendConfigs() summary := fiber.Map{ @@ -410,7 +409,7 @@ func RegisterUIRoutes(app *fiber.App, return c.Render("views/tts", summary) }) - app.Get("/tts/", auth, func(c *fiber.Ctx) error { + app.Get("/tts/", func(c *fiber.Ctx) error { backendConfigs := cl.GetAllBackendConfigs() diff --git a/go.mod b/go.mod index 57202ad2..a3359abf 100644 --- a/go.mod +++ b/go.mod @@ -74,6 +74,7 @@ require ( cloud.google.com/go/auth/oauth2adapt v0.2.2 // indirect cloud.google.com/go/compute/metadata v0.3.0 // indirect github.com/cpuguy83/go-md2man/v2 v2.0.4 // indirect + github.com/dave-gray101/v2keyauth v0.0.0-20240624150259-c45d584d25e2 // indirect github.com/envoyproxy/protoc-gen-validate v1.0.4 // indirect github.com/felixge/httpsnoop v1.0.4 // indirect github.com/go-task/slim-sprig/v3 v3.0.0 // indirect diff --git a/go.sum b/go.sum index ab64b84a..1dd44a5b 100644 --- a/go.sum +++ b/go.sum @@ -110,6 +110,8 @@ github.com/creachadair/otp v0.4.2 h1:ngNMaD6Tzd7UUNRFyed7ykZFn/Wr5sSs5ffqZWm9pu8 github.com/creachadair/otp v0.4.2/go.mod h1:DqV9hJyUbcUme0pooYfiFvvMe72Aua5sfhNzwfZvk40= github.com/creack/pty v1.1.18 h1:n56/Zwd5o6whRC5PMGretI4IdRLlmBXYNjScPaBgsbY= github.com/creack/pty v1.1.18/go.mod h1:MOBLtS5ELjhRRrroQr9kyvTxUAFNvYEK993ew/Vr4O4= +github.com/dave-gray101/v2keyauth v0.0.0-20240624150259-c45d584d25e2 h1:flLYmnQFZNo04x2NPehMbf30m7Pli57xwZ0NFqR/hb0= +github.com/dave-gray101/v2keyauth v0.0.0-20240624150259-c45d584d25e2/go.mod 
h1:NtWqRzAp/1tw+twkW8uuBenEVVYndEAZACWU3F3xdoQ= github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= From e95cb8eaacdac6426c085197ec5acf790206c042 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 03:33:52 +0000 Subject: [PATCH 018/122] chore(deps): Bump setuptools from 69.5.1 to 75.1.0 in /backend/python/transformers (#3579) chore(deps): Bump setuptools in /backend/python/transformers Bumps [setuptools](https://github.com/pypa/setuptools) from 69.5.1 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v69.5.1...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/transformers/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/transformers/requirements.txt b/backend/python/transformers/requirements.txt index b19c59c0..1b7ebda5 100644 --- a/backend/python/transformers/requirements.txt +++ b/backend/python/transformers/requirements.txt @@ -1,4 +1,4 @@ grpcio==1.66.1 protobuf certifi -setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From f4b1bd8f6d70365e99320e52119cb7ed577b63c9 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 03:41:01 +0000 Subject: [PATCH 019/122] chore(deps): Bump setuptools from 70.3.0 to 75.1.0 in /backend/python/vllm (#3580) chore(deps): Bump setuptools in /backend/python/vllm Bumps [setuptools](https://github.com/pypa/setuptools) from 70.3.0 to 75.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v70.3.0...v75.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 backend/python/vllm/requirements-intel.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/backend/python/vllm/requirements-intel.txt b/backend/python/vllm/requirements-intel.txt
index 7903282e..1f82c46e 100644
--- a/backend/python/vllm/requirements-intel.txt
+++ b/backend/python/vllm/requirements-intel.txt
@@ -4,4 +4,4 @@ accelerate
 torch
 transformers
 optimum[openvino]
-setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406
\ No newline at end of file
+setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406
\ No newline at end of file

From 0e4e101101e92cd6b2451cf71a2f85a880468183 Mon Sep 17 00:00:00 2001
From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com>
Date: Tue, 17 Sep 2024 05:52:15 +0200
Subject: [PATCH 020/122] chore: :arrow_up: Update ggerganov/llama.cpp to
 `23e0d70bacaaca1429d365a44aa9e7434f17823b` (#3581)

:arrow_up: Update ggerganov/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index e4d5b22c..f9fa5476 100644
--- a/Makefile
+++ b/Makefile
@@ -8,7 +8,7 @@ DETECT_LIBS?=true
 # llama.cpp versions
 GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp
 GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be
-CPPLLAMA_VERSION?=6262d13e0b2da91f230129a93a996609a2f5a2f2
+CPPLLAMA_VERSION?=23e0d70bacaaca1429d365a44aa9e7434f17823b

 # go-rwkv version
 RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp

From d0f2bf318103f631686c648d6bb6a299bca15976 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Tue, 17 Sep 2024 06:50:57 +0200
Subject: [PATCH 021/122] fix(shutdown): do not shutdown immediately busy
 backends (#3543)

* fix(shutdown): do not shutdown immediately busy backends

Signed-off-by: Ettore Di Giacinto

* chore(refactor): avoid duplicate functions

Signed-off-by: Ettore Di Giacinto

* fix: multiplicative backoff for shutdown (#3547)

* multiplicative backoff for shutdown

Rather than always retrying every two seconds, back off the shutdown
attempt rate.
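A self-contained sketch of the schedule this introduces, mirroring the
loader.go hunk below (the busy backend itself is stubbed out; only the wait
computation is reproduced):

    package main

    import (
    	"fmt"
    	"time"
    )

    // retryTimeout mirrors the clamp added in pkg/model/loader.go.
    const retryTimeout = 2 * time.Minute

    func main() {
    	// Reproduce the per-attempt wait: 2s, 4s, 6s, ... capped at two
    	// minutes, instead of a flat 2s poll while a backend reports busy.
    	for retries := 1; retries <= 75; retries += 10 {
    		dur := time.Duration(retries*2) * time.Second
    		if dur > retryTimeout {
    			dur = retryTimeout
    		}
    		fmt.Printf("retry %2d: wait %v\n", retries, dur)
    	}
    }

With this schedule a stubborn backend is polled less and less often, while the
two-minute clamp keeps the worst-case gap between attempts bounded.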
Signed-off-by: Dave * Update loader.go Signed-off-by: Dave * add clamp of 2 minutes Signed-off-by: Dave Lee --------- Signed-off-by: Dave Signed-off-by: Dave Lee --------- Signed-off-by: Ettore Di Giacinto Signed-off-by: Dave Signed-off-by: Dave Lee Co-authored-by: Dave --- pkg/model/loader.go | 24 +++++++++++++++++------- pkg/model/process.go | 17 +++++++++-------- 2 files changed, 26 insertions(+), 15 deletions(-) diff --git a/pkg/model/loader.go b/pkg/model/loader.go index 90fda35f..b9865f73 100644 --- a/pkg/model/loader.go +++ b/pkg/model/loader.go @@ -69,6 +69,8 @@ var knownModelsNameSuffixToSkip []string = []string{ ".tar.gz", } +const retryTimeout = time.Duration(2 * time.Minute) + func (ml *ModelLoader) ListFilesInModelPath() ([]string, error) { files, err := os.ReadDir(ml.ModelPath) if err != nil { @@ -146,15 +148,23 @@ func (ml *ModelLoader) ShutdownModel(modelName string) error { ml.mu.Lock() defer ml.mu.Unlock() - return ml.stopModel(modelName) -} - -func (ml *ModelLoader) stopModel(modelName string) error { - defer ml.deleteProcess(modelName) - if _, ok := ml.models[modelName]; !ok { + _, ok := ml.models[modelName] + if !ok { return fmt.Errorf("model %s not found", modelName) } - return nil + + retries := 1 + for ml.models[modelName].GRPC(false, ml.wd).IsBusy() { + log.Debug().Msgf("%s busy. Waiting.", modelName) + dur := time.Duration(retries*2) * time.Second + if dur > retryTimeout { + dur = retryTimeout + } + time.Sleep(dur) + retries++ + } + + return ml.deleteProcess(modelName) } func (ml *ModelLoader) CheckIsLoaded(s string) *Model { diff --git a/pkg/model/process.go b/pkg/model/process.go index 5b751de8..50afbb1c 100644 --- a/pkg/model/process.go +++ b/pkg/model/process.go @@ -18,15 +18,16 @@ import ( func (ml *ModelLoader) StopAllExcept(s string) error { return ml.StopGRPC(func(id string, p *process.Process) bool { - if id != s { - for ml.models[id].GRPC(false, ml.wd).IsBusy() { - log.Debug().Msgf("%s busy. Waiting.", id) - time.Sleep(2 * time.Second) - } - log.Debug().Msgf("[single-backend] Stopping %s", id) - return true + if id == s { + return false } - return false + + for ml.models[id].GRPC(false, ml.wd).IsBusy() { + log.Debug().Msgf("%s busy. Waiting.", id) + time.Sleep(2 * time.Second) + } + log.Debug().Msgf("[single-backend] Stopping %s", id) + return true }) } From 22247ad92c65818d6fb751a2f9998b565190db7f Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 05:50:31 +0000 Subject: [PATCH 022/122] chore(deps): Bump langchain from 0.2.16 to 0.3.0 in /examples/langchain-chroma (#3557) chore(deps): Bump langchain in /examples/langchain-chroma Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.2.16 to 0.3.0. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.2.16...langchain==0.3.0) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 3edb570c..4884d4aa 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ -langchain==0.2.16 +langchain==0.3.0 openai==1.45.1 chromadb==0.5.5 llama-index==0.11.7 \ No newline at end of file From 4a4e44bf5559f2eac49df2c1135f39ad6d70300f Mon Sep 17 00:00:00 2001 From: Alexander Izotov <93216976+Nyralei@users.noreply.github.com> Date: Tue, 17 Sep 2024 08:52:37 +0300 Subject: [PATCH 023/122] feat: allow setting trust_remote_code for sentencetransformers backend (#3552) Allow setting trust_remote_code for SentenceTransformers backend Signed-off-by: Nyralei <93216976+Nyralei@users.noreply.github.com> --- backend/python/sentencetransformers/backend.py | 2 +- backend/python/sentencetransformers/requirements.txt | 4 +++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/backend/python/sentencetransformers/backend.py b/backend/python/sentencetransformers/backend.py index 905015e1..2a20bf60 100755 --- a/backend/python/sentencetransformers/backend.py +++ b/backend/python/sentencetransformers/backend.py @@ -55,7 +55,7 @@ class BackendServicer(backend_pb2_grpc.BackendServicer): """ model_name = request.Model try: - self.model = SentenceTransformer(model_name) + self.model = SentenceTransformer(model_name, trust_remote_code=request.TrustRemoteCode) except Exception as err: return backend_pb2.Result(success=False, message=f"Unexpected {err=}, {type(err)=}") diff --git a/backend/python/sentencetransformers/requirements.txt b/backend/python/sentencetransformers/requirements.txt index 8e1b0195..b9cb6061 100644 --- a/backend/python/sentencetransformers/requirements.txt +++ b/backend/python/sentencetransformers/requirements.txt @@ -1,3 +1,5 @@ grpcio==1.66.1 protobuf -certifi \ No newline at end of file +certifi +datasets +einops \ No newline at end of file From 46fd4ff6db3aedaec3579872aa35d47973417b0a Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 06:19:52 +0000 Subject: [PATCH 024/122] chore(deps): Bump openai from 1.44.0 to 1.45.1 in /examples/functions (#3560) Bumps [openai](https://github.com/openai/openai-python) from 1.44.0 to 1.45.1. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.44.0...v1.45.1) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/functions/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/functions/requirements.txt b/examples/functions/requirements.txt index 9dd6818f..670090d3 100644 --- a/examples/functions/requirements.txt +++ b/examples/functions/requirements.txt @@ -1,2 +1,2 @@ langchain==0.3.0 -openai==1.44.0 +openai==1.45.1 From 075e5015c0ff0ca1010d5bba11a774c1564a8795 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 17 Sep 2024 09:06:07 +0200 Subject: [PATCH 025/122] Revert "chore(deps): Bump setuptools from 69.5.1 to 75.1.0 in /backend/python/transformers" (#3586) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Revert "chore(deps): Bump setuptools from 69.5.1 to 75.1.0 in /backend/python…" This reverts commit e95cb8eaacdac6426c085197ec5acf790206c042. --- backend/python/transformers/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/transformers/requirements.txt b/backend/python/transformers/requirements.txt index 1b7ebda5..b19c59c0 100644 --- a/backend/python/transformers/requirements.txt +++ b/backend/python/transformers/requirements.txt @@ -1,4 +1,4 @@ grpcio==1.66.1 protobuf certifi -setuptools==75.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From 92136a5d342993bdc8e0a26d5498b2e65ce9d26e Mon Sep 17 00:00:00 2001 From: Dave Date: Tue, 17 Sep 2024 03:23:58 -0400 Subject: [PATCH 026/122] fix: `gallery/index.yaml` comment spacing (#3585) extremely minor fix: add a space to index.yaml for the scanner Signed-off-by: Dave Lee --- gallery/index.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gallery/index.yaml b/gallery/index.yaml index 5e47d31c..229697bb 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1281,7 +1281,7 @@ - !!merge <<: *mistral03 name: "mn-12b-lyra-v4-iq-imatrix" icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/dVoru83WOpwVjMlgZ_xhA.png - #chatml + # chatml url: "github:mudler/LocalAI/gallery/chatml.yaml@master" urls: - https://huggingface.co/Lewdiculous/MN-12B-Lyra-v4-GGUF-IQ-Imatrix From 504962938127a04590e2e2383b2d5933ef3b48fd Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 10:24:01 +0200 Subject: [PATCH 027/122] chore(deps): Bump langchain from 0.2.16 to 0.3.0 in /examples/langchain/langchainpy-localai-example (#3577) chore(deps): Bump langchain Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.2.16 to 0.3.0. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.2.16...langchain==0.3.0) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 1bd6b841..213b4e2f 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -10,7 +10,7 @@ debugpy==1.8.2 frozenlist==1.4.1 greenlet==3.1.0 idna==3.8 -langchain==0.2.16 +langchain==0.3.0 langchain-community==0.2.16 marshmallow==3.22.0 marshmallow-enum==1.5.1 From 8826ca93b3b23d2d9333856b136bc606e92710ae Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 10:24:14 +0200 Subject: [PATCH 028/122] chore(deps): Bump openai from 1.44.0 to 1.45.1 in /examples/langchain/langchainpy-localai-example (#3573) chore(deps): Bump openai Bumps [openai](https://github.com/openai/openai-python) from 1.44.0 to 1.45.1. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.44.0...v1.45.1) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 213b4e2f..98325db3 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -18,7 +18,7 @@ multidict==6.0.5 mypy-extensions==1.0.0 numexpr==2.10.1 numpy==2.1.1 -openai==1.44.0 +openai==1.45.1 openapi-schema-pydantic==1.2.4 packaging>=23.2 pydantic==2.8.2 From eee1fb2c75171fc4a236bf224eda5c0df3d1fa3f Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 17 Sep 2024 10:24:34 +0200 Subject: [PATCH 029/122] chore(deps): Bump pypinyin from 0.50.0 to 0.53.0 in /backend/python/openvoice (#3562) chore(deps): Bump pypinyin in /backend/python/openvoice Bumps [pypinyin](https://github.com/mozillazg/python-pinyin) from 0.50.0 to 0.53.0. - [Release notes](https://github.com/mozillazg/python-pinyin/releases) - [Changelog](https://github.com/mozillazg/python-pinyin/blob/master/CHANGELOG.rst) - [Commits](https://github.com/mozillazg/python-pinyin/compare/v0.50.0...v0.53.0) --- updated-dependencies: - dependency-name: pypinyin dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/openvoice/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/openvoice/requirements-intel.txt b/backend/python/openvoice/requirements-intel.txt index a9a4cc20..cea7de0b 100644 --- a/backend/python/openvoice/requirements-intel.txt +++ b/backend/python/openvoice/requirements-intel.txt @@ -15,7 +15,7 @@ unidecode==1.3.7 whisper-timestamped==1.15.4 openai python-dotenv -pypinyin==0.50.0 +pypinyin==0.53.0 cn2an==0.5.22 jieba==0.42.1 gradio==4.38.1 From a53392f91953bf53c77041a8cd25282cd65eb71a Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 17 Sep 2024 16:51:40 +0200 Subject: [PATCH 030/122] chore(refactor): drop duplicated shutdown logics (#3589) * chore(refactor): drop duplicated shutdown logics - Handle locking in Shutdown and CheckModelIsLoaded in a more go-idiomatic way - Drop duplicated code and re-organize shutdown code Signed-off-by: Ettore Di Giacinto * fix: drop leftover Signed-off-by: Ettore Di Giacinto * chore: improve logging and add missing locks Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: Ettore Di Giacinto --- core/http/routes/localai.go | 2 +- pkg/model/filters.go | 17 +++++++++++++++++ pkg/model/initializers.go | 16 ++++++---------- pkg/model/loader.go | 7 ++++--- pkg/model/process.go | 28 ++++------------------------ 5 files changed, 32 insertions(+), 38 deletions(-) create mode 100644 pkg/model/filters.go diff --git a/core/http/routes/localai.go b/core/http/routes/localai.go index 29fef378..247596c0 100644 --- a/core/http/routes/localai.go +++ b/core/http/routes/localai.go @@ -69,6 +69,6 @@ func RegisterLocalAIRoutes(app *fiber.App, }{Version: internal.PrintableVersion()}) }) - app.Get("/system", auth, localai.SystemInformations(ml, appConfig)) + app.Get("/system", localai.SystemInformations(ml, appConfig)) } diff --git a/pkg/model/filters.go b/pkg/model/filters.go new file mode 100644 index 00000000..79b72d5b --- /dev/null +++ b/pkg/model/filters.go @@ -0,0 +1,17 @@ +package model + +import ( + process "github.com/mudler/go-processmanager" +) + +type GRPCProcessFilter = func(id string, p *process.Process) bool + +func all(_ string, _ *process.Process) bool { + return true +} + +func allExcept(s string) GRPCProcessFilter { + return func(id string, p *process.Process) bool { + return id != s + } +} diff --git a/pkg/model/initializers.go b/pkg/model/initializers.go index 3d2255cc..7099bf33 100644 --- a/pkg/model/initializers.go +++ b/pkg/model/initializers.go @@ -320,7 +320,7 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string } else { grpcProcess := backendPath(o.assetDir, backend) if err := utils.VerifyPath(grpcProcess, o.assetDir); err != nil { - return nil, fmt.Errorf("grpc process not found in assetdir: %s", err.Error()) + return nil, fmt.Errorf("refering to a backend not in asset dir: %s", err.Error()) } if autoDetect { @@ -332,7 +332,7 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string // Check if the file exists if _, err := os.Stat(grpcProcess); os.IsNotExist(err) { - return nil, fmt.Errorf("grpc process not found: %s. 
some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS", grpcProcess) + return nil, fmt.Errorf("backend not found: %s", grpcProcess) } serverAddress, err := getFreeAddress() @@ -355,6 +355,8 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string client = NewModel(serverAddress) } + log.Debug().Msgf("Wait for the service to start up") + // Wait for the service to start up ready := false for i := 0; i < o.grpcAttempts; i++ { @@ -413,10 +415,8 @@ func (ml *ModelLoader) BackendLoader(opts ...Option) (client grpc.Backend, err e } if o.singleActiveBackend { - ml.mu.Lock() log.Debug().Msgf("Stopping all backends except '%s'", o.model) - err := ml.StopAllExcept(o.model) - ml.mu.Unlock() + err := ml.StopGRPC(allExcept(o.model)) if err != nil { log.Error().Err(err).Str("keptModel", o.model).Msg("error while shutting down all backends except for the keptModel") return nil, err @@ -444,13 +444,10 @@ func (ml *ModelLoader) BackendLoader(opts ...Option) (client grpc.Backend, err e func (ml *ModelLoader) GreedyLoader(opts ...Option) (grpc.Backend, error) { o := NewOptions(opts...) - ml.mu.Lock() - // Return earlier if we have a model already loaded // (avoid looping through all the backends) if m := ml.CheckIsLoaded(o.model); m != nil { log.Debug().Msgf("Model '%s' already loaded", o.model) - ml.mu.Unlock() return m.GRPC(o.parallelRequests, ml.wd), nil } @@ -458,12 +455,11 @@ func (ml *ModelLoader) GreedyLoader(opts ...Option) (grpc.Backend, error) { // If we can have only one backend active, kill all the others (except external backends) if o.singleActiveBackend { log.Debug().Msgf("Stopping all backends except '%s'", o.model) - err := ml.StopAllExcept(o.model) + err := ml.StopGRPC(allExcept(o.model)) if err != nil { log.Error().Err(err).Str("keptModel", o.model).Msg("error while shutting down all backends except for the keptModel - greedyloader continuing") } } - ml.mu.Unlock() var err error diff --git a/pkg/model/loader.go b/pkg/model/loader.go index b9865f73..f70d2cea 100644 --- a/pkg/model/loader.go +++ b/pkg/model/loader.go @@ -118,9 +118,6 @@ func (ml *ModelLoader) ListModels() []*Model { } func (ml *ModelLoader) LoadModel(modelName string, loader func(string, string) (*Model, error)) (*Model, error) { - ml.mu.Lock() - defer ml.mu.Unlock() - // Check if we already have a loaded model if model := ml.CheckIsLoaded(modelName); model != nil { return model, nil @@ -139,6 +136,8 @@ func (ml *ModelLoader) LoadModel(modelName string, loader func(string, string) ( return nil, fmt.Errorf("loader didn't return a model") } + ml.mu.Lock() + defer ml.mu.Unlock() ml.models[modelName] = model return model, nil @@ -168,6 +167,8 @@ func (ml *ModelLoader) ShutdownModel(modelName string) error { } func (ml *ModelLoader) CheckIsLoaded(s string) *Model { + ml.mu.Lock() + defer ml.mu.Unlock() m, ok := ml.models[s] if !ok { return nil diff --git a/pkg/model/process.go b/pkg/model/process.go index 50afbb1c..bcd1fccb 100644 --- a/pkg/model/process.go +++ b/pkg/model/process.go @@ -9,28 +9,12 @@ import ( "strconv" "strings" "syscall" - "time" "github.com/hpcloud/tail" process "github.com/mudler/go-processmanager" "github.com/rs/zerolog/log" ) -func (ml *ModelLoader) StopAllExcept(s string) error { - return ml.StopGRPC(func(id string, p *process.Process) bool { - if id == s { - return false - } - - for ml.models[id].GRPC(false, ml.wd).IsBusy() { - log.Debug().Msgf("%s busy. 
Waiting.", id) - time.Sleep(2 * time.Second) - } - log.Debug().Msgf("[single-backend] Stopping %s", id) - return true - }) -} - func (ml *ModelLoader) deleteProcess(s string) error { if _, exists := ml.grpcProcesses[s]; exists { if err := ml.grpcProcesses[s].Stop(); err != nil { @@ -42,17 +26,11 @@ func (ml *ModelLoader) deleteProcess(s string) error { return nil } -type GRPCProcessFilter = func(id string, p *process.Process) bool - -func includeAllProcesses(_ string, _ *process.Process) bool { - return true -} - func (ml *ModelLoader) StopGRPC(filter GRPCProcessFilter) error { var err error = nil for k, p := range ml.grpcProcesses { if filter(k, p) { - e := ml.deleteProcess(k) + e := ml.ShutdownModel(k) err = errors.Join(err, e) } } @@ -60,10 +38,12 @@ func (ml *ModelLoader) StopGRPC(filter GRPCProcessFilter) error { } func (ml *ModelLoader) StopAllGRPC() error { - return ml.StopGRPC(includeAllProcesses) + return ml.StopGRPC(all) } func (ml *ModelLoader) GetGRPCPID(id string) (int, error) { + ml.mu.Lock() + defer ml.mu.Unlock() p, exists := ml.grpcProcesses[id] if !exists { return -1, fmt.Errorf("no grpc backend found for %s", id) From acf119828f940083451f8faa3095a5d3804ebd78 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 17 Sep 2024 17:22:56 +0200 Subject: [PATCH 031/122] Revert "chore(deps): Bump securego/gosec from 2.21.0 to 2.21.2" (#3590) Revert "chore(deps): Bump securego/gosec from 2.21.0 to 2.21.2 (#3561)" This reverts commit 12a8d0e46fbd03f8d550dc41ea6325d07d66cd00. --- .github/workflows/secscan.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/secscan.yaml b/.github/workflows/secscan.yaml index 08d7dfc6..db9db586 100644 --- a/.github/workflows/secscan.yaml +++ b/.github/workflows/secscan.yaml @@ -18,7 +18,7 @@ jobs: if: ${{ github.actor != 'dependabot[bot]' }} - name: Run Gosec Security Scanner if: ${{ github.actor != 'dependabot[bot]' }} - uses: securego/gosec@v2.21.2 + uses: securego/gosec@v2.21.0 with: # we let the report trigger content trigger a failure using the GitHub Security features. args: '-no-fail -fmt sarif -out results.sarif ./...' 
From dc98b2ea4474c62fbf834b421663239d6b93f534 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Tue, 17 Sep 2024 23:51:41 +0200 Subject: [PATCH 032/122] chore: :arrow_up: Update ggerganov/llama.cpp to `8b836ae731bbb2c5640bc47df5b0a78ffcb129cb` (#3591) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index f9fa5476..4493404e 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=23e0d70bacaaca1429d365a44aa9e7434f17823b +CPPLLAMA_VERSION?=8b836ae731bbb2c5640bc47df5b0a78ffcb129cb # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From e5bd74878e79b2dd819c58d9811f9573bb3c9594 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Wed, 18 Sep 2024 00:02:02 +0200 Subject: [PATCH 033/122] chore: :arrow_up: Update ggerganov/whisper.cpp to `5b1ce40fa882e9cb8630b48032067a1ed2f1534f` (#3592) :arrow_up: Update ggerganov/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 4493404e..54ae7b73 100644 --- a/Makefile +++ b/Makefile @@ -16,7 +16,7 @@ RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 # whisper.cpp version WHISPER_REPO?=https://github.com/ggerganov/whisper.cpp -WHISPER_CPP_VERSION?=049b3a0e53c8a8e4c4576c06a1a4fccf0063a73f +WHISPER_CPP_VERSION?=5b1ce40fa882e9cb8630b48032067a1ed2f1534f # bert.cpp version BERT_REPO?=https://github.com/go-skynet/go-bert.cpp From a50cde69a258405ad765d3f6adf6a03aaaa6776a Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 18 Sep 2024 15:55:46 +0200 Subject: [PATCH 034/122] chore(aio): rename gpt-4-vision-preview to gpt-4o (#3597) Fixes: 3596 Signed-off-by: Ettore Di Giacinto --- aio/cpu/vision.yaml | 2 +- aio/gpu-8g/vision.yaml | 2 +- aio/intel/vision.yaml | 2 +- tests/e2e-aio/e2e_test.go | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/aio/cpu/vision.yaml b/aio/cpu/vision.yaml index 3b466d37..4052fa39 100644 --- a/aio/cpu/vision.yaml +++ b/aio/cpu/vision.yaml @@ -2,7 +2,7 @@ backend: llama-cpp context_size: 4096 f16: true mmap: true -name: gpt-4-vision-preview +name: gpt-4o roles: user: "USER:" diff --git a/aio/gpu-8g/vision.yaml b/aio/gpu-8g/vision.yaml index db039279..4f5e10b3 100644 --- a/aio/gpu-8g/vision.yaml +++ b/aio/gpu-8g/vision.yaml @@ -2,7 +2,7 @@ backend: llama-cpp context_size: 4096 f16: true mmap: true -name: gpt-4-vision-preview +name: gpt-4o roles: user: "USER:" diff --git a/aio/intel/vision.yaml b/aio/intel/vision.yaml index 52843162..37067362 100644 --- a/aio/intel/vision.yaml +++ b/aio/intel/vision.yaml @@ -2,7 +2,7 @@ backend: llama-cpp context_size: 4096 mmap: false f16: false -name: gpt-4-vision-preview +name: gpt-4o roles: user: "USER:" diff --git a/tests/e2e-aio/e2e_test.go b/tests/e2e-aio/e2e_test.go index f3f7b106..36d127d2 100644 --- a/tests/e2e-aio/e2e_test.go +++ b/tests/e2e-aio/e2e_test.go @@ -171,7 +171,7 @@ var _ = Describe("E2E test", func() { }) 
Context("vision", func() { It("correctly", func() { - model := "gpt-4-vision-preview" + model := "gpt-4o" resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{ Model: model, Messages: []openai.ChatCompletionMessage{ From c6a819e92fc7e687f6fe9c8a29f5b56b62820163 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Wed, 18 Sep 2024 23:41:59 +0200 Subject: [PATCH 035/122] chore: :arrow_up: Update ggerganov/llama.cpp to `64c6af3195c3cd4aa3328a1282d29cd2635c34c9` (#3598) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 54ae7b73..286f4b5a 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=8b836ae731bbb2c5640bc47df5b0a78ffcb129cb +CPPLLAMA_VERSION?=64c6af3195c3cd4aa3328a1282d29cd2635c34c9 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From fbb9facda40eb9442ef0819b5a2de13500019229 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 19 Sep 2024 11:21:59 +0200 Subject: [PATCH 036/122] feat(api): allow to pass videos to backends (#3601) This prepares the API to receive videos as well for video understanding. It works similarly to images, where the request should be in the form: { "type": "video_url", "video_url": { "url": "url or base64 data" } } Signed-off-by: Ettore Di Giacinto --- backend/backend.proto | 1 + core/backend/llm.go | 3 +- core/http/endpoints/openai/chat.go | 6 +++- core/http/endpoints/openai/inference.go | 6 +++- core/http/endpoints/openai/request.go | 38 +++++++++++++++++-------- core/schema/openai.go | 2 ++ pkg/utils/base64.go | 10 ++----- pkg/utils/base64_test.go | 8 +++--- 8 files changed, 47 insertions(+), 27 deletions(-) diff --git a/backend/backend.proto b/backend/backend.proto index 4a8f31a9..6ef83567 100644 --- a/backend/backend.proto +++ b/backend/backend.proto @@ -134,6 +134,7 @@ message PredictOptions { repeated string Images = 42; bool UseTokenizerTemplate = 43; repeated Message Messages = 44; + repeated string Videos = 45; } // The response message containing the result diff --git a/core/backend/llm.go b/core/backend/llm.go index 2b4564a8..fa4c0709 100644 --- a/core/backend/llm.go +++ b/core/backend/llm.go @@ -31,7 +31,7 @@ type TokenUsage struct { Completion int } -func ModelInference(ctx context.Context, s string, messages []schema.Message, images []string, loader *model.ModelLoader, c config.BackendConfig, o *config.ApplicationConfig, tokenCallback func(string, TokenUsage) bool) (func() (LLMResponse, error), error) { +func ModelInference(ctx context.Context, s string, messages []schema.Message, images, videos []string, loader *model.ModelLoader, c config.BackendConfig, o *config.ApplicationConfig, tokenCallback func(string, TokenUsage) bool) (func() (LLMResponse, error), error) { modelFile := c.Model threads := c.Threads if *threads == 0 && o.Threads != 0 { @@ -101,6 +101,7 @@ func ModelInference(ctx context.Context, s string, messages []schema.Message, im opts.Messages = protoMessages opts.UseTokenizerTemplate = c.TemplateConfig.UseTokenizerTemplate opts.Images = images + opts.Videos = videos tokenUsage := TokenUsage{} diff --git 
a/core/http/endpoints/openai/chat.go b/core/http/endpoints/openai/chat.go index 8144bdcd..742a4add 100644 --- a/core/http/endpoints/openai/chat.go +++ b/core/http/endpoints/openai/chat.go @@ -640,8 +640,12 @@ func handleQuestion(config *config.BackendConfig, input *schema.OpenAIRequest, m for _, m := range input.Messages { images = append(images, m.StringImages...) } + videos := []string{} + for _, m := range input.Messages { + videos = append(videos, m.StringVideos...) + } - predFunc, err := backend.ModelInference(input.Context, prompt, input.Messages, images, ml, *config, o, nil) + predFunc, err := backend.ModelInference(input.Context, prompt, input.Messages, images, videos, ml, *config, o, nil) if err != nil { log.Error().Err(err).Msg("model inference failed") return "", err diff --git a/core/http/endpoints/openai/inference.go b/core/http/endpoints/openai/inference.go index 4950ce20..4008ba3d 100644 --- a/core/http/endpoints/openai/inference.go +++ b/core/http/endpoints/openai/inference.go @@ -27,9 +27,13 @@ func ComputeChoices( for _, m := range req.Messages { images = append(images, m.StringImages...) } + videos := []string{} + for _, m := range req.Messages { + videos = append(videos, m.StringVideos...) + } // get the model function to call for the result - predFunc, err := backend.ModelInference(req.Context, predInput, req.Messages, images, loader, *config, o, tokenCallback) + predFunc, err := backend.ModelInference(req.Context, predInput, req.Messages, images, videos, loader, *config, o, tokenCallback) if err != nil { return result, backend.TokenUsage{}, err } diff --git a/core/http/endpoints/openai/request.go b/core/http/endpoints/openai/request.go index a99ebea2..456a1e0c 100644 --- a/core/http/endpoints/openai/request.go +++ b/core/http/endpoints/openai/request.go @@ -135,7 +135,7 @@ func updateRequestConfig(config *config.BackendConfig, input *schema.OpenAIReque } // Decode each request's message content - index := 0 + imgIndex, vidIndex := 0, 0 for i, m := range input.Messages { switch content := m.Content.(type) { case string: @@ -144,20 +144,34 @@ func updateRequestConfig(config *config.BackendConfig, input *schema.OpenAIReque dat, _ := json.Marshal(content) c := []schema.Content{} json.Unmarshal(dat, &c) + CONTENT: for _, pp := range c { - if pp.Type == "text" { + switch pp.Type { + case "text": input.Messages[i].StringContent = pp.Text - } else if pp.Type == "image_url" { - // Detect if pp.ImageURL is an URL, if it is download the image and encode it in base64: - base64, err := utils.GetImageURLAsBase64(pp.ImageURL.URL) - if err == nil { - input.Messages[i].StringImages = append(input.Messages[i].StringImages, base64) // TODO: make sure that we only return base64 stuff - // set a placeholder for each image - input.Messages[i].StringContent = fmt.Sprintf("[img-%d]", index) + input.Messages[i].StringContent - index++ - } else { - log.Error().Msgf("Failed encoding image: %s", err) + case "video", "video_url": + // Decode content as base64 either if it's an URL or base64 text + base64, err := utils.GetContentURIAsBase64(pp.VideoURL.URL) + if err != nil { + log.Error().Msgf("Failed encoding video: %s", err) + continue CONTENT } + input.Messages[i].StringVideos = append(input.Messages[i].StringVideos, base64) // TODO: make sure that we only return base64 stuff + // set a placeholder for each image + input.Messages[i].StringContent = fmt.Sprintf("[vid-%d]", vidIndex) + input.Messages[i].StringContent + vidIndex++ + case "image_url", "image": + // Decode content as base64 either 
if it's an URL or base64 text + + base64, err := utils.GetContentURIAsBase64(pp.ImageURL.URL) + if err != nil { + log.Error().Msgf("Failed encoding image: %s", err) + continue CONTENT + } + input.Messages[i].StringImages = append(input.Messages[i].StringImages, base64) // TODO: make sure that we only return base64 stuff + // set a placeholder for each image + input.Messages[i].StringContent = fmt.Sprintf("[img-%d]", imgIndex) + input.Messages[i].StringContent + imgIndex++ } } } diff --git a/core/schema/openai.go b/core/schema/openai.go index fe4745bf..32ed716b 100644 --- a/core/schema/openai.go +++ b/core/schema/openai.go @@ -58,6 +58,7 @@ type Content struct { Type string `json:"type" yaml:"type"` Text string `json:"text" yaml:"text"` ImageURL ContentURL `json:"image_url" yaml:"image_url"` + VideoURL ContentURL `json:"video_url" yaml:"video_url"` } type ContentURL struct { @@ -76,6 +77,7 @@ type Message struct { StringContent string `json:"string_content,omitempty" yaml:"string_content,omitempty"` StringImages []string `json:"string_images,omitempty" yaml:"string_images,omitempty"` + StringVideos []string `json:"string_videos,omitempty" yaml:"string_videos,omitempty"` // A result of a function call FunctionCall interface{} `json:"function_call,omitempty" yaml:"function_call,omitempty"` diff --git a/pkg/utils/base64.go b/pkg/utils/base64.go index 3fbb405b..50109eaa 100644 --- a/pkg/utils/base64.go +++ b/pkg/utils/base64.go @@ -13,14 +13,8 @@ var base64DownloadClient http.Client = http.Client{ Timeout: 30 * time.Second, } -// this function check if the string is an URL, if it's an URL downloads the image in memory -// encodes it in base64 and returns the base64 string - -// This may look weird down in pkg/utils while it is currently only used in core/config -// -// but I believe it may be useful for MQTT as well in the near future, so I'm -// extracting it while I'm thinking of it. -func GetImageURLAsBase64(s string) (string, error) { +// GetContentURIAsBase64 checks if the string is an URL, if it's an URL downloads the content in memory encodes it in base64 and returns the base64 string, otherwise returns the string by stripping base64 data headers +func GetContentURIAsBase64(s string) (string, error) { if strings.HasPrefix(s, "http") { // download the image resp, err := base64DownloadClient.Get(s) diff --git a/pkg/utils/base64_test.go b/pkg/utils/base64_test.go index 3b3dc9fb..1f0d1352 100644 --- a/pkg/utils/base64_test.go +++ b/pkg/utils/base64_test.go @@ -10,20 +10,20 @@ var _ = Describe("utils/base64 tests", func() { It("GetImageURLAsBase64 can strip jpeg data url prefixes", func() { // This one doesn't actually _care_ that it's base64, so feed "bad" data in this test in order to catch a change in that behavior for informational purposes. input := "data:image/jpeg;base64,FOO" - b64, err := GetImageURLAsBase64(input) + b64, err := GetContentURIAsBase64(input) Expect(err).To(BeNil()) Expect(b64).To(Equal("FOO")) }) It("GetImageURLAsBase64 can strip png data url prefixes", func() { // This one doesn't actually _care_ that it's base64, so feed "bad" data in this test in order to catch a change in that behavior for informational purposes. 
input := "data:image/png;base64,BAR" - b64, err := GetImageURLAsBase64(input) + b64, err := GetContentURIAsBase64(input) Expect(err).To(BeNil()) Expect(b64).To(Equal("BAR")) }) It("GetImageURLAsBase64 returns an error for bogus data", func() { input := "FOO" - b64, err := GetImageURLAsBase64(input) + b64, err := GetContentURIAsBase64(input) Expect(b64).To(Equal("")) Expect(err).ToNot(BeNil()) Expect(err).To(MatchError("not valid string")) @@ -31,7 +31,7 @@ var _ = Describe("utils/base64 tests", func() { It("GetImageURLAsBase64 can actually download images and calculates something", func() { // This test doesn't actually _check_ the results at this time, which is bad, but there wasn't a test at all before... input := "https://upload.wikimedia.org/wikipedia/en/2/29/Wargames.jpg" - b64, err := GetImageURLAsBase64(input) + b64, err := GetContentURIAsBase64(input) Expect(err).To(BeNil()) Expect(b64).ToNot(BeNil()) }) From 191bc2e50a721bd3164ad4700bcbb5d723ed7d03 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 19 Sep 2024 12:26:53 +0200 Subject: [PATCH 037/122] feat(api): allow to pass audios to backends (#3603) Signed-off-by: Ettore Di Giacinto --- backend/backend.proto | 1 + core/backend/llm.go | 3 ++- core/http/endpoints/openai/chat.go | 6 +++++- core/http/endpoints/openai/inference.go | 6 +++++- core/http/endpoints/openai/request.go | 14 ++++++++++++-- core/schema/openai.go | 2 ++ 6 files changed, 27 insertions(+), 5 deletions(-) diff --git a/backend/backend.proto b/backend/backend.proto index 6ef83567..31bd63e5 100644 --- a/backend/backend.proto +++ b/backend/backend.proto @@ -135,6 +135,7 @@ message PredictOptions { bool UseTokenizerTemplate = 43; repeated Message Messages = 44; repeated string Videos = 45; + repeated string Audios = 46; } // The response message containing the result diff --git a/core/backend/llm.go b/core/backend/llm.go index fa4c0709..f74071ba 100644 --- a/core/backend/llm.go +++ b/core/backend/llm.go @@ -31,7 +31,7 @@ type TokenUsage struct { Completion int } -func ModelInference(ctx context.Context, s string, messages []schema.Message, images, videos []string, loader *model.ModelLoader, c config.BackendConfig, o *config.ApplicationConfig, tokenCallback func(string, TokenUsage) bool) (func() (LLMResponse, error), error) { +func ModelInference(ctx context.Context, s string, messages []schema.Message, images, videos, audios []string, loader *model.ModelLoader, c config.BackendConfig, o *config.ApplicationConfig, tokenCallback func(string, TokenUsage) bool) (func() (LLMResponse, error), error) { modelFile := c.Model threads := c.Threads if *threads == 0 && o.Threads != 0 { @@ -102,6 +102,7 @@ func ModelInference(ctx context.Context, s string, messages []schema.Message, im opts.UseTokenizerTemplate = c.TemplateConfig.UseTokenizerTemplate opts.Images = images opts.Videos = videos + opts.Audios = audios tokenUsage := TokenUsage{} diff --git a/core/http/endpoints/openai/chat.go b/core/http/endpoints/openai/chat.go index 742a4add..b937120a 100644 --- a/core/http/endpoints/openai/chat.go +++ b/core/http/endpoints/openai/chat.go @@ -644,8 +644,12 @@ func handleQuestion(config *config.BackendConfig, input *schema.OpenAIRequest, m for _, m := range input.Messages { videos = append(videos, m.StringVideos...) } + audios := []string{} + for _, m := range input.Messages { + audios = append(audios, m.StringAudios...) 
+ } - predFunc, err := backend.ModelInference(input.Context, prompt, input.Messages, images, videos, ml, *config, o, nil) + predFunc, err := backend.ModelInference(input.Context, prompt, input.Messages, images, videos, audios, ml, *config, o, nil) if err != nil { log.Error().Err(err).Msg("model inference failed") return "", err diff --git a/core/http/endpoints/openai/inference.go b/core/http/endpoints/openai/inference.go index 4008ba3d..da75d3a1 100644 --- a/core/http/endpoints/openai/inference.go +++ b/core/http/endpoints/openai/inference.go @@ -31,9 +31,13 @@ func ComputeChoices( for _, m := range req.Messages { videos = append(videos, m.StringVideos...) } + audios := []string{} + for _, m := range req.Messages { + audios = append(audios, m.StringAudios...) + } // get the model function to call for the result - predFunc, err := backend.ModelInference(req.Context, predInput, req.Messages, images, videos, loader, *config, o, tokenCallback) + predFunc, err := backend.ModelInference(req.Context, predInput, req.Messages, images, videos, audios, loader, *config, o, tokenCallback) if err != nil { return result, backend.TokenUsage{}, err } diff --git a/core/http/endpoints/openai/request.go b/core/http/endpoints/openai/request.go index 456a1e0c..e24dd28f 100644 --- a/core/http/endpoints/openai/request.go +++ b/core/http/endpoints/openai/request.go @@ -135,7 +135,7 @@ func updateRequestConfig(config *config.BackendConfig, input *schema.OpenAIReque } // Decode each request's message content - imgIndex, vidIndex := 0, 0 + imgIndex, vidIndex, audioIndex := 0, 0, 0 for i, m := range input.Messages { switch content := m.Content.(type) { case string: @@ -160,9 +160,19 @@ func updateRequestConfig(config *config.BackendConfig, input *schema.OpenAIReque // set a placeholder for each image input.Messages[i].StringContent = fmt.Sprintf("[vid-%d]", vidIndex) + input.Messages[i].StringContent vidIndex++ + case "audio_url", "audio": + // Decode content as base64 either if it's an URL or base64 text + base64, err := utils.GetContentURIAsBase64(pp.AudioURL.URL) + if err != nil { + log.Error().Msgf("Failed encoding image: %s", err) + continue CONTENT + } + input.Messages[i].StringAudios = append(input.Messages[i].StringAudios, base64) // TODO: make sure that we only return base64 stuff + // set a placeholder for each image + input.Messages[i].StringContent = fmt.Sprintf("[audio-%d]", audioIndex) + input.Messages[i].StringContent + audioIndex++ case "image_url", "image": // Decode content as base64 either if it's an URL or base64 text - base64, err := utils.GetContentURIAsBase64(pp.ImageURL.URL) if err != nil { log.Error().Msgf("Failed encoding image: %s", err) diff --git a/core/schema/openai.go b/core/schema/openai.go index 32ed716b..15bcd13d 100644 --- a/core/schema/openai.go +++ b/core/schema/openai.go @@ -58,6 +58,7 @@ type Content struct { Type string `json:"type" yaml:"type"` Text string `json:"text" yaml:"text"` ImageURL ContentURL `json:"image_url" yaml:"image_url"` + AudioURL ContentURL `json:"audio_url" yaml:"audio_url"` VideoURL ContentURL `json:"video_url" yaml:"video_url"` } @@ -78,6 +79,7 @@ type Message struct { StringContent string `json:"string_content,omitempty" yaml:"string_content,omitempty"` StringImages []string `json:"string_images,omitempty" yaml:"string_images,omitempty"` StringVideos []string `json:"string_videos,omitempty" yaml:"string_videos,omitempty"` + StringAudios []string `json:"string_audios,omitempty" yaml:"string_audios,omitempty"` // A result of a function call FunctionCall 
interface{} `json:"function_call,omitempty" yaml:"function_call,omitempty"` From 5c9d26e39bdff8c3e836c686a83d1aba3c239893 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Fri, 20 Sep 2024 10:49:32 +0200 Subject: [PATCH 038/122] feat(swagger): update swagger (#3604) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- swagger/docs.go | 12 ++++++++++++ swagger/swagger.json | 12 ++++++++++++ swagger/swagger.yaml | 8 ++++++++ 3 files changed, 32 insertions(+) diff --git a/swagger/docs.go b/swagger/docs.go index 44da7cf2..ffb2ba03 100644 --- a/swagger/docs.go +++ b/swagger/docs.go @@ -1394,6 +1394,12 @@ const docTemplate = `{ "description": "The message role", "type": "string" }, + "string_audios": { + "type": "array", + "items": { + "type": "string" + } + }, "string_content": { "type": "string" }, @@ -1403,6 +1409,12 @@ const docTemplate = `{ "type": "string" } }, + "string_videos": { + "type": "array", + "items": { + "type": "string" + } + }, "tool_calls": { "type": "array", "items": { diff --git a/swagger/swagger.json b/swagger/swagger.json index eaddf451..e3aebe43 100644 --- a/swagger/swagger.json +++ b/swagger/swagger.json @@ -1387,6 +1387,12 @@ "description": "The message role", "type": "string" }, + "string_audios": { + "type": "array", + "items": { + "type": "string" + } + }, "string_content": { "type": "string" }, @@ -1396,6 +1402,12 @@ "type": "string" } }, + "string_videos": { + "type": "array", + "items": { + "type": "string" + } + }, "tool_calls": { "type": "array", "items": { diff --git a/swagger/swagger.yaml b/swagger/swagger.yaml index c98e0ef4..649b86e4 100644 --- a/swagger/swagger.yaml +++ b/swagger/swagger.yaml @@ -453,12 +453,20 @@ definitions: role: description: The message role type: string + string_audios: + items: + type: string + type: array string_content: type: string string_images: items: type: string type: array + string_videos: + items: + type: string + type: array tool_calls: items: $ref: '#/definitions/schema.ToolCall' From 2fcea486eb72d0a0bd77513244d66c74a3ec8a47 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Fri, 20 Sep 2024 10:50:14 +0200 Subject: [PATCH 039/122] chore: :arrow_up: Update ggerganov/llama.cpp to `6026da52d6942b253df835070619775d849d0258` (#3605) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 286f4b5a..53def128 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=64c6af3195c3cd4aa3328a1282d29cd2635c34c9 +CPPLLAMA_VERSION?=6026da52d6942b253df835070619775d849d0258 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From a2a63460e92b042f274d0a4e126ef927ef78e25a Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 10:59:29 +0200 Subject: [PATCH 040/122] models(gallery): add qwen2.5-14b-instruct (#3607) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 
229697bb..4fe495fc 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1,4 +1,27 @@ --- +## Qwen2.5 +- &qwen25 + name: "qwen2.5-14b-instruct" + url: "github:mudler/LocalAI/gallery/chatml.yaml@master" + license: apache-2.0 + description: | + Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. + tags: + - llm + - gguf + - gpu + - qwen + - cpu + urls: + - https://huggingface.co/bartowski/Qwen2.5-14B-Instruct-GGUF + - https://huggingface.co/Qwen/Qwen2.5-7B-Instruct + overrides: + parameters: + model: Qwen2.5-14B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-14B-Instruct-Q4_K_M.gguf + sha256: e47ad95dad6ff848b431053b375adb5d39321290ea2c638682577dafca87c008 + uri: huggingface://bartowski/Qwen2.5-14B-Instruct-GGUF/Qwen2.5-14B-Instruct-Q4_K_M.gguf ## SmolLM - &smollm url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From c15f506fd511dc3208846753e1fded4d0a4191f0 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 11:18:49 +0200 Subject: [PATCH 041/122] models(gallery): add qwen2.5-math-7b-instruct (#3609) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 4fe495fc..8dc742ca 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -22,6 +22,24 @@ - filename: Qwen2.5-14B-Instruct-Q4_K_M.gguf sha256: e47ad95dad6ff848b431053b375adb5d39321290ea2c638682577dafca87c008 uri: huggingface://bartowski/Qwen2.5-14B-Instruct-GGUF/Qwen2.5-14B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-math-7b-instruct" + urls: + - https://huggingface.co/bartowski/Qwen2.5-Math-7B-Instruct-GGUF + - https://huggingface.co/Qwen/Qwen2.5-Math-7B-Instruct + description: | + In August 2024, we released the first series of mathematical LLMs - Qwen2-Math - of our Qwen family. A month later, we have upgraded it and open-sourced Qwen2.5-Math series, including base models Qwen2.5-Math-1.5B/7B/72B, instruction-tuned models Qwen2.5-Math-1.5B/7B/72B-Instruct, and mathematical reward model Qwen2.5-Math-RM-72B. + + Unlike Qwen2-Math series which only supports using Chain-of-Thught (CoT) to solve English math problems, Qwen2.5-Math series is expanded to support using both CoT and Tool-integrated Reasoning (TIR) to solve math problems in both Chinese and English. The Qwen2.5-Math series models have achieved significant performance improvements compared to the Qwen2-Math series models on the Chinese and English mathematics benchmarks with CoT. + + The base models of Qwen2-Math are initialized with Qwen2-1.5B/7B/72B, and then pretrained on a meticulously designed Mathematics-specific Corpus. This corpus contains large-scale high-quality mathematical web texts, books, codes, exam questions, and mathematical pre-training data synthesized by Qwen2. 
+ overrides: + parameters: + model: Qwen2.5-Math-7B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-Math-7B-Instruct-Q4_K_M.gguf + sha256: 7e03cee8c65b9ebf9ca14ddb010aca27b6b18e6c70f2779e94e7451d9529c091 + uri: huggingface://bartowski/Qwen2.5-Math-7B-Instruct-GGUF/Qwen2.5-Math-7B-Instruct-Q4_K_M.gguf ## SmolLM - &smollm url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From a5b08f43ff5a3f485264dd0b8bd6335b0bf4ce24 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 11:22:53 +0200 Subject: [PATCH 042/122] models(gallery): add qwen2.5-14b_uncencored (#3610) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 8dc742ca..77c5c107 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -11,6 +11,7 @@ - gguf - gpu - qwen + - qwen2.5 - cpu urls: - https://huggingface.co/bartowski/Qwen2.5-14B-Instruct-GGUF @@ -40,6 +41,31 @@ - filename: Qwen2.5-Math-7B-Instruct-Q4_K_M.gguf sha256: 7e03cee8c65b9ebf9ca14ddb010aca27b6b18e6c70f2779e94e7451d9529c091 uri: huggingface://bartowski/Qwen2.5-Math-7B-Instruct-GGUF/Qwen2.5-Math-7B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-14b_uncencored" + icon: https://huggingface.co/SicariusSicariiStuff/Phi-3.5-mini-instruct_Uncensored/resolve/main/Misc/Uncensored.png + urls: + - https://huggingface.co/SicariusSicariiStuff/Qwen2.5-14B_Uncencored + - https://huggingface.co/bartowski/Qwen2.5-14B_Uncencored-GGUF + description: | + Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. + + Uncensored qwen2.5 + tags: + - llm + - gguf + - gpu + - qwen + - qwen2.5 + - cpu + - uncensored + overrides: + parameters: + model: Qwen2.5-14B_Uncencored-Q4_K_M.gguf + files: + - filename: Qwen2.5-14B_Uncencored-Q4_K_M.gguf + sha256: 066b9341b67e0fd0956de3576a3b7988574a5b9a0028aef2b9c8edeadd6dbbd1 + uri: huggingface://bartowski/Qwen2.5-14B_Uncencored-GGUF/Qwen2.5-14B_Uncencored-Q4_K_M.gguf ## SmolLM - &smollm url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From b6af4f4467724bd9d59e6f7f573f513f927fc8e2 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 15:08:57 +0200 Subject: [PATCH 043/122] models(gallery): add qwen2.5-coder-7b-instruct (#3611) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 77c5c107..1f52fec8 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -66,6 +66,24 @@ - filename: Qwen2.5-14B_Uncencored-Q4_K_M.gguf sha256: 066b9341b67e0fd0956de3576a3b7988574a5b9a0028aef2b9c8edeadd6dbbd1 uri: huggingface://bartowski/Qwen2.5-14B_Uncencored-GGUF/Qwen2.5-14B_Uncencored-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-coder-7b-instruct" + urls: + - https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct + - https://huggingface.co/bartowski/Qwen2.5-Coder-7B-Instruct-GGUF + description: | + Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). For Qwen2.5-Coder, we release three base language models and instruction-tuned language models, 1.5, 7 and 32 (coming soon) billion parameters. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5: + + Significantly improvements in code generation, code reasoning and code fixing. 
Base on the strong Qwen2.5, we scale up the training tokens into 5.5 trillion including source code, text-code grounding, Synthetic data, etc. + A more comprehensive foundation for real-world applications such as Code Agents. Not only enhancing coding capabilities but also maintaining its strengths in mathematics and general competencies. + Long-context Support up to 128K tokens. + overrides: + parameters: + model: Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf + sha256: 1664fccab734674a50763490a8c6931b70e3f2f8ec10031b54806d30e5f956b6 + uri: huggingface://bartowski/Qwen2.5-Coder-7B-Instruct-GGUF/Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf ## SmolLM - &smollm url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From 56d8f5163c427eb0e0d3b9483aa4e585f571a0bf Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 15:12:35 +0200 Subject: [PATCH 044/122] models(gallery): add qwen2.5-math-72b-instruct (#3612) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 1f52fec8..945c45b9 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -84,6 +84,23 @@ - filename: Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf sha256: 1664fccab734674a50763490a8c6931b70e3f2f8ec10031b54806d30e5f956b6 uri: huggingface://bartowski/Qwen2.5-Coder-7B-Instruct-GGUF/Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-math-72b-instruct" + icon: http://qianwen-res.oss-accelerate-overseas.aliyuncs.com/Qwen2.5/qwen2.5-math-pipeline.jpeg + urls: + - https://huggingface.co/Qwen/Qwen2.5-Math-72B-Instruct + - https://huggingface.co/bartowski/Qwen2.5-Math-72B-Instruct-GGUF + description: | + In August 2024, we released the first series of mathematical LLMs - Qwen2-Math - of our Qwen family. A month later, we have upgraded it and open-sourced Qwen2.5-Math series, including base models Qwen2.5-Math-1.5B/7B/72B, instruction-tuned models Qwen2.5-Math-1.5B/7B/72B-Instruct, and mathematical reward model Qwen2.5-Math-RM-72B. + + Unlike Qwen2-Math series which only supports using Chain-of-Thught (CoT) to solve English math problems, Qwen2.5-Math series is expanded to support using both CoT and Tool-integrated Reasoning (TIR) to solve math problems in both Chinese and English. 
The Qwen2.5-Math series models have achieved significant performance improvements compared to the Qwen2-Math series models on the Chinese and English mathematics benchmarks with CoT + overrides: + parameters: + model: Qwen2.5-Math-72B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-Math-72B-Instruct-Q4_K_M.gguf + sha256: 5dee8a6e21d555577712b4f65565a3c3737a0d5d92f5a82970728c6d8e237f17 + uri: huggingface://bartowski/Qwen2.5-Math-72B-Instruct-GGUF/Qwen2.5-Math-72B-Instruct-Q4_K_M.gguf ## SmolLM - &smollm url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From 052af98dcd3a5d50cd1c7f2f0920b77e508ada5e Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 15:45:23 +0200 Subject: [PATCH 045/122] models(gallery): add qwen2.5-0.5b-instruct, qwen2.5-1.5b-instruct (#3613) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 945c45b9..adac3e51 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -101,6 +101,30 @@ - filename: Qwen2.5-Math-72B-Instruct-Q4_K_M.gguf sha256: 5dee8a6e21d555577712b4f65565a3c3737a0d5d92f5a82970728c6d8e237f17 uri: huggingface://bartowski/Qwen2.5-Math-72B-Instruct-GGUF/Qwen2.5-Math-72B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-0.5b-instruct" + urls: + - https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct + - https://huggingface.co/bartowski/Qwen2.5-0.5B-Instruct-GGUF + overrides: + parameters: + model: Qwen2.5-0.5B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-0.5B-Instruct-Q4_K_M.gguf + sha256: 6eb923e7d26e9cea28811e1a8e852009b21242fb157b26149d3b188f3a8c8653 + uri: huggingface://bartowski/Qwen2.5-0.5B-Instruct-GGUF/Qwen2.5-0.5B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-1.5b-instruct" + urls: + - https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct + - https://huggingface.co/bartowski/Qwen2.5-1.5B-Instruct-GGUF + overrides: + parameters: + model: Qwen2.5-1.5B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-1.5B-Instruct-Q4_K_M.gguf + sha256: 1adf0b11065d8ad2e8123ea110d1ec956dab4ab038eab665614adba04b6c3370 + uri: huggingface://bartowski/Qwen2.5-1.5B-Instruct-GGUF/Qwen2.5-1.5B-Instruct-Q4_K_M.gguf ## SmolLM - &smollm url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From 38cad0b8dc32e3ce8d8650718c16df6725cb63dc Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 17:10:43 +0200 Subject: [PATCH 046/122] models(gallery): add qwen2.5 32B, 72B, 32B Instruct (#3614) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index adac3e51..5304f9d2 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -125,6 +125,42 @@ - filename: Qwen2.5-1.5B-Instruct-Q4_K_M.gguf sha256: 1adf0b11065d8ad2e8123ea110d1ec956dab4ab038eab665614adba04b6c3370 uri: huggingface://bartowski/Qwen2.5-1.5B-Instruct-GGUF/Qwen2.5-1.5B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-32b" + urls: + - https://huggingface.co/Qwen/Qwen2.5-32B + - https://huggingface.co/mradermacher/Qwen2.5-32B-GGUF + overrides: + parameters: + model: Qwen2.5-32B.Q4_K_M.gguf + files: + - filename: Qwen2.5-32B.Q4_K_M.gguf + sha256: 02703e27c8b964db445444581a6937ad7538f0c32a100b26b49fa0e8ff527155 + uri: huggingface://mradermacher/Qwen2.5-32B-GGUF/Qwen2.5-32B.Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-32b-instruct" + urls: + - 
https://huggingface.co/Qwen/Qwen2.5-32B-Instruct + - https://huggingface.co/bartowski/Qwen2.5-32B-Instruct-GGUF + overrides: + parameters: + model: Qwen2.5-32B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-32B-Instruct-Q4_K_M.gguf + sha256: 2e5f6daea180dbc59f65a40641e94d3973b5dbaa32b3c0acf54647fa874e519e + uri: huggingface://bartowski/Qwen2.5-32B-Instruct-GGUF/Qwen2.5-32B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "qwen2.5-72b-instruct" + urls: + - https://huggingface.co/Qwen/Qwen2.5-72B-Instruct + - https://huggingface.co/bartowski/Qwen2.5-72B-Instruct-GGUF + overrides: + parameters: + model: Qwen2.5-72B-Instruct-Q4_K_M.gguf + files: + - filename: Qwen2.5-72B-Instruct-Q4_K_M.gguf + sha256: e4c8fad16946be8cf0bbf67eb8f4e18fc7415a5a6d2854b4cda453edb4082545 + uri: huggingface://bartowski/Qwen2.5-72B-Instruct-GGUF/Qwen2.5-72B-Instruct-Q4_K_M.gguf ## SmolLM - &smollm url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From c4cecba07fda9c9db738aaaaa40756fbee3e879b Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 17:19:53 +0200 Subject: [PATCH 047/122] models(gallery): add llama-3.1-supernova-lite-reflection-v1.0-i1 (#3615) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 5304f9d2..60eed4ce 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -601,6 +601,22 @@ - filename: Reflection-Llama-3.1-70B-q4_k_m.gguf sha256: 16064e07037883a750cfeae9a7be41143aa857dbac81c2e93c68e2f941dee7b2 uri: huggingface://senseable/Reflection-Llama-3.1-70B-gguf/Reflection-Llama-3.1-70B-q4_k_m.gguf +- !!merge <<: *llama31 + name: "llama-3.1-supernova-lite-reflection-v1.0-i1" + url: "github:mudler/LocalAI/gallery/llama3.1-reflective.yaml@master" + icon: https://i.ibb.co/r072p7j/eopi-ZVu-SQ0-G-Cav78-Byq-Tg.png + urls: + - https://huggingface.co/SE6446/Llama-3.1-SuperNova-Lite-Reflection-V1.0 + - https://huggingface.co/mradermacher/Llama-3.1-SuperNova-Lite-Reflection-V1.0-i1-GGUF + description: | + This model is a LoRA adaptation of arcee-ai/Llama-3.1-SuperNova-Lite on thesven/Reflective-MAGLLAMA-v0.1.1. This has been a simple experiment into reflection and the model appears to perform adequately, though I am unsure if it is a large improvement. 
+ overrides: + parameters: + model: Llama-3.1-SuperNova-Lite-Reflection-V1.0.i1-Q4_K_M.gguf + files: + - filename: Llama-3.1-SuperNova-Lite-Reflection-V1.0.i1-Q4_K_M.gguf + sha256: 0c4531fe553d00142808e1bc7348ae92d400794c5b64d2db1a974718324dfe9a + uri: huggingface://mradermacher/Llama-3.1-SuperNova-Lite-Reflection-V1.0-i1-GGUF/Llama-3.1-SuperNova-Lite-Reflection-V1.0.i1-Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "humanish-roleplay-llama-3.1-8b-i1" From e24654ada064f0b7f6a2eb2be29b8136e52ccc0b Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 17:23:30 +0200 Subject: [PATCH 048/122] models(gallery): add llama-3.1-supernova-lite (#3616) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 60eed4ce..c05593b1 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -617,6 +617,25 @@ - filename: Llama-3.1-SuperNova-Lite-Reflection-V1.0.i1-Q4_K_M.gguf sha256: 0c4531fe553d00142808e1bc7348ae92d400794c5b64d2db1a974718324dfe9a uri: huggingface://mradermacher/Llama-3.1-SuperNova-Lite-Reflection-V1.0-i1-GGUF/Llama-3.1-SuperNova-Lite-Reflection-V1.0.i1-Q4_K_M.gguf +- !!merge <<: *llama31 + name: "llama-3.1-supernova-lite" + icon: https://i.ibb.co/r072p7j/eopi-ZVu-SQ0-G-Cav78-Byq-Tg.png + urls: + - https://huggingface.co/arcee-ai/Llama-3.1-SuperNova-Lite + - https://huggingface.co/arcee-ai/Llama-3.1-SuperNova-Lite-GGUF + description: | + Llama-3.1-SuperNova-Lite is an 8B parameter model developed by Arcee.ai, based on the Llama-3.1-8B-Instruct architecture. It is a distilled version of the larger Llama-3.1-405B-Instruct model, leveraging offline logits extracted from the 405B parameter variant. This 8B variation of Llama-3.1-SuperNova maintains high performance while offering exceptional instruction-following capabilities and domain-specific adaptability. + + The model was trained using a state-of-the-art distillation pipeline and an instruction dataset generated with EvolKit, ensuring accuracy and efficiency across a wide range of tasks. For more information on its training, visit blog.arcee.ai. + + Llama-3.1-SuperNova-Lite excels in both benchmark performance and real-world applications, providing the power of large-scale models in a more compact, efficient form ideal for organizations seeking high performance with reduced resource requirements. 
+ overrides: + parameters: + model: supernova-lite-v1.Q4_K_M.gguf + files: + - filename: supernova-lite-v1.Q4_K_M.gguf + sha256: 237b7b0b704d294f92f36c576cc8fdc10592f95168a5ad0f075a2d8edf20da4d + uri: huggingface://arcee-ai/Llama-3.1-SuperNova-Lite-GGUF/supernova-lite-v1.Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "humanish-roleplay-llama-3.1-8b-i1" From f55053bfbaa9a71ea72b9efb0aa4f5347dc34574 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 17:26:59 +0200 Subject: [PATCH 049/122] models(gallery): add llama3.1-8b-shiningvaliant2 (#3617) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index c05593b1..3c3b1a23 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -636,6 +636,24 @@ - filename: supernova-lite-v1.Q4_K_M.gguf sha256: 237b7b0b704d294f92f36c576cc8fdc10592f95168a5ad0f075a2d8edf20da4d uri: huggingface://arcee-ai/Llama-3.1-SuperNova-Lite-GGUF/supernova-lite-v1.Q4_K_M.gguf +- !!merge <<: *llama31 + name: "llama3.1-8b-shiningvaliant2" + icon: https://cdn-uploads.huggingface.co/production/uploads/63444f2687964b331809eb55/EXX7TKbB-R6arxww2mk0R.jpeg + urls: + - https://huggingface.co/ValiantLabs/Llama3.1-8B-ShiningValiant2 + - https://huggingface.co/bartowski/Llama3.1-8B-ShiningValiant2-GGUF + description: | + Shining Valiant 2 is a chat model built on Llama 3.1 8b, finetuned on our data for friendship, insight, knowledge and enthusiasm. + + Finetuned on meta-llama/Meta-Llama-3.1-8B-Instruct for best available general performance + Trained on a variety of high quality data; focused on science, engineering, technical knowledge, and structured reasoning + overrides: + parameters: + model: Llama3.1-8B-ShiningValiant2-Q4_K_M.gguf + files: + - filename: Llama3.1-8B-ShiningValiant2-Q4_K_M.gguf + sha256: 9369eb97922a9f01e4eae610e3d7aaeca30762d78d9239884179451d60bdbdd2 + uri: huggingface://bartowski/Llama3.1-8B-ShiningValiant2-GGUF/Llama3.1-8B-ShiningValiant2-Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "humanish-roleplay-llama-3.1-8b-i1" From 415cf31aa3e51aa44f1097d0459f8d410e3adb27 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 17:33:29 +0200 Subject: [PATCH 050/122] models(gallery): add buddy2 (#3618) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 3c3b1a23..b46967ad 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -2065,6 +2065,20 @@ - filename: datagemma-rig-27b-it-Q4_K_M.gguf sha256: a6738ffbb49b6c46d220e2793df85c0538e9ac72398e32a0914ee5e55c3096ad uri: huggingface://bartowski/datagemma-rig-27b-it-GGUF/datagemma-rig-27b-it-Q4_K_M.gguf +- !!merge <<: *gemma + name: "buddy-2b-v1" + urls: + - https://huggingface.co/TheDrummer/Buddy-2B-v1 + - https://huggingface.co/bartowski/Buddy-2B-v1-GGUF + description: | + Buddy is designed as an empathetic language model, aimed at fostering introspection, self-reflection, and personal growth through thoughtful conversation. Buddy won't judge and it won't dismiss your concerns. Get some self-care with Buddy. 
+ overrides: + parameters: + model: Buddy-2B-v1-Q4_K_M.gguf + files: + - filename: Buddy-2B-v1-Q4_K_M.gguf + sha256: 9bd25ed907d1a3c2e07fe09399a9b3aec107d368c29896e2c46facede5b7e3d5 + uri: huggingface://bartowski/Buddy-2B-v1-GGUF/Buddy-2B-v1-Q4_K_M.gguf - &llama3 url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master" icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png From 00d6c2a96683ffc6d169ecaeeaa9d5c5bb8384f1 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 17:35:06 +0200 Subject: [PATCH 051/122] models(gallery): add llama3.1-reflective config Signed-off-by: Ettore Di Giacinto --- gallery/llama3.1-reflective.yaml | 65 ++++++++++++++++++++++++++++++++ 1 file changed, 65 insertions(+) create mode 100644 gallery/llama3.1-reflective.yaml diff --git a/gallery/llama3.1-reflective.yaml b/gallery/llama3.1-reflective.yaml new file mode 100644 index 00000000..86a91d8b --- /dev/null +++ b/gallery/llama3.1-reflective.yaml @@ -0,0 +1,65 @@ +--- +name: "llama3-instruct" + +config_file: | + mmap: true + cutstrings: + - (.*?) + function: + disable_no_action: true + grammar: + disable: true + response_regex: + - \w+)>(?P.*) + template: + chat_message: | + <|start_header_id|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}<|end_header_id|> + + {{ if .FunctionCall -}} + Function call: + {{ else if eq .RoleName "tool" -}} + Function response: + {{ end -}} + {{ if .Content -}} + {{.Content -}} + {{ else if .FunctionCall -}} + {{ toJson .FunctionCall -}} + {{ end -}} + <|eot_id|> + function: | + <|start_header_id|>system<|end_header_id|> + + You have access to the following functions: + + {{range .Functions}} + Use the function '{{.Name}}' to '{{.Description}}' + {{toJson .Parameters}} + {{end}} + + Think very carefully before calling functions. 
+ If a you choose to call a function ONLY reply in the following format with no prefix or suffix: + + {{`{{"example_name": "example_value"}}`}} + + Reminder: + - If looking for real time information use relevant functions before falling back to searching on internet + - Function calls MUST follow the specified format, start with + - Required parameters MUST be specified + - Only call one function at a time + - Put the entire function call reply on one line + <|eot_id|> + {{.Input }} + <|start_header_id|>assistant<|end_header_id|> + chat: | + {{.Input }} + <|start_header_id|>assistant<|end_header_id|> + + completion: | + {{.Input}} + context_size: 8192 + f16: true + stopwords: + - <|im_end|> + - + - "<|eot_id|>" + - <|end_of_text|> From 6c6cd8bbe0af9c93560b5eb20b8153d53625ac63 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 18:15:51 +0200 Subject: [PATCH 052/122] models(gallery): add llama-3.1-8b-arliai-rpmax-v1.1 (#3619) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index b46967ad..59cab687 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -956,6 +956,20 @@ - filename: Llama-3.1-8B-Stheno-v3.4-Q4_K_M-imat.gguf sha256: 830d4858aa11a654f82f69fa40dee819edf9ecf54213057648304eb84b8dd5eb uri: huggingface://Lewdiculous/Llama-3.1-8B-Stheno-v3.4-GGUF-IQ-Imatrix/Llama-3.1-8B-Stheno-v3.4-Q4_K_M-imat.gguf +- !!merge <<: *llama31 + name: "llama-3.1-8b-arliai-rpmax-v1.1" + urls: + - https://huggingface.co/ArliAI/Llama-3.1-8B-ArliAI-RPMax-v1.1 + - https://huggingface.co/bartowski/Llama-3.1-8B-ArliAI-RPMax-v1.1-GGUF + description: | + RPMax is a series of models that are trained on a diverse set of curated creative writing and RP datasets with a focus on variety and deduplication. This model is designed to be highly creative and non-repetitive by making sure no two entries in the dataset have repeated characters or situations, which makes sure the model does not latch on to a certain personality and be capable of understanding and acting appropriately to any characters or situations. + overrides: + parameters: + model: Llama-3.1-8B-ArliAI-RPMax-v1.1-Q4_K_M.gguf + files: + - filename: Llama-3.1-8B-ArliAI-RPMax-v1.1-Q4_K_M.gguf + sha256: 0a601c7341228d9160332965298d799369a1dc2b7080771fb8051bdeb556b30c + uri: huggingface://bartowski/Llama-3.1-8B-ArliAI-RPMax-v1.1-GGUF/Llama-3.1-8B-ArliAI-RPMax-v1.1-Q4_K_M.gguf - &deepseek ## Deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" From bf8e50a11d2aa2ae3e27c770812a402c5c8cc6eb Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 20 Sep 2024 18:16:01 +0200 Subject: [PATCH 053/122] chore(docs): add Vulkan images links (#3620) Signed-off-by: Ettore Di Giacinto --- docs/content/docs/getting-started/container-images.md | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/docs/content/docs/getting-started/container-images.md b/docs/content/docs/getting-started/container-images.md index 86fe31d1..25385f23 100644 --- a/docs/content/docs/getting-started/container-images.md +++ b/docs/content/docs/getting-started/container-images.md @@ -154,7 +154,7 @@ Images are available with and without python dependencies. Note that images with Images with `core` in the tag are smaller and do not contain any python dependencies. 
-{{< tabs tabTotal="6" >}} +{{< tabs tabTotal="7" >}} {{% tab tabName="Vanilla / CPU Images" %}} | Description | Quay | Docker Hub | @@ -227,6 +227,15 @@ Images with `core` in the tag are smaller and do not contain any python dependen {{% /tab %}} + +{{% tab tabName="Vulkan Images" %}} +| Description | Quay | Docker Hub | +| --- | --- |-------------------------------------------------------------| +| Latest images from the branch (development) | `quay.io/go-skynet/local-ai: master-vulkan-ffmpeg-core ` | `localai/localai: master-vulkan-ffmpeg-core ` | +| Latest tag | `quay.io/go-skynet/local-ai: latest-vulkan-ffmpeg-core ` | `localai/localai: latest-vulkan-ffmpeg-core` | +| Versioned image including FFMpeg, no python | `quay.io/go-skynet/local-ai:{{< version >}}-vulkan-fmpeg-core` | `localai/localai:{{< version >}}-vulkan-fmpeg-core` | +{{% /tab %}} + {{< /tabs >}} ## See Also From cef7f8a0146474e1e30676ea820f8b5047bc73b2 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Fri, 20 Sep 2024 23:41:13 +0200 Subject: [PATCH 054/122] chore: :arrow_up: Update ggerganov/whisper.cpp to `34972dbe221709323714fc8402f2e24041d48213` (#3623) :arrow_up: Update ggerganov/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 53def128..89b0d4aa 100644 --- a/Makefile +++ b/Makefile @@ -16,7 +16,7 @@ RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 # whisper.cpp version WHISPER_REPO?=https://github.com/ggerganov/whisper.cpp -WHISPER_CPP_VERSION?=5b1ce40fa882e9cb8630b48032067a1ed2f1534f +WHISPER_CPP_VERSION?=34972dbe221709323714fc8402f2e24041d48213 # bert.cpp version BERT_REPO?=https://github.com/go-skynet/go-bert.cpp From 54f2657870c73a100c69ad55c862cfc41f9da028 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sat, 21 Sep 2024 10:09:41 +0200 Subject: [PATCH 055/122] chore: :arrow_up: Update ggerganov/llama.cpp to `63351143b2ea5efe9f8b9c61f553af8a51f1deff` (#3622) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 89b0d4aa..83fb1215 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=6026da52d6942b253df835070619775d849d0258 +CPPLLAMA_VERSION?=63351143b2ea5efe9f8b9c61f553af8a51f1deff # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From c22b3187a7179f1dc721d71c4e18742e173275aa Mon Sep 17 00:00:00 2001 From: lnyxaris Date: Sat, 21 Sep 2024 10:10:27 +0200 Subject: [PATCH 056/122] Fix NeuralDaredevil URL (#3621) Signed-off-by: lnyxaris --- gallery/index.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gallery/index.yaml b/gallery/index.yaml index 59cab687..7dab9eb7 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -3972,7 +3972,7 @@ files: - filename: NeuralDaredevil-8B-abliterated.Q4_K_M.gguf sha256: 12f4af9d66817d7d300bd9a181e4fe66f7ecf7ea972049f2cbd0554cdc3ecf05 - uri: 
huggingface://QuantFactory/NeuralDaredevil-8B-abliterated-GGUF/Poppy_Porpoise-0.85-L3-8B-Q4_K_M-imat.gguf + uri: huggingface://QuantFactory/NeuralDaredevil-8B-abliterated-GGUF/NeuralDaredevil-8B-abliterated.Q4_K_M.gguf - !!merge <<: *llama3 name: "llama-3-8b-instruct-mopeymule" urls: From 5c3d1d81e63e823278c8630b4a2a3a93ddf6af0c Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 21 Sep 2024 16:04:04 +0200 Subject: [PATCH 057/122] fix(parler-tts): fix install with sycl (#3624) Signed-off-by: Ettore Di Giacinto --- backend/python/parler-tts/install.sh | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/backend/python/parler-tts/install.sh b/backend/python/parler-tts/install.sh index 002472a2..aae690c4 100755 --- a/backend/python/parler-tts/install.sh +++ b/backend/python/parler-tts/install.sh @@ -15,5 +15,12 @@ installRequirements # https://github.com/descriptinc/audiotools/issues/101 # incompatible protobuf versions. -PYDIR=$(ls ${MY_DIR}/venv/lib) -curl -L https://raw.githubusercontent.com/protocolbuffers/protobuf/main/python/google/protobuf/internal/builder.py -o ${MY_DIR}/venv/lib/${PYDIR}/site-packages/google/protobuf/internal/builder.py +PYDIR=python3.10 +pyenv="${MY_DIR}/venv/lib/${PYDIR}/site-packages/google/protobuf/internal/" + +if [ ! -d ${pyenv} ]; then + echo "(parler-tts/install.sh): Error: ${pyenv} does not exist" + exit 1 +fi + +curl -L https://raw.githubusercontent.com/protocolbuffers/protobuf/main/python/google/protobuf/internal/builder.py -o ${pyenv}/builder.py From 20c0e128c00601edb7e46089c1e32672f353c52e Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 21 Sep 2024 21:52:12 +0200 Subject: [PATCH 058/122] fix(sycl): downgrade pypinyin melotts requires pypinyin 0.50 Signed-off-by: Ettore Di Giacinto --- backend/python/openvoice/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/openvoice/requirements-intel.txt b/backend/python/openvoice/requirements-intel.txt index cea7de0b..a9a4cc20 100644 --- a/backend/python/openvoice/requirements-intel.txt +++ b/backend/python/openvoice/requirements-intel.txt @@ -15,7 +15,7 @@ unidecode==1.3.7 whisper-timestamped==1.15.4 openai python-dotenv -pypinyin==0.53.0 +pypinyin==0.50.0 cn2an==0.5.22 jieba==0.42.1 gradio==4.38.1 From 1f43678d5311e7bdc434768ea74c97e49a6ebc7e Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sun, 22 Sep 2024 00:03:23 +0200 Subject: [PATCH 059/122] chore: :arrow_up: Update ggerganov/llama.cpp to `d09770cae71b416c032ec143dda530f7413c4038` (#3626) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 83fb1215..51755e71 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=63351143b2ea5efe9f8b9c61f553af8a51f1deff +CPPLLAMA_VERSION?=d09770cae71b416c032ec143dda530f7413c4038 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From ee21b00a8d6b652b61d075e3bba1b88c8d52488c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Serta=C3=A7=20=C3=96zercan?= <852750+sozercan@users.noreply.github.com> Date: Sun, 22 Sep 2024 01:03:30 -0700 Subject: [PATCH 060/122] 
feat: auto load into memory on startup (#3627) Signed-off-by: Sertac Ozercan --- core/backend/embeddings.go | 2 +- core/backend/image.go | 2 +- core/backend/llm.go | 2 +- core/backend/options.go | 2 +- core/backend/rerank.go | 2 +- core/backend/soundgeneration.go | 2 +- core/backend/tts.go | 2 +- core/cli/run.go | 2 + core/config/application_config.go | 7 + core/startup/startup.go | 449 ++++++++++++++++-------------- 10 files changed, 259 insertions(+), 213 deletions(-) diff --git a/core/backend/embeddings.go b/core/backend/embeddings.go index 31b10a19..9f0f8be9 100644 --- a/core/backend/embeddings.go +++ b/core/backend/embeddings.go @@ -12,7 +12,7 @@ import ( func ModelEmbedding(s string, tokens []int, loader *model.ModelLoader, backendConfig config.BackendConfig, appConfig *config.ApplicationConfig) (func() ([]float32, error), error) { modelFile := backendConfig.Model - grpcOpts := gRPCModelOpts(backendConfig) + grpcOpts := GRPCModelOpts(backendConfig) var inferenceModel interface{} var err error diff --git a/core/backend/image.go b/core/backend/image.go index 8c3f56b3..5c2a950c 100644 --- a/core/backend/image.go +++ b/core/backend/image.go @@ -12,7 +12,7 @@ func ImageGeneration(height, width, mode, step, seed int, positive_prompt, negat if *threads == 0 && appConfig.Threads != 0 { threads = &appConfig.Threads } - gRPCOpts := gRPCModelOpts(backendConfig) + gRPCOpts := GRPCModelOpts(backendConfig) opts := modelOpts(backendConfig, appConfig, []model.Option{ model.WithBackendString(backendConfig.Backend), model.WithAssetDir(appConfig.AssetsDestination), diff --git a/core/backend/llm.go b/core/backend/llm.go index f74071ba..cac9beba 100644 --- a/core/backend/llm.go +++ b/core/backend/llm.go @@ -37,7 +37,7 @@ func ModelInference(ctx context.Context, s string, messages []schema.Message, im if *threads == 0 && o.Threads != 0 { threads = &o.Threads } - grpcOpts := gRPCModelOpts(c) + grpcOpts := GRPCModelOpts(c) var inferenceModel grpc.Backend var err error diff --git a/core/backend/options.go b/core/backend/options.go index d986b8e6..d431aab6 100644 --- a/core/backend/options.go +++ b/core/backend/options.go @@ -44,7 +44,7 @@ func getSeed(c config.BackendConfig) int32 { return seed } -func gRPCModelOpts(c config.BackendConfig) *pb.ModelOptions { +func GRPCModelOpts(c config.BackendConfig) *pb.ModelOptions { b := 512 if c.Batch != 0 { b = c.Batch diff --git a/core/backend/rerank.go b/core/backend/rerank.go index 1b718be2..a7573ade 100644 --- a/core/backend/rerank.go +++ b/core/backend/rerank.go @@ -15,7 +15,7 @@ func Rerank(backend, modelFile string, request *proto.RerankRequest, loader *mod return nil, fmt.Errorf("backend is required") } - grpcOpts := gRPCModelOpts(backendConfig) + grpcOpts := GRPCModelOpts(backendConfig) opts := modelOpts(config.BackendConfig{}, appConfig, []model.Option{ model.WithBackendString(bb), diff --git a/core/backend/soundgeneration.go b/core/backend/soundgeneration.go index abd5221b..b6a1c827 100644 --- a/core/backend/soundgeneration.go +++ b/core/backend/soundgeneration.go @@ -29,7 +29,7 @@ func SoundGeneration( return "", nil, fmt.Errorf("backend is a required parameter") } - grpcOpts := gRPCModelOpts(backendConfig) + grpcOpts := GRPCModelOpts(backendConfig) opts := modelOpts(config.BackendConfig{}, appConfig, []model.Option{ model.WithBackendString(backend), model.WithModel(modelFile), diff --git a/core/backend/tts.go b/core/backend/tts.go index 258882ae..2401748c 100644 --- a/core/backend/tts.go +++ b/core/backend/tts.go @@ -28,7 +28,7 @@ func ModelTTS( bb = 
model.PiperBackend } - grpcOpts := gRPCModelOpts(backendConfig) + grpcOpts := GRPCModelOpts(backendConfig) opts := modelOpts(config.BackendConfig{}, appConfig, []model.Option{ model.WithBackendString(bb), diff --git a/core/cli/run.go b/core/cli/run.go index afb7204c..a67839a0 100644 --- a/core/cli/run.go +++ b/core/cli/run.go @@ -69,6 +69,7 @@ type RunCMD struct { WatchdogBusyTimeout string `env:"LOCALAI_WATCHDOG_BUSY_TIMEOUT,WATCHDOG_BUSY_TIMEOUT" default:"5m" help:"Threshold beyond which a busy backend should be stopped" group:"backends"` Federated bool `env:"LOCALAI_FEDERATED,FEDERATED" help:"Enable federated instance" group:"federated"` DisableGalleryEndpoint bool `env:"LOCALAI_DISABLE_GALLERY_ENDPOINT,DISABLE_GALLERY_ENDPOINT" help:"Disable the gallery endpoints" group:"api"` + LoadToMemory []string `env:"LOCALAI_LOAD_TO_MEMORY,LOAD_TO_MEMORY" help:"A list of models to load into memory at startup" group:"models"` } func (r *RunCMD) Run(ctx *cliContext.Context) error { @@ -104,6 +105,7 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error { config.WithDisableApiKeyRequirementForHttpGet(r.DisableApiKeyRequirementForHttpGet), config.WithHttpGetExemptedEndpoints(r.HttpGetExemptedEndpoints), config.WithP2PNetworkID(r.Peer2PeerNetworkID), + config.WithLoadToMemory(r.LoadToMemory), } token := "" diff --git a/core/config/application_config.go b/core/config/application_config.go index afbf325f..2af0c7ae 100644 --- a/core/config/application_config.go +++ b/core/config/application_config.go @@ -41,6 +41,7 @@ type ApplicationConfig struct { DisableApiKeyRequirementForHttpGet bool HttpGetExemptedEndpoints []*regexp.Regexp DisableGalleryEndpoint bool + LoadToMemory []string ModelLibraryURL string @@ -331,6 +332,12 @@ func WithOpaqueErrors(opaque bool) AppOption { } } +func WithLoadToMemory(models []string) AppOption { + return func(o *ApplicationConfig) { + o.LoadToMemory = models + } +} + func WithSubtleKeyComparison(subtle bool) AppOption { return func(o *ApplicationConfig) { o.UseSubtleKeyComparison = subtle diff --git a/core/startup/startup.go b/core/startup/startup.go index 3565d196..b7b9ce8f 100644 --- a/core/startup/startup.go +++ b/core/startup/startup.go @@ -1,206 +1,243 @@ -package startup - -import ( - "fmt" - "os" - - "github.com/mudler/LocalAI/core" - "github.com/mudler/LocalAI/core/config" - "github.com/mudler/LocalAI/core/services" - "github.com/mudler/LocalAI/internal" - "github.com/mudler/LocalAI/pkg/assets" - "github.com/mudler/LocalAI/pkg/library" - "github.com/mudler/LocalAI/pkg/model" - pkgStartup "github.com/mudler/LocalAI/pkg/startup" - "github.com/mudler/LocalAI/pkg/xsysinfo" - "github.com/rs/zerolog/log" -) - -func Startup(opts ...config.AppOption) (*config.BackendConfigLoader, *model.ModelLoader, *config.ApplicationConfig, error) { - options := config.NewApplicationConfig(opts...) 
- - log.Info().Msgf("Starting LocalAI using %d threads, with models path: %s", options.Threads, options.ModelPath) - log.Info().Msgf("LocalAI version: %s", internal.PrintableVersion()) - caps, err := xsysinfo.CPUCapabilities() - if err == nil { - log.Debug().Msgf("CPU capabilities: %v", caps) - } - gpus, err := xsysinfo.GPUs() - if err == nil { - log.Debug().Msgf("GPU count: %d", len(gpus)) - for _, gpu := range gpus { - log.Debug().Msgf("GPU: %s", gpu.String()) - } - } - - // Make sure directories exists - if options.ModelPath == "" { - return nil, nil, nil, fmt.Errorf("options.ModelPath cannot be empty") - } - err = os.MkdirAll(options.ModelPath, 0750) - if err != nil { - return nil, nil, nil, fmt.Errorf("unable to create ModelPath: %q", err) - } - if options.ImageDir != "" { - err := os.MkdirAll(options.ImageDir, 0750) - if err != nil { - return nil, nil, nil, fmt.Errorf("unable to create ImageDir: %q", err) - } - } - if options.AudioDir != "" { - err := os.MkdirAll(options.AudioDir, 0750) - if err != nil { - return nil, nil, nil, fmt.Errorf("unable to create AudioDir: %q", err) - } - } - if options.UploadDir != "" { - err := os.MkdirAll(options.UploadDir, 0750) - if err != nil { - return nil, nil, nil, fmt.Errorf("unable to create UploadDir: %q", err) - } - } - - if err := pkgStartup.InstallModels(options.Galleries, options.ModelLibraryURL, options.ModelPath, options.EnforcePredownloadScans, nil, options.ModelsURL...); err != nil { - log.Error().Err(err).Msg("error installing models") - } - - cl := config.NewBackendConfigLoader(options.ModelPath) - ml := model.NewModelLoader(options.ModelPath) - - configLoaderOpts := options.ToConfigLoaderOptions() - - if err := cl.LoadBackendConfigsFromPath(options.ModelPath, configLoaderOpts...); err != nil { - log.Error().Err(err).Msg("error loading config files") - } - - if options.ConfigFile != "" { - if err := cl.LoadMultipleBackendConfigsSingleFile(options.ConfigFile, configLoaderOpts...); err != nil { - log.Error().Err(err).Msg("error loading config file") - } - } - - if err := cl.Preload(options.ModelPath); err != nil { - log.Error().Err(err).Msg("error downloading models") - } - - if options.PreloadJSONModels != "" { - if err := services.ApplyGalleryFromString(options.ModelPath, options.PreloadJSONModels, options.EnforcePredownloadScans, options.Galleries); err != nil { - return nil, nil, nil, err - } - } - - if options.PreloadModelsFromPath != "" { - if err := services.ApplyGalleryFromFile(options.ModelPath, options.PreloadModelsFromPath, options.EnforcePredownloadScans, options.Galleries); err != nil { - return nil, nil, nil, err - } - } - - if options.Debug { - for _, v := range cl.GetAllBackendConfigs() { - log.Debug().Msgf("Model: %s (config: %+v)", v.Name, v) - } - } - - if options.AssetsDestination != "" { - // Extract files from the embedded FS - err := assets.ExtractFiles(options.BackendAssets, options.AssetsDestination) - log.Debug().Msgf("Extracting backend assets files to %s", options.AssetsDestination) - if err != nil { - log.Warn().Msgf("Failed extracting backend assets files: %s (might be required for some backends to work properly)", err) - } - } - - if options.LibPath != "" { - // If there is a lib directory, set LD_LIBRARY_PATH to include it - err := library.LoadExternal(options.LibPath) - if err != nil { - log.Error().Err(err).Str("LibPath", options.LibPath).Msg("Error while loading external libraries") - } - } - - // turn off any process that was started by GRPC if the context is canceled - go func() { - 
<-options.Context.Done() - log.Debug().Msgf("Context canceled, shutting down") - err := ml.StopAllGRPC() - if err != nil { - log.Error().Err(err).Msg("error while stopping all grpc backends") - } - }() - - if options.WatchDog { - wd := model.NewWatchDog( - ml, - options.WatchDogBusyTimeout, - options.WatchDogIdleTimeout, - options.WatchDogBusy, - options.WatchDogIdle) - ml.SetWatchDog(wd) - go wd.Run() - go func() { - <-options.Context.Done() - log.Debug().Msgf("Context canceled, shutting down") - wd.Shutdown() - }() - } - - // Watch the configuration directory - startWatcher(options) - - log.Info().Msg("core/startup process completed!") - return cl, ml, options, nil -} - -func startWatcher(options *config.ApplicationConfig) { - if options.DynamicConfigsDir == "" { - // No need to start the watcher if the directory is not set - return - } - - if _, err := os.Stat(options.DynamicConfigsDir); err != nil { - if os.IsNotExist(err) { - // We try to create the directory if it does not exist and was specified - if err := os.MkdirAll(options.DynamicConfigsDir, 0700); err != nil { - log.Error().Err(err).Msg("failed creating DynamicConfigsDir") - } - } else { - // something else happened, we log the error and don't start the watcher - log.Error().Err(err).Msg("failed to read DynamicConfigsDir, watcher will not be started") - return - } - } - - configHandler := newConfigFileHandler(options) - if err := configHandler.Watch(); err != nil { - log.Error().Err(err).Msg("failed creating watcher") - } -} - -// In Lieu of a proper DI framework, this function wires up the Application manually. -// This is in core/startup rather than core/state.go to keep package references clean! -func createApplication(appConfig *config.ApplicationConfig) *core.Application { - app := &core.Application{ - ApplicationConfig: appConfig, - BackendConfigLoader: config.NewBackendConfigLoader(appConfig.ModelPath), - ModelLoader: model.NewModelLoader(appConfig.ModelPath), - } - - var err error - - // app.EmbeddingsBackendService = backend.NewEmbeddingsBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) - // app.ImageGenerationBackendService = backend.NewImageGenerationBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) - // app.LLMBackendService = backend.NewLLMBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) - // app.TranscriptionBackendService = backend.NewTranscriptionBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) - // app.TextToSpeechBackendService = backend.NewTextToSpeechBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) - - app.BackendMonitorService = services.NewBackendMonitorService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) - app.GalleryService = services.NewGalleryService(app.ApplicationConfig) - // app.OpenAIService = services.NewOpenAIService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig, app.LLMBackendService) - - app.LocalAIMetricsService, err = services.NewLocalAIMetricsService() - if err != nil { - log.Error().Err(err).Msg("encountered an error initializing metrics service, startup will continue but metrics will not be tracked.") - } - - return app -} +package startup + +import ( + "fmt" + "os" + + "github.com/mudler/LocalAI/core" + "github.com/mudler/LocalAI/core/backend" + "github.com/mudler/LocalAI/core/config" + "github.com/mudler/LocalAI/core/services" + "github.com/mudler/LocalAI/internal" + 
"github.com/mudler/LocalAI/pkg/assets" + "github.com/mudler/LocalAI/pkg/library" + "github.com/mudler/LocalAI/pkg/model" + pkgStartup "github.com/mudler/LocalAI/pkg/startup" + "github.com/mudler/LocalAI/pkg/xsysinfo" + "github.com/rs/zerolog/log" +) + +func Startup(opts ...config.AppOption) (*config.BackendConfigLoader, *model.ModelLoader, *config.ApplicationConfig, error) { + options := config.NewApplicationConfig(opts...) + + log.Info().Msgf("Starting LocalAI using %d threads, with models path: %s", options.Threads, options.ModelPath) + log.Info().Msgf("LocalAI version: %s", internal.PrintableVersion()) + caps, err := xsysinfo.CPUCapabilities() + if err == nil { + log.Debug().Msgf("CPU capabilities: %v", caps) + } + gpus, err := xsysinfo.GPUs() + if err == nil { + log.Debug().Msgf("GPU count: %d", len(gpus)) + for _, gpu := range gpus { + log.Debug().Msgf("GPU: %s", gpu.String()) + } + } + + // Make sure directories exists + if options.ModelPath == "" { + return nil, nil, nil, fmt.Errorf("options.ModelPath cannot be empty") + } + err = os.MkdirAll(options.ModelPath, 0750) + if err != nil { + return nil, nil, nil, fmt.Errorf("unable to create ModelPath: %q", err) + } + if options.ImageDir != "" { + err := os.MkdirAll(options.ImageDir, 0750) + if err != nil { + return nil, nil, nil, fmt.Errorf("unable to create ImageDir: %q", err) + } + } + if options.AudioDir != "" { + err := os.MkdirAll(options.AudioDir, 0750) + if err != nil { + return nil, nil, nil, fmt.Errorf("unable to create AudioDir: %q", err) + } + } + if options.UploadDir != "" { + err := os.MkdirAll(options.UploadDir, 0750) + if err != nil { + return nil, nil, nil, fmt.Errorf("unable to create UploadDir: %q", err) + } + } + + if err := pkgStartup.InstallModels(options.Galleries, options.ModelLibraryURL, options.ModelPath, options.EnforcePredownloadScans, nil, options.ModelsURL...); err != nil { + log.Error().Err(err).Msg("error installing models") + } + + cl := config.NewBackendConfigLoader(options.ModelPath) + ml := model.NewModelLoader(options.ModelPath) + + configLoaderOpts := options.ToConfigLoaderOptions() + + if err := cl.LoadBackendConfigsFromPath(options.ModelPath, configLoaderOpts...); err != nil { + log.Error().Err(err).Msg("error loading config files") + } + + if options.ConfigFile != "" { + if err := cl.LoadMultipleBackendConfigsSingleFile(options.ConfigFile, configLoaderOpts...); err != nil { + log.Error().Err(err).Msg("error loading config file") + } + } + + if err := cl.Preload(options.ModelPath); err != nil { + log.Error().Err(err).Msg("error downloading models") + } + + if options.PreloadJSONModels != "" { + if err := services.ApplyGalleryFromString(options.ModelPath, options.PreloadJSONModels, options.EnforcePredownloadScans, options.Galleries); err != nil { + return nil, nil, nil, err + } + } + + if options.PreloadModelsFromPath != "" { + if err := services.ApplyGalleryFromFile(options.ModelPath, options.PreloadModelsFromPath, options.EnforcePredownloadScans, options.Galleries); err != nil { + return nil, nil, nil, err + } + } + + if options.Debug { + for _, v := range cl.GetAllBackendConfigs() { + log.Debug().Msgf("Model: %s (config: %+v)", v.Name, v) + } + } + + if options.AssetsDestination != "" { + // Extract files from the embedded FS + err := assets.ExtractFiles(options.BackendAssets, options.AssetsDestination) + log.Debug().Msgf("Extracting backend assets files to %s", options.AssetsDestination) + if err != nil { + log.Warn().Msgf("Failed extracting backend assets files: %s (might be required for some 
backends to work properly)", err) + } + } + + if options.LibPath != "" { + // If there is a lib directory, set LD_LIBRARY_PATH to include it + err := library.LoadExternal(options.LibPath) + if err != nil { + log.Error().Err(err).Str("LibPath", options.LibPath).Msg("Error while loading external libraries") + } + } + + // turn off any process that was started by GRPC if the context is canceled + go func() { + <-options.Context.Done() + log.Debug().Msgf("Context canceled, shutting down") + err := ml.StopAllGRPC() + if err != nil { + log.Error().Err(err).Msg("error while stopping all grpc backends") + } + }() + + if options.WatchDog { + wd := model.NewWatchDog( + ml, + options.WatchDogBusyTimeout, + options.WatchDogIdleTimeout, + options.WatchDogBusy, + options.WatchDogIdle) + ml.SetWatchDog(wd) + go wd.Run() + go func() { + <-options.Context.Done() + log.Debug().Msgf("Context canceled, shutting down") + wd.Shutdown() + }() + } + + if options.LoadToMemory != nil { + for _, m := range options.LoadToMemory { + cfg, err := cl.LoadBackendConfigFileByName(m, options.ModelPath, + config.LoadOptionDebug(options.Debug), + config.LoadOptionThreads(options.Threads), + config.LoadOptionContextSize(options.ContextSize), + config.LoadOptionF16(options.F16), + config.ModelPath(options.ModelPath), + ) + if err != nil { + return nil, nil, nil, err + } + + log.Debug().Msgf("Auto loading model %s into memory from file: %s", m, cfg.Model) + + grpcOpts := backend.GRPCModelOpts(*cfg) + o := []model.Option{ + model.WithModel(cfg.Model), + model.WithAssetDir(options.AssetsDestination), + model.WithThreads(uint32(options.Threads)), + model.WithLoadGRPCLoadModelOpts(grpcOpts), + } + + var backendErr error + if cfg.Backend != "" { + o = append(o, model.WithBackendString(cfg.Backend)) + _, backendErr = ml.BackendLoader(o...) + } else { + _, backendErr = ml.GreedyLoader(o...) + } + if backendErr != nil { + return nil, nil, nil, err + } + } + } + + // Watch the configuration directory + startWatcher(options) + + log.Info().Msg("core/startup process completed!") + return cl, ml, options, nil +} + +func startWatcher(options *config.ApplicationConfig) { + if options.DynamicConfigsDir == "" { + // No need to start the watcher if the directory is not set + return + } + + if _, err := os.Stat(options.DynamicConfigsDir); err != nil { + if os.IsNotExist(err) { + // We try to create the directory if it does not exist and was specified + if err := os.MkdirAll(options.DynamicConfigsDir, 0700); err != nil { + log.Error().Err(err).Msg("failed creating DynamicConfigsDir") + } + } else { + // something else happened, we log the error and don't start the watcher + log.Error().Err(err).Msg("failed to read DynamicConfigsDir, watcher will not be started") + return + } + } + + configHandler := newConfigFileHandler(options) + if err := configHandler.Watch(); err != nil { + log.Error().Err(err).Msg("failed creating watcher") + } +} + +// In Lieu of a proper DI framework, this function wires up the Application manually. +// This is in core/startup rather than core/state.go to keep package references clean! 
+func createApplication(appConfig *config.ApplicationConfig) *core.Application { + app := &core.Application{ + ApplicationConfig: appConfig, + BackendConfigLoader: config.NewBackendConfigLoader(appConfig.ModelPath), + ModelLoader: model.NewModelLoader(appConfig.ModelPath), + } + + var err error + + // app.EmbeddingsBackendService = backend.NewEmbeddingsBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) + // app.ImageGenerationBackendService = backend.NewImageGenerationBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) + // app.LLMBackendService = backend.NewLLMBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) + // app.TranscriptionBackendService = backend.NewTranscriptionBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) + // app.TextToSpeechBackendService = backend.NewTextToSpeechBackendService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) + + app.BackendMonitorService = services.NewBackendMonitorService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig) + app.GalleryService = services.NewGalleryService(app.ApplicationConfig) + // app.OpenAIService = services.NewOpenAIService(app.ModelLoader, app.BackendConfigLoader, app.ApplicationConfig, app.LLMBackendService) + + app.LocalAIMetricsService, err = services.NewLocalAIMetricsService() + if err != nil { + log.Error().Err(err).Msg("encountered an error initializing metrics service, startup will continue but metrics will not be tracked.") + } + + return app +} From 9bd7f3f995c6a9c9d8e4cab49cb1970a70629efc Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 22 Sep 2024 10:04:20 +0200 Subject: [PATCH 061/122] feat(coqui): switch to maintained community fork (#3625) Fixes: https://github.com/mudler/LocalAI/issues/2513 Signed-off-by: Ettore Di Giacinto --- backend/python/coqui/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/coqui/requirements.txt b/backend/python/coqui/requirements.txt index d7708363..2a91f2b9 100644 --- a/backend/python/coqui/requirements.txt +++ b/backend/python/coqui/requirements.txt @@ -1,4 +1,4 @@ -TTS==0.22.0 +coqui-tts grpcio==1.66.1 protobuf certifi \ No newline at end of file From 56f4deb938ee045b2df3b517b7e25c28df252ef5 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 22 Sep 2024 15:19:38 +0200 Subject: [PATCH 062/122] chore(ci): split hipblas jobs Signed-off-by: Ettore Di Giacinto --- .github/workflows/image.yml | 115 ++++++++++++++++++++++-------------- 1 file changed, 72 insertions(+), 43 deletions(-) diff --git a/.github/workflows/image.yml b/.github/workflows/image.yml index 395d7761..f57cf770 100644 --- a/.github/workflows/image.yml +++ b/.github/workflows/image.yml @@ -13,6 +13,78 @@ concurrency: cancel-in-progress: true jobs: + hipblas-jobs: + uses: ./.github/workflows/image_build.yml + with: + tag-latest: ${{ matrix.tag-latest }} + tag-suffix: ${{ matrix.tag-suffix }} + ffmpeg: ${{ matrix.ffmpeg }} + image-type: ${{ matrix.image-type }} + build-type: ${{ matrix.build-type }} + cuda-major-version: ${{ matrix.cuda-major-version }} + cuda-minor-version: ${{ matrix.cuda-minor-version }} + platforms: ${{ matrix.platforms }} + runs-on: ${{ matrix.runs-on }} + base-image: ${{ matrix.base-image }} + grpc-base-image: ${{ matrix.grpc-base-image }} + aio: ${{ matrix.aio }} + makeflags: ${{ matrix.makeflags }} + latest-image: ${{ matrix.latest-image }} + latest-image-aio: ${{ matrix.latest-image-aio }} + 
secrets: + dockerUsername: ${{ secrets.DOCKERHUB_USERNAME }} + dockerPassword: ${{ secrets.DOCKERHUB_PASSWORD }} + quayUsername: ${{ secrets.LOCALAI_REGISTRY_USERNAME }} + quayPassword: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }} + strategy: + # Pushing with all jobs in parallel + # eats the bandwidth of all the nodes + max-parallel: 1 + matrix: + include: + - build-type: 'hipblas' + platforms: 'linux/amd64' + tag-latest: 'auto' + tag-suffix: '-hipblas-ffmpeg' + ffmpeg: 'true' + image-type: 'extras' + aio: "-aio-gpu-hipblas" + base-image: "rocm/dev-ubuntu-22.04:6.1" + grpc-base-image: "ubuntu:22.04" + latest-image: 'latest-gpu-hipblas' + latest-image-aio: 'latest-aio-gpu-hipblas' + runs-on: 'arc-runner-set' + makeflags: "--jobs=3 --output-sync=target" + - build-type: 'hipblas' + platforms: 'linux/amd64' + tag-latest: 'false' + tag-suffix: '-hipblas' + ffmpeg: 'false' + image-type: 'extras' + base-image: "rocm/dev-ubuntu-22.04:6.1" + grpc-base-image: "ubuntu:22.04" + runs-on: 'arc-runner-set' + makeflags: "--jobs=3 --output-sync=target" + - build-type: 'hipblas' + platforms: 'linux/amd64' + tag-latest: 'false' + tag-suffix: '-hipblas-ffmpeg-core' + ffmpeg: 'true' + image-type: 'core' + base-image: "rocm/dev-ubuntu-22.04:6.1" + grpc-base-image: "ubuntu:22.04" + runs-on: 'arc-runner-set' + makeflags: "--jobs=3 --output-sync=target" + - build-type: 'hipblas' + platforms: 'linux/amd64' + tag-latest: 'false' + tag-suffix: '-hipblas-core' + ffmpeg: 'false' + image-type: 'core' + base-image: "rocm/dev-ubuntu-22.04:6.1" + grpc-base-image: "ubuntu:22.04" + runs-on: 'arc-runner-set' + makeflags: "--jobs=3 --output-sync=target" self-hosted-jobs: uses: ./.github/workflows/image_build.yml with: @@ -122,29 +194,6 @@ jobs: base-image: "ubuntu:22.04" runs-on: 'arc-runner-set' makeflags: "--jobs=3 --output-sync=target" - - build-type: 'hipblas' - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-hipblas-ffmpeg' - ffmpeg: 'true' - image-type: 'extras' - aio: "-aio-gpu-hipblas" - base-image: "rocm/dev-ubuntu-22.04:6.1" - grpc-base-image: "ubuntu:22.04" - latest-image: 'latest-gpu-hipblas' - latest-image-aio: 'latest-aio-gpu-hipblas' - runs-on: 'arc-runner-set' - makeflags: "--jobs=3 --output-sync=target" - - build-type: 'hipblas' - platforms: 'linux/amd64' - tag-latest: 'false' - tag-suffix: '-hipblas' - ffmpeg: 'false' - image-type: 'extras' - base-image: "rocm/dev-ubuntu-22.04:6.1" - grpc-base-image: "ubuntu:22.04" - runs-on: 'arc-runner-set' - makeflags: "--jobs=3 --output-sync=target" - build-type: 'sycl_f16' platforms: 'linux/amd64' tag-latest: 'auto' @@ -212,26 +261,6 @@ jobs: image-type: 'core' runs-on: 'arc-runner-set' makeflags: "--jobs=3 --output-sync=target" - - build-type: 'hipblas' - platforms: 'linux/amd64' - tag-latest: 'false' - tag-suffix: '-hipblas-ffmpeg-core' - ffmpeg: 'true' - image-type: 'core' - base-image: "rocm/dev-ubuntu-22.04:6.1" - grpc-base-image: "ubuntu:22.04" - runs-on: 'arc-runner-set' - makeflags: "--jobs=3 --output-sync=target" - - build-type: 'hipblas' - platforms: 'linux/amd64' - tag-latest: 'false' - tag-suffix: '-hipblas-core' - ffmpeg: 'false' - image-type: 'core' - base-image: "rocm/dev-ubuntu-22.04:6.1" - grpc-base-image: "ubuntu:22.04" - runs-on: 'arc-runner-set' - makeflags: "--jobs=3 --output-sync=target" core-image-build: uses: ./.github/workflows/image_build.yml From fd70a22196ffc430e286c14e65497dd22f9d3b63 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 22 Sep 2024 15:21:16 +0200 Subject: [PATCH 063/122] chore(ci): adjust parallel jobs 
Signed-off-by: Ettore Di Giacinto
---
 .github/workflows/image.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/image.yml b/.github/workflows/image.yml
index f57cf770..8709f05c 100644
--- a/.github/workflows/image.yml
+++ b/.github/workflows/image.yml
@@ -111,7 +111,7 @@ jobs:
     strategy:
       # Pushing with all jobs in parallel
       # eats the bandwidth of all the nodes
-      max-parallel: ${{ github.event_name != 'pull_request' && 6 || 10 }}
+      max-parallel: ${{ github.event_name != 'pull_request' && 5 || 8 }}
       matrix:
         include:
           # Extra images

From 4edd8c80b407ea415e4cbede6386f8d17efa8f8f Mon Sep 17 00:00:00 2001
From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com>
Date: Sun, 22 Sep 2024 23:41:34 +0200
Subject: [PATCH 064/122] chore: :arrow_up: Update ggerganov/llama.cpp to
 `c35e586ea57221844442c65a1172498c54971cb0` (#3629)

:arrow_up: Update ggerganov/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 51755e71..fe086645 100644
--- a/Makefile
+++ b/Makefile
@@ -8,7 +8,7 @@ DETECT_LIBS?=true
 # llama.cpp versions
 GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp
 GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be
-CPPLLAMA_VERSION?=d09770cae71b416c032ec143dda530f7413c4038
+CPPLLAMA_VERSION?=c35e586ea57221844442c65a1172498c54971cb0

 # go-rwkv version
 RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp

From 3e8e71f8b68f9ea843f57f5bebb9aad32700e0ac Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Mon, 23 Sep 2024 10:56:10 +0200
Subject: [PATCH 065/122] fix(ci): fixup checksum scanning pipeline (#3631)

Signed-off-by: Ettore Di Giacinto
---
 .github/check_and_update.py | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/.github/check_and_update.py b/.github/check_and_update.py
index dcf1d04a..704b658e 100644
--- a/.github/check_and_update.py
+++ b/.github/check_and_update.py
@@ -29,9 +29,14 @@ def calculate_sha256(file_path):
 def manual_safety_check_hf(repo_id):
     scanResponse = requests.get('https://huggingface.co/api/models/' + repo_id + "/scan")
     scan = scanResponse.json()
-    if scan['hasUnsafeFile']:
-        return scan
-    return None
+    # Check if 'hasUnsafeFile' exists in the response
+    if 'hasUnsafeFile' in scan:
+        if scan['hasUnsafeFile']:
+            return scan
+        else:
+            return None
+    else:
+        return None

 download_type, repo_id_or_url = parse_uri(uri)


From 51cba89682b40ae92737fa47ce6bdbce9ba8cac6 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Mon, 23 Sep 2024 11:49:07 +0200
Subject: [PATCH 066/122] fix(hipblas): do not push all variants to hipblas
 builds (#3630)

Like with CUDA builds, we don't need all the variants when we are
compiling against the accelerated variants - in this way we save space
and we avoid exceeding Go's embed.FS size limits.

Signed-off-by: Ettore Di Giacinto
---
 Dockerfile | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Dockerfile b/Dockerfile
index f08cb9a0..323c3d9a 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -297,10 +297,10 @@ COPY .git .
 RUN make prepare

 ## Build the binary
-## If it's CUDA, we want to skip some of the llama-compat backends to save space
-## We only leave the most CPU-optimized variant and the fallback for the cublas build
-## (both will use CUDA for the actual computation)
-RUN if [ "${BUILD_TYPE}" = "cublas" ]; then \
+## If it's CUDA or hipblas, we want to skip some of the llama-compat backends to save space
+## We only leave the most CPU-optimized variant and the fallback for the cublas/hipblas build
+## (both will use CUDA or hipblas for the actual computation)
+RUN if [ "${BUILD_TYPE}" = "cublas" ] || [ "${BUILD_TYPE}" = "hipblas" ]; then \
         SKIP_GRPC_BACKEND="backend-assets/grpc/llama-cpp-avx backend-assets/grpc/llama-cpp-avx2" make build; \
     else \
         make build; \

From bf8f8671d1b1daae8f1a1f446ab8f6366ddb4396 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Mon, 23 Sep 2024 19:04:36 +0200
Subject: [PATCH 067/122] chore(ci): adjust parallelism

Signed-off-by: Ettore Di Giacinto
---
 .github/workflows/image.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/image.yml b/.github/workflows/image.yml
index 8709f05c..6db8bb07 100644
--- a/.github/workflows/image.yml
+++ b/.github/workflows/image.yml
@@ -39,7 +39,7 @@ jobs:
     strategy:
       # Pushing with all jobs in parallel
       # eats the bandwidth of all the nodes
-      max-parallel: 1
+      max-parallel: 2
       matrix:
         include:
           - build-type: 'hipblas'

From 1da8d8b9db431a62756dd2976d00531b316b0dfa Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Mon, 23 Sep 2024 19:09:51 +0200
Subject: [PATCH 068/122] models(gallery): add nightygurps-14b-v1.1 (#3633)

Signed-off-by: Ettore Di Giacinto
---
 gallery/index.yaml | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/gallery/index.yaml b/gallery/index.yaml
index 7dab9eb7..1b84c403 100644
--- a/gallery/index.yaml
+++ b/gallery/index.yaml
@@ -654,6 +654,22 @@
   - filename: Llama3.1-8B-ShiningValiant2-Q4_K_M.gguf
     sha256: 9369eb97922a9f01e4eae610e3d7aaeca30762d78d9239884179451d60bdbdd2
     uri: huggingface://bartowski/Llama3.1-8B-ShiningValiant2-GGUF/Llama3.1-8B-ShiningValiant2-Q4_K_M.gguf
+- !!merge <<: *llama31
+  name: "nightygurps-14b-v1.1"
+  icon: https://cdn-uploads.huggingface.co/production/uploads/6336c5b3e3ac69e6a90581da/FvfjK7bKqsWdaBkB3eWgP.png
+  urls:
+  - https://huggingface.co/AlexBefest/NightyGurps-14b-v1.1
+  - https://huggingface.co/bartowski/NightyGurps-14b-v1.1-GGUF
+  description: |
+    This model works only in Russian.
+    It is designed to run GURPS roleplaying games, as well as to consult and assist. It was trained on an augmented dataset of the GURPS Basic Set rulebook. Its primary purpose was to serve as a consultant and assistant Game Master for the GURPS roleplaying system, but it can also be used as a GM for running solo games as a player.
+ overrides: + parameters: + model: NightyGurps-14b-v1.1-Q4_K_M.gguf + files: + - filename: NightyGurps-14b-v1.1-Q4_K_M.gguf + sha256: d09d53259ad2c0298150fa8c2db98fe42f11731af89fdc80ad0e255a19adc4b0 + uri: huggingface://bartowski/NightyGurps-14b-v1.1-GGUF/NightyGurps-14b-v1.1-Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "humanish-roleplay-llama-3.1-8b-i1" From 26d99ed1c714652ae118e27768273b5b98e7bbf4 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 23 Sep 2024 19:12:54 +0200 Subject: [PATCH 069/122] models(gallery): add gemma-2-9b-arliai-rpmax-v1.1 (#3634) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 1b84c403..bddd6b16 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -2109,6 +2109,20 @@ - filename: Buddy-2B-v1-Q4_K_M.gguf sha256: 9bd25ed907d1a3c2e07fe09399a9b3aec107d368c29896e2c46facede5b7e3d5 uri: huggingface://bartowski/Buddy-2B-v1-GGUF/Buddy-2B-v1-Q4_K_M.gguf +- !!merge <<: *gemma + name: "gemma-2-9b-arliai-rpmax-v1.1" + urls: + - https://huggingface.co/ArliAI/Gemma-2-9B-ArliAI-RPMax-v1.1 + - https://huggingface.co/bartowski/Gemma-2-9B-ArliAI-RPMax-v1.1-GGUF + description: | + RPMax is a series of models that are trained on a diverse set of curated creative writing and RP datasets with a focus on variety and deduplication. This model is designed to be highly creative and non-repetitive by making sure no two entries in the dataset have repeated characters or situations, which makes sure the model does not latch on to a certain personality and be capable of understanding and acting appropriately to any characters or situations. + overrides: + parameters: + model: Gemma-2-9B-ArliAI-RPMax-v1.1-Q4_K_M.gguf + files: + - filename: Gemma-2-9B-ArliAI-RPMax-v1.1-Q4_K_M.gguf + sha256: 1724aff0ad6f71bf4371d839aca55578f7ec6f030d8d25c0254126088e4c6250 + uri: huggingface://bartowski/Gemma-2-9B-ArliAI-RPMax-v1.1-GGUF/Gemma-2-9B-ArliAI-RPMax-v1.1-Q4_K_M.gguf - &llama3 url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master" icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png From e332ff80660fd3f23ecf67acd2807d22c9cafc85 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 23 Sep 2024 19:16:41 +0200 Subject: [PATCH 070/122] models(gallery): add gemma-2-2b-arliai-rpmax-v1.1 (#3635) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index bddd6b16..f75e448c 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -2123,6 +2123,19 @@ - filename: Gemma-2-9B-ArliAI-RPMax-v1.1-Q4_K_M.gguf sha256: 1724aff0ad6f71bf4371d839aca55578f7ec6f030d8d25c0254126088e4c6250 uri: huggingface://bartowski/Gemma-2-9B-ArliAI-RPMax-v1.1-GGUF/Gemma-2-9B-ArliAI-RPMax-v1.1-Q4_K_M.gguf +- !!merge <<: *gemma + name: "gemma-2-2b-arliai-rpmax-v1.1" + urls: + - https://huggingface.co/bartowski/Gemma-2-2B-ArliAI-RPMax-v1.1-GGUF + description: | + RPMax is a series of models that are trained on a diverse set of curated creative writing and RP datasets with a focus on variety and deduplication. 
This model is designed to be highly creative and non-repetitive by making sure no two entries in the dataset have repeated characters or situations, which makes sure the model does not latch on to a certain personality and be capable of understanding and acting appropriately to any characters or situations. + overrides: + parameters: + model: Gemma-2-2B-ArliAI-RPMax-v1.1-Q4_K_M.gguf + files: + - filename: Gemma-2-2B-ArliAI-RPMax-v1.1-Q4_K_M.gguf + sha256: 89fe35345754d7e9de8d0c0d5bf35b2be9b12a09811b365b712b8b27112f7712 + uri: huggingface://bartowski/Gemma-2-2B-ArliAI-RPMax-v1.1-GGUF/Gemma-2-2B-ArliAI-RPMax-v1.1-Q4_K_M.gguf - &llama3 url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master" icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png From bbdf78615e72a8dfd5e80b9e1db1c804741fb4e5 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 23 Sep 2024 19:24:14 +0200 Subject: [PATCH 071/122] models(gallery): add acolyte-22b-i1 (#3636) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index f75e448c..9b8a0220 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1523,6 +1523,21 @@ - filename: Pantheon-RP-1.6-12b-Nemo-Q4_K_M.gguf sha256: cf3465c183bf4ecbccd1b6b480f687e0160475b04c87e2f1e5ebc8baa0f4c7aa uri: huggingface://bartowski/Pantheon-RP-1.6-12b-Nemo-GGUF/Pantheon-RP-1.6-12b-Nemo-Q4_K_M.gguf +- !!merge <<: *mistral03 + name: "acolyte-22b-i1" + icon: https://cdn-uploads.huggingface.co/production/uploads/6569a4ed2419be6072890cf8/3dcGMcrWK2-2vQh9QBt3o.png + urls: + - https://huggingface.co/rAIfle/Acolyte-22B + - https://huggingface.co/mradermacher/Acolyte-22B-i1-GGUF + description: | + LoRA of a bunch of random datasets on top of Mistral-Small-Instruct-2409, then SLERPed onto base at 0.5. Decent enough for its size. Check the LoRA for dataset info. + overrides: + parameters: + model: Acolyte-22B.i1-Q4_K_M.gguf + files: + - filename: Acolyte-22B.i1-Q4_K_M.gguf + sha256: 5a454405b98b6f886e8e4c695488d8ea098162bb8c46f2a7723fc2553c6e2f6e + uri: huggingface://mradermacher/Acolyte-22B-i1-GGUF/Acolyte-22B.i1-Q4_K_M.gguf - !!merge <<: *mistral03 name: "mn-12b-lyra-v4-iq-imatrix" icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/dVoru83WOpwVjMlgZ_xhA.png From 043cb94436ab44c30f160cc68423aa8915ec800f Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 23 Sep 2024 21:23:21 +0000 Subject: [PATCH 072/122] chore(deps): Bump yarl from 1.11.0 to 1.11.1 in /examples/langchain/langchainpy-localai-example (#3643) chore(deps): Bump yarl Bumps [yarl](https://github.com/aio-libs/yarl) from 1.11.0 to 1.11.1. - [Release notes](https://github.com/aio-libs/yarl/releases) - [Changelog](https://github.com/aio-libs/yarl/blob/master/CHANGES.rst) - [Commits](https://github.com/aio-libs/yarl/compare/v1.11.0...v1.11.1) --- updated-dependencies: - dependency-name: yarl dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 98325db3..3e4133ca 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -30,4 +30,4 @@ tqdm==4.66.5 typing-inspect==0.9.0 typing_extensions==4.12.2 urllib3==2.2.2 -yarl==1.11.0 +yarl==1.11.1 From cc6fac1688e5f700baa1d460106861abc7a1d2f4 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 24 Sep 2024 01:16:39 +0000 Subject: [PATCH 073/122] chore(deps): Bump urllib3 from 2.2.2 to 2.2.3 in /examples/langchain/langchainpy-localai-example (#3646) chore(deps): Bump urllib3 Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.2.2 to 2.2.3. - [Release notes](https://github.com/urllib3/urllib3/releases) - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst) - [Commits](https://github.com/urllib3/urllib3/compare/2.2.2...2.2.3) --- updated-dependencies: - dependency-name: urllib3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 3e4133ca..675429a3 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -29,5 +29,5 @@ tenacity==8.5.0 tqdm==4.66.5 typing-inspect==0.9.0 typing_extensions==4.12.2 -urllib3==2.2.2 +urllib3==2.2.3 yarl==1.11.1 From b8e129f2a6541a23f9c0b595ba12daa7e41a5a18 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 24 Sep 2024 02:53:35 +0000 Subject: [PATCH 074/122] chore(deps): Bump idna from 3.8 to 3.10 in /examples/langchain/langchainpy-localai-example (#3644) chore(deps): Bump idna Bumps [idna](https://github.com/kjd/idna) from 3.8 to 3.10. - [Release notes](https://github.com/kjd/idna/releases) - [Changelog](https://github.com/kjd/idna/blob/master/HISTORY.rst) - [Commits](https://github.com/kjd/idna/compare/v3.8...v3.10) --- updated-dependencies: - dependency-name: idna dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 675429a3..64a43bea 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -9,7 +9,7 @@ dataclasses-json==0.6.7 debugpy==1.8.2 frozenlist==1.4.1 greenlet==3.1.0 -idna==3.8 +idna==3.10 langchain==0.3.0 langchain-community==0.2.16 marshmallow==3.22.0 From c1752cbb831fe9ccb3dd113202884d4f670afbb7 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 24 Sep 2024 04:30:05 +0000 Subject: [PATCH 075/122] chore(deps): Bump sqlalchemy from 2.0.32 to 2.0.35 in /examples/langchain/langchainpy-localai-example (#3649) chore(deps): Bump sqlalchemy Bumps [sqlalchemy](https://github.com/sqlalchemy/sqlalchemy) from 2.0.32 to 2.0.35. - [Release notes](https://github.com/sqlalchemy/sqlalchemy/releases) - [Changelog](https://github.com/sqlalchemy/sqlalchemy/blob/main/CHANGES.rst) - [Commits](https://github.com/sqlalchemy/sqlalchemy/commits) --- updated-dependencies: - dependency-name: sqlalchemy dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 64a43bea..ac147410 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -24,7 +24,7 @@ packaging>=23.2 pydantic==2.8.2 PyYAML==6.0.2 requests==2.32.3 -SQLAlchemy==2.0.32 +SQLAlchemy==2.0.35 tenacity==8.5.0 tqdm==4.66.5 typing-inspect==0.9.0 From 69d2902b0a6e7647e16092118d73f779d80f266e Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Tue, 24 Sep 2024 09:31:28 +0200 Subject: [PATCH 076/122] chore: :arrow_up: Update ggerganov/llama.cpp to `f0c7b5edf82aa200656fd88c11ae3a805d7130bf` (#3653) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index fe086645..578656e5 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=c35e586ea57221844442c65a1172498c54971cb0 +CPPLLAMA_VERSION?=f0c7b5edf82aa200656fd88c11ae3a805d7130bf # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 90cacb9692f3dc374766b0e32f75be8229a47db3 Mon Sep 17 00:00:00 2001 From: Dave Date: Tue, 24 Sep 2024 03:32:48 -0400 Subject: [PATCH 077/122] test: preliminary tests and merge fix for authv2 (#3584) * add api key to existing app tests, add preliminary auth test Signed-off-by: Dave Lee * small fix, run 
test Signed-off-by: Dave Lee * status on non-opaque Signed-off-by: Dave Lee * tweak auth error Signed-off-by: Dave Lee * exp Signed-off-by: Dave Lee * quick fix on real laptop Signed-off-by: Dave Lee * add downloader version that allows providing an auth header Signed-off-by: Dave Lee * stash some devcontainer fixes during testing Signed-off-by: Dave Lee * s2 Signed-off-by: Dave Lee * s Signed-off-by: Dave Lee * done with experiment Signed-off-by: Dave Lee * done with experiment Signed-off-by: Dave Lee * after merge fix Signed-off-by: Dave Lee * rename and fix Signed-off-by: Dave Lee --------- Signed-off-by: Dave Lee Co-authored-by: Ettore Di Giacinto --- .devcontainer-scripts/utils.sh | 2 + Dockerfile | 5 +-- Makefile | 3 ++ core/gallery/gallery.go | 4 +- core/gallery/models.go | 2 +- core/http/app.go | 18 --------- core/http/app_test.go | 69 ++++++++++++++++++++++++++++++---- core/http/middleware/auth.go | 3 +- embedded/embedded.go | 2 +- go.mod | 4 +- pkg/downloader/uri.go | 18 +++++++-- pkg/downloader/uri_test.go | 6 +-- 12 files changed, 95 insertions(+), 41 deletions(-) diff --git a/.devcontainer-scripts/utils.sh b/.devcontainer-scripts/utils.sh index 98ac063c..8416d43d 100644 --- a/.devcontainer-scripts/utils.sh +++ b/.devcontainer-scripts/utils.sh @@ -9,6 +9,7 @@ # Param 2: email # config_user() { + echo "Configuring git for $1 <$2>" local gcn=$(git config --global user.name) if [ -z "${gcn}" ]; then echo "Setting up git user / remote" @@ -24,6 +25,7 @@ config_user() { # Param 2: remote url # config_remote() { + echo "Adding git remote and fetching $2 as $1" local gr=$(git remote -v | grep $1) if [ -z "${gr}" ]; then git remote add $1 $2 diff --git a/Dockerfile b/Dockerfile index 323c3d9a..8c657469 100644 --- a/Dockerfile +++ b/Dockerfile @@ -338,9 +338,8 @@ RUN if [ "${FFMPEG}" = "true" ]; then \ RUN apt-get update && \ apt-get install -y --no-install-recommends \ - ssh less && \ - apt-get clean && \ - rm -rf /var/lib/apt/lists/* + ssh less wget +# For the devcontainer, leave apt functional in case additional devtools are needed at runtime. 
RUN go install github.com/go-delve/delve/cmd/dlv@latest diff --git a/Makefile b/Makefile index 578656e5..7523d5ff 100644 --- a/Makefile +++ b/Makefile @@ -359,6 +359,9 @@ clean-tests: rm -rf test-dir rm -rf core/http/backend-assets +clean-dc: clean + cp -r /build/backend-assets /workspace/backend-assets + ## Build: build: prepare backend-assets grpcs ## Build the project $(info ${GREEN}I local-ai build info:${RESET}) diff --git a/core/gallery/gallery.go b/core/gallery/gallery.go index 6ced6244..3a60e618 100644 --- a/core/gallery/gallery.go +++ b/core/gallery/gallery.go @@ -132,7 +132,7 @@ func AvailableGalleryModels(galleries []config.Gallery, basePath string) ([]*Gal func findGalleryURLFromReferenceURL(url string, basePath string) (string, error) { var refFile string uri := downloader.URI(url) - err := uri.DownloadAndUnmarshal(basePath, func(url string, d []byte) error { + err := uri.DownloadWithCallback(basePath, func(url string, d []byte) error { refFile = string(d) if len(refFile) == 0 { return fmt.Errorf("invalid reference file at url %s: %s", url, d) @@ -156,7 +156,7 @@ func getGalleryModels(gallery config.Gallery, basePath string) ([]*GalleryModel, } uri := downloader.URI(gallery.URL) - err := uri.DownloadAndUnmarshal(basePath, func(url string, d []byte) error { + err := uri.DownloadWithCallback(basePath, func(url string, d []byte) error { return yaml.Unmarshal(d, &models) }) if err != nil { diff --git a/core/gallery/models.go b/core/gallery/models.go index dec6312e..58f1963a 100644 --- a/core/gallery/models.go +++ b/core/gallery/models.go @@ -69,7 +69,7 @@ type PromptTemplate struct { func GetGalleryConfigFromURL(url string, basePath string) (Config, error) { var config Config uri := downloader.URI(url) - err := uri.DownloadAndUnmarshal(basePath, func(url string, d []byte) error { + err := uri.DownloadWithCallback(basePath, func(url string, d []byte) error { return yaml.Unmarshal(d, &config) }) if err != nil { diff --git a/core/http/app.go b/core/http/app.go index fa9cd866..23e97f18 100644 --- a/core/http/app.go +++ b/core/http/app.go @@ -31,24 +31,6 @@ import ( "github.com/rs/zerolog/log" ) -func readAuthHeader(c *fiber.Ctx) string { - authHeader := c.Get("Authorization") - - // elevenlabs - xApiKey := c.Get("xi-api-key") - if xApiKey != "" { - authHeader = "Bearer " + xApiKey - } - - // anthropic - xApiKey = c.Get("x-api-key") - if xApiKey != "" { - authHeader = "Bearer " + xApiKey - } - - return authHeader -} - // Embed a directory // //go:embed static/* diff --git a/core/http/app_test.go b/core/http/app_test.go index 86fe7fdd..bbe52c34 100644 --- a/core/http/app_test.go +++ b/core/http/app_test.go @@ -31,6 +31,9 @@ import ( "github.com/sashabaranov/go-openai/jsonschema" ) +const apiKey = "joshua" +const bearerKey = "Bearer " + apiKey + const testPrompt = `### System: You are an AI assistant that follows instruction extremely well. Help as much as you can. 
@@ -50,11 +53,19 @@ type modelApplyRequest struct { func getModelStatus(url string) (response map[string]interface{}) { // Create the HTTP request - resp, err := http.Get(url) + req, err := http.NewRequest("GET", url, nil) + req.Header.Set("Content-Type", "application/json") + req.Header.Set("Authorization", bearerKey) if err != nil { fmt.Println("Error creating request:", err) return } + client := &http.Client{} + resp, err := client.Do(req) + if err != nil { + fmt.Println("Error sending request:", err) + return + } defer resp.Body.Close() body, err := io.ReadAll(resp.Body) @@ -72,14 +83,15 @@ func getModelStatus(url string) (response map[string]interface{}) { return } -func getModels(url string) (response []gallery.GalleryModel) { +func getModels(url string) ([]gallery.GalleryModel, error) { + response := []gallery.GalleryModel{} uri := downloader.URI(url) // TODO: No tests currently seem to exercise file:// urls. Fix? - uri.DownloadAndUnmarshal("", func(url string, i []byte) error { + err := uri.DownloadWithAuthorizationAndCallback("", bearerKey, func(url string, i []byte) error { // Unmarshal YAML data into a struct return json.Unmarshal(i, &response) }) - return + return response, err } func postModelApplyRequest(url string, request modelApplyRequest) (response map[string]interface{}) { @@ -101,6 +113,7 @@ func postModelApplyRequest(url string, request modelApplyRequest) (response map[ return } req.Header.Set("Content-Type", "application/json") + req.Header.Set("Authorization", bearerKey) // Make the request client := &http.Client{} @@ -140,6 +153,7 @@ func postRequestJSON[B any](url string, bodyJson *B) error { } req.Header.Set("Content-Type", "application/json") + req.Header.Set("Authorization", bearerKey) client := &http.Client{} resp, err := client.Do(req) @@ -175,6 +189,7 @@ func postRequestResponseJSON[B1 any, B2 any](url string, reqJson *B1, respJson * } req.Header.Set("Content-Type", "application/json") + req.Header.Set("Authorization", bearerKey) client := &http.Client{} resp, err := client.Do(req) @@ -195,6 +210,35 @@ func postRequestResponseJSON[B1 any, B2 any](url string, reqJson *B1, respJson * return json.Unmarshal(body, respJson) } +func postInvalidRequest(url string) (error, int) { + + req, err := http.NewRequest("POST", url, bytes.NewBufferString("invalid request")) + if err != nil { + return err, -1 + } + + req.Header.Set("Content-Type", "application/json") + + client := &http.Client{} + resp, err := client.Do(req) + if err != nil { + return err, -1 + } + + defer resp.Body.Close() + + body, err := io.ReadAll(resp.Body) + if err != nil { + return err, -1 + } + + if resp.StatusCode < 200 || resp.StatusCode >= 400 { + return fmt.Errorf("unexpected status code: %d, body: %s", resp.StatusCode, string(body)), resp.StatusCode + } + + return nil, resp.StatusCode +} + //go:embed backend-assets/* var backendAssets embed.FS @@ -260,6 +304,7 @@ var _ = Describe("API test", func() { config.WithContext(c), config.WithGalleries(galleries), config.WithModelPath(modelDir), + config.WithApiKeys([]string{apiKey}), config.WithBackendAssets(backendAssets), config.WithBackendAssetsOutput(backendAssetsDir))...) 
Expect(err).ToNot(HaveOccurred()) @@ -269,7 +314,7 @@ var _ = Describe("API test", func() { go app.Listen("127.0.0.1:9090") - defaultConfig := openai.DefaultConfig("") + defaultConfig := openai.DefaultConfig(apiKey) defaultConfig.BaseURL = "http://127.0.0.1:9090/v1" client2 = openaigo.NewClient("") @@ -295,10 +340,19 @@ var _ = Describe("API test", func() { Expect(err).To(HaveOccurred()) }) + Context("Auth Tests", func() { + It("Should fail if the api key is missing", func() { + err, sc := postInvalidRequest("http://127.0.0.1:9090/models/available") + Expect(err).ToNot(BeNil()) + Expect(sc).To(Equal(403)) + }) + }) + Context("Applying models", func() { It("applies models from a gallery", func() { - models := getModels("http://127.0.0.1:9090/models/available") + models, err := getModels("http://127.0.0.1:9090/models/available") + Expect(err).To(BeNil()) Expect(len(models)).To(Equal(2), fmt.Sprint(models)) Expect(models[0].Installed).To(BeFalse(), fmt.Sprint(models)) Expect(models[1].Installed).To(BeFalse(), fmt.Sprint(models)) @@ -331,7 +385,8 @@ var _ = Describe("API test", func() { Expect(content["backend"]).To(Equal("bert-embeddings")) Expect(content["foo"]).To(Equal("bar")) - models = getModels("http://127.0.0.1:9090/models/available") + models, err = getModels("http://127.0.0.1:9090/models/available") + Expect(err).To(BeNil()) Expect(len(models)).To(Equal(2), fmt.Sprint(models)) Expect(models[0].Name).To(Or(Equal("bert"), Equal("bert2"))) Expect(models[1].Name).To(Or(Equal("bert"), Equal("bert2"))) diff --git a/core/http/middleware/auth.go b/core/http/middleware/auth.go index bc8bcf80..d2152e9b 100644 --- a/core/http/middleware/auth.go +++ b/core/http/middleware/auth.go @@ -38,6 +38,7 @@ func getApiKeyErrorHandler(applicationConfig *config.ApplicationConfig) fiber.Er if applicationConfig.OpaqueErrors { return ctx.SendStatus(403) } + return ctx.Status(403).SendString(err.Error()) } if applicationConfig.OpaqueErrors { return ctx.SendStatus(500) @@ -90,4 +91,4 @@ func getApiKeyRequiredFilterFunction(applicationConfig *config.ApplicationConfig } } return func(c *fiber.Ctx) bool { return false } -} \ No newline at end of file +} diff --git a/embedded/embedded.go b/embedded/embedded.go index 672c32ed..3a4ea262 100644 --- a/embedded/embedded.go +++ b/embedded/embedded.go @@ -39,7 +39,7 @@ func init() { func GetRemoteLibraryShorteners(url string, basePath string) (map[string]string, error) { remoteLibrary := map[string]string{} uri := downloader.URI(url) - err := uri.DownloadAndUnmarshal(basePath, func(_ string, i []byte) error { + err := uri.DownloadWithCallback(basePath, func(_ string, i []byte) error { return yaml.Unmarshal(i, &remoteLibrary) }) if err != nil { diff --git a/go.mod b/go.mod index a3359abf..dd8fce9f 100644 --- a/go.mod +++ b/go.mod @@ -1,8 +1,8 @@ module github.com/mudler/LocalAI -go 1.22.0 +go 1.23 -toolchain go1.22.4 +toolchain go1.23.1 require ( dario.cat/mergo v1.0.0 diff --git a/pkg/downloader/uri.go b/pkg/downloader/uri.go index 7fedd646..9acbb621 100644 --- a/pkg/downloader/uri.go +++ b/pkg/downloader/uri.go @@ -31,7 +31,11 @@ const ( type URI string -func (uri URI) DownloadAndUnmarshal(basePath string, f func(url string, i []byte) error) error { +func (uri URI) DownloadWithCallback(basePath string, f func(url string, i []byte) error) error { + return uri.DownloadWithAuthorizationAndCallback(basePath, "", f) +} + +func (uri URI) DownloadWithAuthorizationAndCallback(basePath string, authorization string, f func(url string, i []byte) error) error { url := 
uri.ResolveURL() if strings.HasPrefix(url, LocalPrefix) { @@ -41,7 +45,6 @@ func (uri URI) DownloadAndUnmarshal(basePath string, f func(url string, i []byte if err != nil { return err } - // ??? resolvedBasePath, err := filepath.EvalSymlinks(basePath) if err != nil { return err @@ -63,7 +66,16 @@ func (uri URI) DownloadAndUnmarshal(basePath string, f func(url string, i []byte } // Send a GET request to the URL - response, err := http.Get(url) + + req, err := http.NewRequest("GET", url, nil) + if err != nil { + return err + } + if authorization != "" { + req.Header.Add("Authorization", authorization) + } + + response, err := http.DefaultClient.Do(req) if err != nil { return err } diff --git a/pkg/downloader/uri_test.go b/pkg/downloader/uri_test.go index 21a093a9..3b7a80b3 100644 --- a/pkg/downloader/uri_test.go +++ b/pkg/downloader/uri_test.go @@ -11,7 +11,7 @@ var _ = Describe("Gallery API tests", func() { It("parses github with a branch", func() { uri := URI("github:go-skynet/model-gallery/gpt4all-j.yaml") Expect( - uri.DownloadAndUnmarshal("", func(url string, i []byte) error { + uri.DownloadWithCallback("", func(url string, i []byte) error { Expect(url).To(Equal("https://raw.githubusercontent.com/go-skynet/model-gallery/main/gpt4all-j.yaml")) return nil }), @@ -21,7 +21,7 @@ var _ = Describe("Gallery API tests", func() { uri := URI("github:go-skynet/model-gallery/gpt4all-j.yaml@main") Expect( - uri.DownloadAndUnmarshal("", func(url string, i []byte) error { + uri.DownloadWithCallback("", func(url string, i []byte) error { Expect(url).To(Equal("https://raw.githubusercontent.com/go-skynet/model-gallery/main/gpt4all-j.yaml")) return nil }), @@ -30,7 +30,7 @@ var _ = Describe("Gallery API tests", func() { It("parses github with urls", func() { uri := URI("https://raw.githubusercontent.com/go-skynet/model-gallery/main/gpt4all-j.yaml") Expect( - uri.DownloadAndUnmarshal("", func(url string, i []byte) error { + uri.DownloadWithCallback("", func(url string, i []byte) error { Expect(url).To(Equal("https://raw.githubusercontent.com/go-skynet/model-gallery/main/gpt4all-j.yaml")) return nil }), From 0893d3cbbebc6f7c5fa1d65e4b17e7d900ae60d4 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 24 Sep 2024 20:25:59 +0200 Subject: [PATCH 078/122] fix(health): do not require auth for /healthz and /readyz (#3656) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * fix(health): do not require auth for /healthz and /readyz Fixes: #3655 Signed-off-by: Ettore Di Giacinto * Comment so I don’t forget Adding a reminder here... 
--------- Signed-off-by: Ettore Di Giacinto Co-authored-by: Dave --- core/http/app.go | 3 +++ core/http/routes/health.go | 13 +++++++++++++ core/http/routes/localai.go | 8 -------- 3 files changed, 16 insertions(+), 8 deletions(-) create mode 100644 core/http/routes/health.go diff --git a/core/http/app.go b/core/http/app.go index 23e97f18..2cf0ad17 100644 --- a/core/http/app.go +++ b/core/http/app.go @@ -121,6 +121,9 @@ func App(cl *config.BackendConfigLoader, ml *model.ModelLoader, appConfig *confi }) } + // Health Checks should always be exempt from auth, so register these first + routes.HealthRoutes(app) + kaConfig, err := middleware.GetKeyAuthConfig(appConfig) if err != nil || kaConfig == nil { return nil, fmt.Errorf("failed to create key auth config: %w", err) diff --git a/core/http/routes/health.go b/core/http/routes/health.go new file mode 100644 index 00000000..f5a08e9b --- /dev/null +++ b/core/http/routes/health.go @@ -0,0 +1,13 @@ +package routes + +import "github.com/gofiber/fiber/v2" + +func HealthRoutes(app *fiber.App) { + // Service health checks + ok := func(c *fiber.Ctx) error { + return c.SendStatus(200) + } + + app.Get("/healthz", ok) + app.Get("/readyz", ok) +} diff --git a/core/http/routes/localai.go b/core/http/routes/localai.go index 247596c0..2f65e779 100644 --- a/core/http/routes/localai.go +++ b/core/http/routes/localai.go @@ -42,14 +42,6 @@ func RegisterLocalAIRoutes(app *fiber.App, app.Post("/stores/get", localai.StoresGetEndpoint(sl, appConfig)) app.Post("/stores/find", localai.StoresFindEndpoint(sl, appConfig)) - // Kubernetes health checks - ok := func(c *fiber.Ctx) error { - return c.SendStatus(200) - } - - app.Get("/healthz", ok) - app.Get("/readyz", ok) - app.Get("/metrics", localai.LocalAIMetricsEndpoint()) // Experimental Backend Statistics Module From 6555994060662db4c3600c0d51b18e10f5cac890 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 24 Sep 2024 21:22:08 +0000 Subject: [PATCH 079/122] chore(deps): Bump sentence-transformers from 3.1.0 to 3.1.1 in /backend/python/sentencetransformers (#3651) chore(deps): Bump sentence-transformers Bumps [sentence-transformers](https://github.com/UKPLab/sentence-transformers) from 3.1.0 to 3.1.1. - [Release notes](https://github.com/UKPLab/sentence-transformers/releases) - [Commits](https://github.com/UKPLab/sentence-transformers/compare/v3.1.0...v3.1.1) --- updated-dependencies: - dependency-name: sentence-transformers dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/sentencetransformers/requirements-cpu.txt | 2 +- backend/python/sentencetransformers/requirements-cublas11.txt | 2 +- backend/python/sentencetransformers/requirements-cublas12.txt | 2 +- backend/python/sentencetransformers/requirements-hipblas.txt | 2 +- backend/python/sentencetransformers/requirements-intel.txt | 2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/backend/python/sentencetransformers/requirements-cpu.txt b/backend/python/sentencetransformers/requirements-cpu.txt index f88de1e4..0fd8f35e 100644 --- a/backend/python/sentencetransformers/requirements-cpu.txt +++ b/backend/python/sentencetransformers/requirements-cpu.txt @@ -2,5 +2,5 @@ torch accelerate transformers bitsandbytes -sentence-transformers==3.1.0 +sentence-transformers==3.1.1 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-cublas11.txt b/backend/python/sentencetransformers/requirements-cublas11.txt index 57caf1a1..92a10b16 100644 --- a/backend/python/sentencetransformers/requirements-cublas11.txt +++ b/backend/python/sentencetransformers/requirements-cublas11.txt @@ -1,5 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/cu118 torch accelerate -sentence-transformers==3.1.0 +sentence-transformers==3.1.1 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-cublas12.txt b/backend/python/sentencetransformers/requirements-cublas12.txt index 834fa6a4..f68bb1b9 100644 --- a/backend/python/sentencetransformers/requirements-cublas12.txt +++ b/backend/python/sentencetransformers/requirements-cublas12.txt @@ -1,4 +1,4 @@ torch accelerate -sentence-transformers==3.1.0 +sentence-transformers==3.1.1 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-hipblas.txt b/backend/python/sentencetransformers/requirements-hipblas.txt index 98a0a41b..920eb855 100644 --- a/backend/python/sentencetransformers/requirements-hipblas.txt +++ b/backend/python/sentencetransformers/requirements-hipblas.txt @@ -1,5 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 torch accelerate -sentence-transformers==3.1.0 +sentence-transformers==3.1.1 transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-intel.txt b/backend/python/sentencetransformers/requirements-intel.txt index 5948910d..6ae4bdd4 100644 --- a/backend/python/sentencetransformers/requirements-intel.txt +++ b/backend/python/sentencetransformers/requirements-intel.txt @@ -4,5 +4,5 @@ torch optimum[openvino] setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 accelerate -sentence-transformers==3.1.0 +sentence-transformers==3.1.1 transformers \ No newline at end of file From c54cfd3609489c648859736b3038a322339a8bfd Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 24 Sep 2024 22:59:11 +0000 Subject: [PATCH 080/122] chore(deps): Bump pydantic from 2.8.2 to 2.9.2 in /examples/langchain/langchainpy-localai-example (#3648) chore(deps): Bump pydantic Bumps [pydantic](https://github.com/pydantic/pydantic) from 2.8.2 to 2.9.2. 
- [Release notes](https://github.com/pydantic/pydantic/releases) - [Changelog](https://github.com/pydantic/pydantic/blob/main/HISTORY.md) - [Commits](https://github.com/pydantic/pydantic/compare/v2.8.2...v2.9.2) --- updated-dependencies: - dependency-name: pydantic dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index ac147410..179abc2a 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -21,7 +21,7 @@ numpy==2.1.1 openai==1.45.1 openapi-schema-pydantic==1.2.4 packaging>=23.2 -pydantic==2.8.2 +pydantic==2.9.2 PyYAML==6.0.2 requests==2.32.3 SQLAlchemy==2.0.35 From 0d784f46e55e39fb988c171c32ef664c9ff2801c Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Wed, 25 Sep 2024 01:15:53 +0000 Subject: [PATCH 081/122] chore(deps): Bump openai from 1.45.1 to 1.47.1 in /examples/functions (#3645) Bumps [openai](https://github.com/openai/openai-python) from 1.45.1 to 1.47.1. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.45.1...v1.47.1) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/functions/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/functions/requirements.txt b/examples/functions/requirements.txt index 670090d3..c3ffad01 100644 --- a/examples/functions/requirements.txt +++ b/examples/functions/requirements.txt @@ -1,2 +1,2 @@ langchain==0.3.0 -openai==1.45.1 +openai==1.47.1 From aa87eff28330a65818842884515ca1806165c209 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Wed, 25 Sep 2024 06:51:20 +0200 Subject: [PATCH 082/122] chore: :arrow_up: Update ggerganov/llama.cpp to `70392f1f81470607ba3afef04aa56c9f65587664` (#3659) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Dave --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 7523d5ff..6865f5a1 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=f0c7b5edf82aa200656fd88c11ae3a805d7130bf +CPPLLAMA_VERSION?=70392f1f81470607ba3afef04aa56c9f65587664 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From a370a11115879a9e02410f55136f563391976254 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Wed, 25 Sep 2024 08:47:03 +0200 Subject: [PATCH 083/122] docs: :arrow_up: update docs version mudler/LocalAI (#3657) :arrow_up: Update docs version mudler/LocalAI Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- docs/data/version.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/data/version.json b/docs/data/version.json index dc128c66..0dba0428 100644 --- a/docs/data/version.json +++ b/docs/data/version.json @@ -1,3 +1,3 @@ { - "version": "v2.20.1" + "version": "v2.21.0" } From 1b8a77433a88ce1a56d364b4dc81d9030f4e2830 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Wed, 25 Sep 2024 08:47:33 +0200 Subject: [PATCH 084/122] chore(deps): Bump llama-index from 0.11.7 to 0.11.12 in /examples/langchain-chroma (#3639) chore(deps): Bump llama-index in /examples/langchain-chroma Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.11.7 to 0.11.12. - [Release notes](https://github.com/run-llama/llama_index/releases) - [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md) - [Commits](https://github.com/run-llama/llama_index/compare/v0.11.7...v0.11.12) --- updated-dependencies: - dependency-name: llama-index dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 4884d4aa..3f7bec69 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.3.0 openai==1.45.1 chromadb==0.5.5 -llama-index==0.11.7 \ No newline at end of file +llama-index==0.11.12 \ No newline at end of file From 8002ad27cb7b67f8489a5f3cda66437acf2aac74 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Wed, 25 Sep 2024 08:47:57 +0200 Subject: [PATCH 085/122] chore(deps): Bump openai from 1.45.1 to 1.47.1 in /examples/langchain-chroma (#3641) chore(deps): Bump openai in /examples/langchain-chroma Bumps [openai](https://github.com/openai/openai-python) from 1.45.1 to 1.47.1. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.45.1...v1.47.1) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 3f7bec69..0c77892d 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.3.0 -openai==1.45.1 +openai==1.47.1 chromadb==0.5.5 llama-index==0.11.12 \ No newline at end of file From 8c4f720fb578b3156c333448f298a55845857c58 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Wed, 25 Sep 2024 08:48:13 +0200 Subject: [PATCH 086/122] chore(deps): Bump llama-index from 0.11.9 to 0.11.12 in /examples/chainlit (#3642) chore(deps): Bump llama-index in /examples/chainlit Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.11.9 to 0.11.12. - [Release notes](https://github.com/run-llama/llama_index/releases) - [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md) - [Commits](https://github.com/run-llama/llama_index/compare/v0.11.9...v0.11.12) --- updated-dependencies: - dependency-name: llama-index dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/chainlit/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/chainlit/requirements.txt b/examples/chainlit/requirements.txt index 1fe9356a..92eb113e 100644 --- a/examples/chainlit/requirements.txt +++ b/examples/chainlit/requirements.txt @@ -1,4 +1,4 @@ -llama_index==0.11.9 +llama_index==0.11.12 requests==2.32.3 weaviate_client==4.8.1 transformers From 74408bdc77e9f9d21a56699de09940fcaaf1a4eb Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Wed, 25 Sep 2024 10:54:37 +0200 Subject: [PATCH 087/122] chore: :arrow_up: Update ggerganov/whisper.cpp to `0d2e2aed80109e8696791083bde3b58e190b7812` (#3658) :arrow_up: Update ggerganov/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Dave Co-authored-by: Ettore Di Giacinto --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 6865f5a1..121b8e50 100644 --- a/Makefile +++ b/Makefile @@ -16,7 +16,7 @@ RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 # whisper.cpp version WHISPER_REPO?=https://github.com/ggerganov/whisper.cpp -WHISPER_CPP_VERSION?=34972dbe221709323714fc8402f2e24041d48213 +WHISPER_CPP_VERSION?=0d2e2aed80109e8696791083bde3b58e190b7812 # bert.cpp version BERT_REPO?=https://github.com/go-skynet/go-bert.cpp From 33b2d38dd0198d78dbc26aa020acfb6ff4c4048c Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Wed, 25 Sep 2024 12:44:32 +0200 Subject: [PATCH 088/122] chore(deps): Bump chromadb from 0.5.5 to 0.5.7 in /examples/langchain-chroma (#3640) chore(deps): Bump chromadb in /examples/langchain-chroma Bumps [chromadb](https://github.com/chroma-core/chroma) from 0.5.5 to 0.5.7. - [Release notes](https://github.com/chroma-core/chroma/releases) - [Changelog](https://github.com/chroma-core/chroma/blob/main/RELEASE_PROCESS.md) - [Commits](https://github.com/chroma-core/chroma/compare/0.5.5...0.5.7) --- updated-dependencies: - dependency-name: chromadb dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 0c77892d..19929482 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.3.0 openai==1.47.1 -chromadb==0.5.5 +chromadb==0.5.7 llama-index==0.11.12 \ No newline at end of file From a3d69872e35e152f29f7888fa9c56b0a797e9723 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 25 Sep 2024 18:00:23 +0200 Subject: [PATCH 089/122] feat(api): list loaded models in `/system` (#3661) feat(api): list loaded models in /system Signed-off-by: Ettore Di Giacinto --- core/http/endpoints/localai/system.go | 2 ++ core/schema/localai.go | 4 +++- pkg/model/initializers.go | 7 +++---- pkg/model/loader.go | 6 +++--- pkg/model/loader_test.go | 4 ++-- pkg/model/model.go | 4 +++- 6 files changed, 16 insertions(+), 11 deletions(-) diff --git a/core/http/endpoints/localai/system.go b/core/http/endpoints/localai/system.go index 11704933..23a725e3 100644 --- a/core/http/endpoints/localai/system.go +++ b/core/http/endpoints/localai/system.go @@ -17,12 +17,14 @@ func SystemInformations(ml *model.ModelLoader, appConfig *config.ApplicationConf if err != nil { return err } + loadedModels := ml.ListModels() for b := range appConfig.ExternalGRPCBackends { availableBackends = append(availableBackends, b) } return c.JSON( schema.SystemInformationResponse{ Backends: availableBackends, + Models: loadedModels, }, ) } diff --git a/core/schema/localai.go b/core/schema/localai.go index 9070c2be..75fa40c7 100644 --- a/core/schema/localai.go +++ b/core/schema/localai.go @@ -2,6 +2,7 @@ package schema import ( "github.com/mudler/LocalAI/core/p2p" + "github.com/mudler/LocalAI/pkg/model" gopsutil "github.com/shirou/gopsutil/v3/process" ) @@ -72,5 +73,6 @@ type P2PNodesResponse struct { } type SystemInformationResponse struct { - Backends []string `json:"backends"` + Backends []string `json:"backends"` + Models []model.Model `json:"loaded_models"` } diff --git a/pkg/model/initializers.go b/pkg/model/initializers.go index 7099bf33..80dd10b4 100644 --- a/pkg/model/initializers.go +++ b/pkg/model/initializers.go @@ -311,11 +311,11 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string log.Debug().Msgf("GRPC Service Started") - client = NewModel(serverAddress) + client = NewModel(modelName, serverAddress) } else { log.Debug().Msg("external backend is uri") // address - client = NewModel(uri) + client = NewModel(modelName, uri) } } else { grpcProcess := backendPath(o.assetDir, backend) @@ -352,7 +352,7 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string log.Debug().Msgf("GRPC Service Started") - client = NewModel(serverAddress) + client = NewModel(modelName, serverAddress) } log.Debug().Msgf("Wait for the service to start up") @@ -419,7 +419,6 @@ func (ml *ModelLoader) BackendLoader(opts ...Option) (client grpc.Backend, err e err := ml.StopGRPC(allExcept(o.model)) if err != nil { log.Error().Err(err).Str("keptModel", o.model).Msg("error while shutting down all backends except for the keptModel") - return nil, err } } diff --git a/pkg/model/loader.go b/pkg/model/loader.go index f70d2cea..4f1ec841 100644 --- a/pkg/model/loader.go +++ b/pkg/model/loader.go @@ -105,13 +105,13 @@ FILE: 
return models, nil } -func (ml *ModelLoader) ListModels() []*Model { +func (ml *ModelLoader) ListModels() []Model { ml.mu.Lock() defer ml.mu.Unlock() - models := []*Model{} + models := []Model{} for _, model := range ml.models { - models = append(models, model) + models = append(models, *model) } return models diff --git a/pkg/model/loader_test.go b/pkg/model/loader_test.go index 4621844e..c16a6e50 100644 --- a/pkg/model/loader_test.go +++ b/pkg/model/loader_test.go @@ -63,7 +63,7 @@ var _ = Describe("ModelLoader", func() { Context("LoadModel", func() { It("should load a model and keep it in memory", func() { - mockModel = model.NewModel("test.model") + mockModel = model.NewModel("foo", "test.model") mockLoader := func(modelName, modelFile string) (*model.Model, error) { return mockModel, nil @@ -88,7 +88,7 @@ var _ = Describe("ModelLoader", func() { Context("ShutdownModel", func() { It("should shutdown a loaded model", func() { - mockModel = model.NewModel("test.model") + mockModel = model.NewModel("foo", "test.model") mockLoader := func(modelName, modelFile string) (*model.Model, error) { return mockModel, nil diff --git a/pkg/model/model.go b/pkg/model/model.go index 1927dc0c..6cb81d10 100644 --- a/pkg/model/model.go +++ b/pkg/model/model.go @@ -3,12 +3,14 @@ package model import grpc "github.com/mudler/LocalAI/pkg/grpc" type Model struct { + ID string `json:"id"` address string client grpc.Backend } -func NewModel(address string) *Model { +func NewModel(ID, address string) *Model { return &Model{ + ID: ID, address: address, } } From ef1507d000f2308f395a341d6c497de70427f1a5 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Thu, 26 Sep 2024 10:50:20 +0200 Subject: [PATCH 090/122] docs: :arrow_up: update docs version mudler/LocalAI (#3665) :arrow_up: Update docs version mudler/LocalAI Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- docs/data/version.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/data/version.json b/docs/data/version.json index 0dba0428..470991b8 100644 --- a/docs/data/version.json +++ b/docs/data/version.json @@ -1,3 +1,3 @@ { - "version": "v2.21.0" + "version": "v2.21.1" } From d6522e69ca0f972b2d0d8f617b1cc131ac5026c6 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Thu, 26 Sep 2024 10:57:40 +0200 Subject: [PATCH 091/122] feat(swagger): update swagger (#3664) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- swagger/docs.go | 14 ++++++++++++++ swagger/swagger.json | 14 ++++++++++++++ swagger/swagger.yaml | 9 +++++++++ 3 files changed, 37 insertions(+) diff --git a/swagger/docs.go b/swagger/docs.go index ffb2ba03..c283dcb0 100644 --- a/swagger/docs.go +++ b/swagger/docs.go @@ -972,6 +972,14 @@ const docTemplate = `{ } } }, + "model.Model": { + "type": "object", + "properties": { + "id": { + "type": "string" + } + } + }, "openai.Assistant": { "type": "object", "properties": { @@ -1682,6 +1690,12 @@ const docTemplate = `{ "items": { "type": "string" } + }, + "loaded_models": { + "type": "array", + "items": { + "$ref": "#/definitions/model.Model" + } } } }, diff --git a/swagger/swagger.json b/swagger/swagger.json index e3aebe43..0a3be179 100644 --- a/swagger/swagger.json +++ b/swagger/swagger.json @@ -965,6 +965,14 @@ } } }, 
+ "model.Model": { + "type": "object", + "properties": { + "id": { + "type": "string" + } + } + }, "openai.Assistant": { "type": "object", "properties": { @@ -1675,6 +1683,12 @@ "items": { "type": "string" } + }, + "loaded_models": { + "type": "array", + "items": { + "$ref": "#/definitions/model.Model" + } } } }, diff --git a/swagger/swagger.yaml b/swagger/swagger.yaml index 649b86e4..7b6619b4 100644 --- a/swagger/swagger.yaml +++ b/swagger/swagger.yaml @@ -168,6 +168,11 @@ definitions: type: string type: array type: object + model.Model: + properties: + id: + type: string + type: object openai.Assistant: properties: created: @@ -652,6 +657,10 @@ definitions: items: type: string type: array + loaded_models: + items: + $ref: '#/definitions/model.Model' + type: array type: object schema.TTSRequest: description: TTS request body From 3d12d2037c83f9d5d3ae832e97311b29547532e1 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 26 Sep 2024 11:19:26 +0200 Subject: [PATCH 092/122] models(gallery): add llama-3.2 3B and 1B (#3671) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 60 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 60 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 9b8a0220..de38c3d5 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1,4 +1,64 @@ --- +## llama3.2 +- &llama32 + url: "github:mudler/LocalAI/gallery/llama3.1-instruct.yaml@master" + icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png + license: llama3.2 + description: | + The Meta Llama 3.2 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. They outperform many of the available open source and closed chat models on common industry benchmarks. + + Model Developer: Meta + + Model Architecture: Llama 3.2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. 
+ tags:
+ - llm
+ - gguf
+ - gpu
+ - cpu
+ - llama3.2
+ name: "llama-3.2-1b-instruct:q4_k_m"
+ urls:
+ - https://huggingface.co/hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF
+ overrides:
+ parameters:
+ model: llama-3.2-1b-instruct-q4_k_m.gguf
+ files:
+ - filename: llama-3.2-1b-instruct-q4_k_m.gguf
+ sha256: 1d0e9419ec4e12aef73ccf4ffd122703e94c48344a96bc7c5f0f2772c2152ce3
+ uri: huggingface://hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF/llama-3.2-1b-instruct-q4_k_m.gguf
+- !!merge <<: *llama32
+ name: "llama-3.2-3b-instruct:q4_k_m"
+ urls:
+ - https://huggingface.co/hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF
+ overrides:
+ parameters:
+ model: llama-3.2-3b-instruct-q4_k_m.gguf
+ files:
+ - filename: llama-3.2-3b-instruct-q4_k_m.gguf
+ sha256: c55a83bfb6396799337853ca69918a0b9bbb2917621078c34570bc17d20fd7a1
+ uri: huggingface://hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF/llama-3.2-3b-instruct-q4_k_m.gguf
+- !!merge <<: *llama32
+ name: "llama-3.2-3b-instruct:q8_0"
+ urls:
+ - https://huggingface.co/hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF
+ overrides:
+ parameters:
+ model: llama-3.2-3b-instruct-q8_0.gguf
+ files:
+ - filename: llama-3.2-3b-instruct-q8_0.gguf
+ sha256: 51725f77f997a5080c3d8dd66e073da22ddf48ab5264f21f05ded9b202c3680e
+ uri: huggingface://hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF/llama-3.2-3b-instruct-q8_0.gguf
+- !!merge <<: *llama32
+ name: "llama-3.2-1b-instruct:q8_0"
+ urls:
+ - https://huggingface.co/hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF
+ overrides:
+ parameters:
+ model: llama-3.2-1b-instruct-q8_0.gguf
+ files:
+ - filename: llama-3.2-1b-instruct-q8_0.gguf
+ sha256: ba345c83bf5cc679c653b853c46517eea5a34f03ed2205449db77184d9ae62a9
+ uri: huggingface://hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF/llama-3.2-1b-instruct-q8_0.gguf
 ## Qwen2.5
 - &qwen25
   name: "qwen2.5-14b-instruct"

From fa5c98549aae32df63a9c3e34574701e45287d29 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Thu, 26 Sep 2024 12:44:55 +0200
Subject: [PATCH 093/122] chore(refactor): track grpcProcess in the model structure (#3663)

* chore(refactor): track grpcProcess in the model structure

This avoids having to handle the data for the same model in two separate
places. It makes the model easier to track and to guard with a mutex.
This also fixes race conditions when accessing the model.
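For illustration, a condensed Go sketch of the ownership model this refactor
introduces (a simplified sketch, not the exact upstream code: names and fields
are trimmed from the real pkg/model types shown in the diff below, and the
logging, watchdog, and template handling are omitted):

package model

import (
	"fmt"
	"sync"

	process "github.com/mudler/go-processmanager"
)

// Model now carries its own gRPC process, so the process can no longer
// drift out of sync with the model entry kept in a separate map.
type Model struct {
	ID      string `json:"id"`
	address string
	process *process.Process
	sync.Mutex // in the real code, guards lazy gRPC client creation
}

// ModelLoader keeps a single map; one mutex protects both the model
// and its process, instead of two independently mutated maps.
type ModelLoader struct {
	mu     sync.Mutex
	models map[string]*Model
}

func (ml *ModelLoader) deleteProcess(name string) error {
	ml.mu.Lock()
	defer ml.mu.Unlock()
	m, ok := ml.models[name]
	if !ok {
		return fmt.Errorf("model %s not found", name)
	}
	// Stopping the process and dropping the map entry happen under the
	// same lock, which is what removes the race the commit describes.
	if p := m.process; p != nil {
		if err := p.Stop(); err != nil {
			return err
		}
	}
	delete(ml.models, name)
	return nil
}

Because startProcess now returns the *process.Process instead of storing it in
a loader-level map, the handle is attached to the Model at construction time
and every consumer reaches it through the same struct.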
Signed-off-by: Ettore Di Giacinto * chore(tests): run protogen-go before starting aio tests Signed-off-by: Ettore Di Giacinto * chore(tests): install protoc in aio tests Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: Ettore Di Giacinto --- .github/workflows/test.yml | 11 ++++++++++- Makefile | 2 +- pkg/model/initializers.go | 15 ++++++++++----- pkg/model/loader.go | 32 ++++++++++++++------------------ pkg/model/loader_test.go | 4 ++-- pkg/model/model.go | 18 ++++++++++++++++-- pkg/model/process.go | 33 ++++++++++++++++++--------------- 7 files changed, 71 insertions(+), 44 deletions(-) diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index 2af3fd00..b62f86ef 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -178,13 +178,22 @@ jobs: uses: actions/checkout@v4 with: submodules: true + - name: Dependencies + run: | + # Install protoc + curl -L -s https://github.com/protocolbuffers/protobuf/releases/download/v26.1/protoc-26.1-linux-x86_64.zip -o protoc.zip && \ + unzip -j -d /usr/local/bin protoc.zip bin/protoc && \ + rm protoc.zip + go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.2 + go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@1958fcbe2ca8bd93af633f11e97d44e567e945af + PATH="$PATH:$HOME/go/bin" make protogen-go - name: Build images run: | docker build --build-arg FFMPEG=true --build-arg IMAGE_TYPE=extras --build-arg EXTRA_BACKENDS=rerankers --build-arg MAKEFLAGS="--jobs=5 --output-sync=target" -t local-ai:tests -f Dockerfile . BASE_IMAGE=local-ai:tests DOCKER_AIO_IMAGE=local-ai-aio:test make docker-aio - name: Test run: | - LOCALAI_MODELS_DIR=$PWD/models LOCALAI_IMAGE_TAG=test LOCALAI_IMAGE=local-ai-aio \ + PATH="$PATH:$HOME/go/bin" LOCALAI_MODELS_DIR=$PWD/models LOCALAI_IMAGE_TAG=test LOCALAI_IMAGE=local-ai-aio \ make run-e2e-aio - name: Setup tmate session if tests fail if: ${{ failure() }} diff --git a/Makefile b/Makefile index 121b8e50..4efee986 100644 --- a/Makefile +++ b/Makefile @@ -468,7 +468,7 @@ run-e2e-image: ls -liah $(abspath ./tests/e2e-fixtures) docker run -p 5390:8080 -e MODELS_PATH=/models -e THREADS=1 -e DEBUG=true -d --rm -v $(TEST_DIR):/models --gpus all --name e2e-tests-$(RANDOM) localai-tests -run-e2e-aio: +run-e2e-aio: protogen-go @echo 'Running e2e AIO tests' $(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --flake-attempts 5 -v -r ./tests/e2e-aio diff --git a/pkg/model/initializers.go b/pkg/model/initializers.go index 80dd10b4..d0f47373 100644 --- a/pkg/model/initializers.go +++ b/pkg/model/initializers.go @@ -304,18 +304,19 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string return nil, fmt.Errorf("failed allocating free ports: %s", err.Error()) } // Make sure the process is executable - if err := ml.startProcess(uri, o.model, serverAddress); err != nil { + process, err := ml.startProcess(uri, o.model, serverAddress) + if err != nil { log.Error().Err(err).Str("path", uri).Msg("failed to launch ") return nil, err } log.Debug().Msgf("GRPC Service Started") - client = NewModel(modelName, serverAddress) + client = NewModel(modelName, serverAddress, process) } else { log.Debug().Msg("external backend is uri") // address - client = NewModel(modelName, uri) + client = NewModel(modelName, uri, nil) } } else { grpcProcess := backendPath(o.assetDir, backend) @@ -346,13 +347,14 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string args, grpcProcess = library.LoadLDSO(o.assetDir, args, grpcProcess) // Make sure the process is executable 
in any circumstance - if err := ml.startProcess(grpcProcess, o.model, serverAddress, args...); err != nil { + process, err := ml.startProcess(grpcProcess, o.model, serverAddress, args...) + if err != nil { return nil, err } log.Debug().Msgf("GRPC Service Started") - client = NewModel(modelName, serverAddress) + client = NewModel(modelName, serverAddress, process) } log.Debug().Msgf("Wait for the service to start up") @@ -374,6 +376,7 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string if !ready { log.Debug().Msgf("GRPC Service NOT ready") + ml.deleteProcess(o.model) return nil, fmt.Errorf("grpc service not ready") } @@ -385,9 +388,11 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string res, err := client.GRPC(o.parallelRequests, ml.wd).LoadModel(o.context, &options) if err != nil { + ml.deleteProcess(o.model) return nil, fmt.Errorf("could not load model: %w", err) } if !res.Success { + ml.deleteProcess(o.model) return nil, fmt.Errorf("could not load model (no success): %s", res.Message) } diff --git a/pkg/model/loader.go b/pkg/model/loader.go index 4f1ec841..68ac1a31 100644 --- a/pkg/model/loader.go +++ b/pkg/model/loader.go @@ -13,7 +13,6 @@ import ( "github.com/mudler/LocalAI/pkg/utils" - process "github.com/mudler/go-processmanager" "github.com/rs/zerolog/log" ) @@ -21,20 +20,18 @@ import ( // TODO: Split ModelLoader and TemplateLoader? Just to keep things more organized. Left together to share a mutex until I look into that. Would split if we seperate directories for .bin/.yaml and .tmpl type ModelLoader struct { - ModelPath string - mu sync.Mutex - models map[string]*Model - grpcProcesses map[string]*process.Process - templates *templates.TemplateCache - wd *WatchDog + ModelPath string + mu sync.Mutex + models map[string]*Model + templates *templates.TemplateCache + wd *WatchDog } func NewModelLoader(modelPath string) *ModelLoader { nml := &ModelLoader{ - ModelPath: modelPath, - models: make(map[string]*Model), - templates: templates.NewTemplateCache(modelPath), - grpcProcesses: make(map[string]*process.Process), + ModelPath: modelPath, + models: make(map[string]*Model), + templates: templates.NewTemplateCache(modelPath), } return nml @@ -127,6 +124,8 @@ func (ml *ModelLoader) LoadModel(modelName string, loader func(string, string) ( modelFile := filepath.Join(ml.ModelPath, modelName) log.Debug().Msgf("Loading model in memory from file: %s", modelFile) + ml.mu.Lock() + defer ml.mu.Unlock() model, err := loader(modelName, modelFile) if err != nil { return nil, err @@ -136,8 +135,6 @@ func (ml *ModelLoader) LoadModel(modelName string, loader func(string, string) ( return nil, fmt.Errorf("loader didn't return a model") } - ml.mu.Lock() - defer ml.mu.Unlock() ml.models[modelName] = model return model, nil @@ -146,14 +143,13 @@ func (ml *ModelLoader) LoadModel(modelName string, loader func(string, string) ( func (ml *ModelLoader) ShutdownModel(modelName string) error { ml.mu.Lock() defer ml.mu.Unlock() - - _, ok := ml.models[modelName] + model, ok := ml.models[modelName] if !ok { return fmt.Errorf("model %s not found", modelName) } retries := 1 - for ml.models[modelName].GRPC(false, ml.wd).IsBusy() { + for model.GRPC(false, ml.wd).IsBusy() { log.Debug().Msgf("%s busy. 
Waiting.", modelName) dur := time.Duration(retries*2) * time.Second if dur > retryTimeout { @@ -185,8 +181,8 @@ func (ml *ModelLoader) CheckIsLoaded(s string) *Model { if !alive { log.Warn().Msgf("GRPC Model not responding: %s", err.Error()) log.Warn().Msgf("Deleting the process in order to recreate it") - process, exists := ml.grpcProcesses[s] - if !exists { + process := m.Process() + if process == nil { log.Error().Msgf("Process not found for '%s' and the model is not responding anymore !", s) return m } diff --git a/pkg/model/loader_test.go b/pkg/model/loader_test.go index c16a6e50..d0ad4e0c 100644 --- a/pkg/model/loader_test.go +++ b/pkg/model/loader_test.go @@ -63,7 +63,7 @@ var _ = Describe("ModelLoader", func() { Context("LoadModel", func() { It("should load a model and keep it in memory", func() { - mockModel = model.NewModel("foo", "test.model") + mockModel = model.NewModel("foo", "test.model", nil) mockLoader := func(modelName, modelFile string) (*model.Model, error) { return mockModel, nil @@ -88,7 +88,7 @@ var _ = Describe("ModelLoader", func() { Context("ShutdownModel", func() { It("should shutdown a loaded model", func() { - mockModel = model.NewModel("foo", "test.model") + mockModel = model.NewModel("foo", "test.model", nil) mockLoader := func(modelName, modelFile string) (*model.Model, error) { return mockModel, nil diff --git a/pkg/model/model.go b/pkg/model/model.go index 6cb81d10..6e4fd316 100644 --- a/pkg/model/model.go +++ b/pkg/model/model.go @@ -1,20 +1,32 @@ package model -import grpc "github.com/mudler/LocalAI/pkg/grpc" +import ( + "sync" + + grpc "github.com/mudler/LocalAI/pkg/grpc" + process "github.com/mudler/go-processmanager" +) type Model struct { ID string `json:"id"` address string client grpc.Backend + process *process.Process + sync.Mutex } -func NewModel(ID, address string) *Model { +func NewModel(ID, address string, process *process.Process) *Model { return &Model{ ID: ID, address: address, + process: process, } } +func (m *Model) Process() *process.Process { + return m.process +} + func (m *Model) GRPC(parallel bool, wd *WatchDog) grpc.Backend { if m.client != nil { return m.client @@ -25,6 +37,8 @@ func (m *Model) GRPC(parallel bool, wd *WatchDog) grpc.Backend { enableWD = true } + m.Lock() + defer m.Unlock() m.client = grpc.NewClient(m.address, parallel, wd, enableWD) return m.client } diff --git a/pkg/model/process.go b/pkg/model/process.go index bcd1fccb..48631d79 100644 --- a/pkg/model/process.go +++ b/pkg/model/process.go @@ -16,20 +16,22 @@ import ( ) func (ml *ModelLoader) deleteProcess(s string) error { - if _, exists := ml.grpcProcesses[s]; exists { - if err := ml.grpcProcesses[s].Stop(); err != nil { - log.Error().Err(err).Msgf("(deleteProcess) error while deleting grpc process %s", s) + if m, exists := ml.models[s]; exists { + process := m.Process() + if process != nil { + if err := process.Stop(); err != nil { + log.Error().Err(err).Msgf("(deleteProcess) error while deleting process %s", s) + } } } - delete(ml.grpcProcesses, s) delete(ml.models, s) return nil } func (ml *ModelLoader) StopGRPC(filter GRPCProcessFilter) error { var err error = nil - for k, p := range ml.grpcProcesses { - if filter(k, p) { + for k, m := range ml.models { + if filter(k, m.Process()) { e := ml.ShutdownModel(k) err = errors.Join(err, e) } @@ -44,17 +46,20 @@ func (ml *ModelLoader) StopAllGRPC() error { func (ml *ModelLoader) GetGRPCPID(id string) (int, error) { ml.mu.Lock() defer ml.mu.Unlock() - p, exists := ml.grpcProcesses[id] + p, exists := ml.models[id] if 
!exists { return -1, fmt.Errorf("no grpc backend found for %s", id) } - return strconv.Atoi(p.PID) + if p.Process() == nil { + return -1, fmt.Errorf("no grpc backend found for %s", id) + } + return strconv.Atoi(p.Process().PID) } -func (ml *ModelLoader) startProcess(grpcProcess, id string, serverAddress string, args ...string) error { +func (ml *ModelLoader) startProcess(grpcProcess, id string, serverAddress string, args ...string) (*process.Process, error) { // Make sure the process is executable if err := os.Chmod(grpcProcess, 0700); err != nil { - return err + return nil, err } log.Debug().Msgf("Loading GRPC Process: %s", grpcProcess) @@ -63,7 +68,7 @@ func (ml *ModelLoader) startProcess(grpcProcess, id string, serverAddress string workDir, err := filepath.Abs(filepath.Dir(grpcProcess)) if err != nil { - return err + return nil, err } grpcControlProcess := process.New( @@ -79,10 +84,8 @@ func (ml *ModelLoader) startProcess(grpcProcess, id string, serverAddress string ml.wd.AddAddressModelMap(serverAddress, id) } - ml.grpcProcesses[id] = grpcControlProcess - if err := grpcControlProcess.Run(); err != nil { - return err + return grpcControlProcess, err } log.Debug().Msgf("GRPC Service state dir: %s", grpcControlProcess.StateDir()) @@ -116,5 +119,5 @@ func (ml *ModelLoader) startProcess(grpcProcess, id string, serverAddress string } }() - return nil + return grpcControlProcess, nil } From b0f4556c0f4277fc4056c396e4c639f7b41ea952 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Thu, 26 Sep 2024 14:52:26 +0200 Subject: [PATCH 094/122] chore: :arrow_up: Update ggerganov/llama.cpp to `ea9c32be71b91b42ecc538bd902e93cbb5fb36cb` (#3667) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 4efee986..3a90463b 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=70392f1f81470607ba3afef04aa56c9f65587664 +CPPLLAMA_VERSION?=ea9c32be71b91b42ecc538bd902e93cbb5fb36cb # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 8c4196faf34a123f018471890873403bec33b702 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Thu, 26 Sep 2024 15:58:17 +0200 Subject: [PATCH 095/122] chore: :arrow_up: Update ggerganov/whisper.cpp to `69339af2d104802f3f201fd419163defba52890e` (#3666) :arrow_up: Update ggerganov/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 3a90463b..07fd6ee3 100644 --- a/Makefile +++ b/Makefile @@ -16,7 +16,7 @@ RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 # whisper.cpp version WHISPER_REPO?=https://github.com/ggerganov/whisper.cpp -WHISPER_CPP_VERSION?=0d2e2aed80109e8696791083bde3b58e190b7812 +WHISPER_CPP_VERSION?=69339af2d104802f3f201fd419163defba52890e # bert.cpp version BERT_REPO?=https://github.com/go-skynet/go-bert.cpp From f2ba1cfb01d738d61dd443589d2878d4643e4fe2 Mon 
Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Thu, 26 Sep 2024 23:41:45 +0200 Subject: [PATCH 096/122] chore: :arrow_up: Update ggerganov/llama.cpp to `95bc82fbc0df6d48cf66c857a4dda3d044f45ca2` (#3674) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 07fd6ee3..ab7532d3 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=ea9c32be71b91b42ecc538bd902e93cbb5fb36cb +CPPLLAMA_VERSION?=95bc82fbc0df6d48cf66c857a4dda3d044f45ca2 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 4550abbfcece4f1ae4e2162431e6cd772d7a92d4 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Fri, 27 Sep 2024 08:54:36 +0200 Subject: [PATCH 097/122] chore(model-gallery): :arrow_up: update checksum (#3675) :arrow_up: Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- gallery/index.yaml | 34 +++++++++++++++++----------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/gallery/index.yaml b/gallery/index.yaml index de38c3d5..4b668061 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -59,8 +59,8 @@ - filename: llama-3.2-1b-instruct-q8_0.gguf sha256: ba345c83bf5cc679c653b853c46517eea5a34f03ed2205449db77184d9ae62a9 uri: huggingface://hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF/llama-3.2-1b-instruct-q8_0.gguf -## Qwen2.5 - &qwen25 + ## Qwen2.5 name: "qwen2.5-14b-instruct" url: "github:mudler/LocalAI/gallery/chatml.yaml@master" license: apache-2.0 @@ -89,11 +89,11 @@ - https://huggingface.co/bartowski/Qwen2.5-Math-7B-Instruct-GGUF - https://huggingface.co/Qwen/Qwen2.5-Math-7B-Instruct description: | - In August 2024, we released the first series of mathematical LLMs - Qwen2-Math - of our Qwen family. A month later, we have upgraded it and open-sourced Qwen2.5-Math series, including base models Qwen2.5-Math-1.5B/7B/72B, instruction-tuned models Qwen2.5-Math-1.5B/7B/72B-Instruct, and mathematical reward model Qwen2.5-Math-RM-72B. + In August 2024, we released the first series of mathematical LLMs - Qwen2-Math - of our Qwen family. A month later, we have upgraded it and open-sourced Qwen2.5-Math series, including base models Qwen2.5-Math-1.5B/7B/72B, instruction-tuned models Qwen2.5-Math-1.5B/7B/72B-Instruct, and mathematical reward model Qwen2.5-Math-RM-72B. - Unlike Qwen2-Math series which only supports using Chain-of-Thught (CoT) to solve English math problems, Qwen2.5-Math series is expanded to support using both CoT and Tool-integrated Reasoning (TIR) to solve math problems in both Chinese and English. The Qwen2.5-Math series models have achieved significant performance improvements compared to the Qwen2-Math series models on the Chinese and English mathematics benchmarks with CoT. 
+ Unlike Qwen2-Math series which only supports using Chain-of-Thught (CoT) to solve English math problems, Qwen2.5-Math series is expanded to support using both CoT and Tool-integrated Reasoning (TIR) to solve math problems in both Chinese and English. The Qwen2.5-Math series models have achieved significant performance improvements compared to the Qwen2-Math series models on the Chinese and English mathematics benchmarks with CoT. - The base models of Qwen2-Math are initialized with Qwen2-1.5B/7B/72B, and then pretrained on a meticulously designed Mathematics-specific Corpus. This corpus contains large-scale high-quality mathematical web texts, books, codes, exam questions, and mathematical pre-training data synthesized by Qwen2. + The base models of Qwen2-Math are initialized with Qwen2-1.5B/7B/72B, and then pretrained on a meticulously designed Mathematics-specific Corpus. This corpus contains large-scale high-quality mathematical web texts, books, codes, exam questions, and mathematical pre-training data synthesized by Qwen2. overrides: parameters: model: Qwen2.5-Math-7B-Instruct-Q4_K_M.gguf @@ -195,8 +195,8 @@ model: Qwen2.5-32B.Q4_K_M.gguf files: - filename: Qwen2.5-32B.Q4_K_M.gguf - sha256: 02703e27c8b964db445444581a6937ad7538f0c32a100b26b49fa0e8ff527155 uri: huggingface://mradermacher/Qwen2.5-32B-GGUF/Qwen2.5-32B.Q4_K_M.gguf + sha256: fa42a4067e3630929202b6bb1ef5cebc43c1898494aedfd567b7d53c7a9d84a6 - !!merge <<: *qwen25 name: "qwen2.5-32b-instruct" urls: @@ -221,8 +221,8 @@ - filename: Qwen2.5-72B-Instruct-Q4_K_M.gguf sha256: e4c8fad16946be8cf0bbf67eb8f4e18fc7415a5a6d2854b4cda453edb4082545 uri: huggingface://bartowski/Qwen2.5-72B-Instruct-GGUF/Qwen2.5-72B-Instruct-Q4_K_M.gguf -## SmolLM - &smollm + ## SmolLM url: "github:mudler/LocalAI/gallery/chatml.yaml@master" name: "smollm-1.7b-instruct" icon: https://huggingface.co/datasets/HuggingFaceTB/images/resolve/main/banner_smol.png @@ -651,9 +651,9 @@ - https://huggingface.co/leafspark/Reflection-Llama-3.1-70B-bf16 - https://huggingface.co/senseable/Reflection-Llama-3.1-70B-gguf description: | - Reflection Llama-3.1 70B is (currently) the world's top open-source LLM, trained with a new technique called Reflection-Tuning that teaches a LLM to detect mistakes in its reasoning and correct course. + Reflection Llama-3.1 70B is (currently) the world's top open-source LLM, trained with a new technique called Reflection-Tuning that teaches a LLM to detect mistakes in its reasoning and correct course. - The model was trained on synthetic data generated by Glaive. If you're training a model, Glaive is incredible — use them. + The model was trained on synthetic data generated by Glaive. If you're training a model, Glaive is incredible — use them. overrides: parameters: model: Reflection-Llama-3.1-70B-q4_k_m.gguf @@ -973,15 +973,15 @@ - https://huggingface.co/Sao10K/L3.1-8B-Niitama-v1.1 - https://huggingface.co/Lewdiculous/L3.1-8B-Niitama-v1.1-GGUF-IQ-Imatrix description: | - GGUF-IQ-Imatrix quants for Sao10K/L3.1-8B-Niitama-v1.1 - Here's the subjectively superior L3 version: L3-8B-Niitama-v1 - An experimental model using experimental methods. + GGUF-IQ-Imatrix quants for Sao10K/L3.1-8B-Niitama-v1.1 + Here's the subjectively superior L3 version: L3-8B-Niitama-v1 + An experimental model using experimental methods. - More detail on it: + More detail on it: - Tamamo and Niitama are made from the same data. Literally. The only thing that's changed is how theyre shuffled and formatted. Yet, I get wildly different results. 
+ Tamamo and Niitama are made from the same data. Literally. The only thing that's changed is how theyre shuffled and formatted. Yet, I get wildly different results. - Interesting, eh? Feels kinda not as good compared to the l3 version, but it's aight. + Interesting, eh? Feels kinda not as good compared to the l3 version, but it's aight. overrides: parameters: model: L3.1-8B-Niitama-v1.1-Q4_K_M-imat.gguf @@ -1606,8 +1606,8 @@ urls: - https://huggingface.co/Lewdiculous/MN-12B-Lyra-v4-GGUF-IQ-Imatrix description: | - A finetune of Mistral Nemo by Sao10K. - Uses the ChatML prompt format. + A finetune of Mistral Nemo by Sao10K. + Uses the ChatML prompt format. overrides: parameters: model: MN-12B-Lyra-v4-Q4_K_M-imat.gguf @@ -2134,7 +2134,7 @@ - https://huggingface.co/EpistemeAI/Athena-codegemma-2-2b-it - https://huggingface.co/mradermacher/Athena-codegemma-2-2b-it-GGUF description: | - Supervised fine tuned (sft unsloth) for coding with EpistemeAI coding dataset. + Supervised fine tuned (sft unsloth) for coding with EpistemeAI coding dataset. overrides: parameters: model: Athena-codegemma-2-2b-it.Q4_K_M.gguf From 453c45d022c7f211279f3d30cf519520636dd7be Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 27 Sep 2024 12:21:04 +0200 Subject: [PATCH 098/122] models(gallery): add magnusintellectus-12b-v1-i1 (#3678) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 4b668061..1a1828f6 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1615,6 +1615,27 @@ - filename: MN-12B-Lyra-v4-Q4_K_M-imat.gguf sha256: 1989123481ca1936c8a2cbe278ff5d1d2b0ae63dbdc838bb36a6d7547b8087b3 uri: huggingface://Lewdiculous/MN-12B-Lyra-v4-GGUF-IQ-Imatrix/MN-12B-Lyra-v4-Q4_K_M-imat.gguf +- !!merge <<: *mistral03 + name: "magnusintellectus-12b-v1-i1" + url: "github:mudler/LocalAI/gallery/chatml.yaml@master" + icon: https://cdn-uploads.huggingface.co/production/uploads/66b564058d9afb7a9d5607d5/hUVJI1Qa4tCMrZWMgYkoD.png + urls: + - https://huggingface.co/GalrionSoftworks/MagnusIntellectus-12B-v1 + - https://huggingface.co/mradermacher/MagnusIntellectus-12B-v1-i1-GGUF + description: | + How pleasant, the rocks appear to have made a decent conglomerate. A-. 
+ + MagnusIntellectus is a merge of the following models using LazyMergekit: + + UsernameJustAnother/Nemo-12B-Marlin-v5 + anthracite-org/magnum-12b-v2 + overrides: + parameters: + model: MagnusIntellectus-12B-v1.i1-Q4_K_M.gguf + files: + - filename: MagnusIntellectus-12B-v1.i1-Q4_K_M.gguf + sha256: c97107983b4edc5b6f2a592d227ca2dd4196e2af3d3bc0fe6b7a8954a1fb5870 + uri: huggingface://mradermacher/MagnusIntellectus-12B-v1-i1-GGUF/MagnusIntellectus-12B-v1.i1-Q4_K_M.gguf - &mudler ### START mudler's LocalAI specific-models url: "github:mudler/LocalAI/gallery/mudler.yaml@master" From 2a8cbad12222f59295911078e9acc3788e666f36 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 27 Sep 2024 13:03:41 +0200 Subject: [PATCH 099/122] models(gallery): add bigqwen2.5-52b-instruct (#3679) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 1a1828f6..847e004c 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -221,6 +221,22 @@ - filename: Qwen2.5-72B-Instruct-Q4_K_M.gguf sha256: e4c8fad16946be8cf0bbf67eb8f4e18fc7415a5a6d2854b4cda453edb4082545 uri: huggingface://bartowski/Qwen2.5-72B-Instruct-GGUF/Qwen2.5-72B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "bigqwen2.5-52b-instruct" + icon: https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/98GiKtmH1AtHHbIbOUH4Y.jpeg + urls: + - https://huggingface.co/mlabonne/BigQwen2.5-52B-Instruct + - https://huggingface.co/bartowski/BigQwen2.5-52B-Instruct-GGUF + description: | + BigQwen2.5-52B-Instruct is a Qwen/Qwen2-32B-Instruct self-merge made with MergeKit. + It applies the mlabonne/Meta-Llama-3-120B-Instruct recipe. + overrides: + parameters: + model: BigQwen2.5-52B-Instruct-Q4_K_M.gguf + files: + - filename: BigQwen2.5-52B-Instruct-Q4_K_M.gguf + sha256: 9c939f08e366b51b07096eb2ecb5cc2a82894ac7baf639e446237ad39889c896 + uri: huggingface://bartowski/BigQwen2.5-52B-Instruct-GGUF/BigQwen2.5-52B-Instruct-Q4_K_M.gguf - &smollm ## SmolLM url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From 4e0f3cc9802e56fae2a52715298257932e3c0f5e Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sat, 28 Sep 2024 00:42:59 +0200 Subject: [PATCH 100/122] chore: :arrow_up: Update ggerganov/llama.cpp to `b5de3b74a595cbfefab7eeb5a567425c6a9690cf` (#3681) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index ab7532d3..2c7310d8 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=95bc82fbc0df6d48cf66c857a4dda3d044f45ca2 +CPPLLAMA_VERSION?=b5de3b74a595cbfefab7eeb5a567425c6a9690cf # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From e94a50e9db24aa03ce0d53a5200099aadb52b3aa Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sat, 28 Sep 2024 10:02:19 +0200 Subject: [PATCH 101/122] chore: :arrow_up: Update ggerganov/whisper.cpp to `8feb375fbdf0277ad36958c218c6bf48fa0ba75a` (#3680) :arrow_up: Update ggerganov/whisper.cpp Signed-off-by: github-actions[bot] 
<41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 2c7310d8..aa926f4c 100644 --- a/Makefile +++ b/Makefile @@ -16,7 +16,7 @@ RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 # whisper.cpp version WHISPER_REPO?=https://github.com/ggerganov/whisper.cpp -WHISPER_CPP_VERSION?=69339af2d104802f3f201fd419163defba52890e +WHISPER_CPP_VERSION?=8feb375fbdf0277ad36958c218c6bf48fa0ba75a # bert.cpp version BERT_REPO?=https://github.com/go-skynet/go-bert.cpp From 50a3b54e3474fd552b352222a90f70c8ab624ceb Mon Sep 17 00:00:00 2001 From: siddimore Date: Sat, 28 Sep 2024 08:23:56 -0700 Subject: [PATCH 102/122] feat(api): add correlationID to Track Chat requests (#3668) * Add CorrelationID to chat request Signed-off-by: Siddharth More * remove get_token_metrics Signed-off-by: Siddharth More * Add CorrelationID to proto Signed-off-by: Siddharth More * fix correlation method name Signed-off-by: Siddharth More * Update core/http/endpoints/openai/chat.go Co-authored-by: Ettore Di Giacinto Signed-off-by: Siddharth More * Update core/http/endpoints/openai/chat.go Signed-off-by: Ettore Di Giacinto Signed-off-by: Siddharth More --------- Signed-off-by: Siddharth More Signed-off-by: Ettore Di Giacinto Co-authored-by: Ettore Di Giacinto --- backend/backend.proto | 1 + backend/cpp/llama/grpc-server.cpp | 14 ++++++++++++++ core/http/endpoints/openai/chat.go | 7 +++++++ core/http/endpoints/openai/completion.go | 2 ++ core/http/endpoints/openai/request.go | 13 ++++++++++++- 5 files changed, 36 insertions(+), 1 deletion(-) diff --git a/backend/backend.proto b/backend/backend.proto index 31bd63e5..b2d4518e 100644 --- a/backend/backend.proto +++ b/backend/backend.proto @@ -136,6 +136,7 @@ message PredictOptions { repeated Message Messages = 44; repeated string Videos = 45; repeated string Audios = 46; + string CorrelationId = 47; } // The response message containing the result diff --git a/backend/cpp/llama/grpc-server.cpp b/backend/cpp/llama/grpc-server.cpp index 56d59d21..791612db 100644 --- a/backend/cpp/llama/grpc-server.cpp +++ b/backend/cpp/llama/grpc-server.cpp @@ -2106,6 +2106,9 @@ json parse_options(bool streaming, const backend::PredictOptions* predict, llama data["ignore_eos"] = predict->ignoreeos(); data["embeddings"] = predict->embeddings(); + // Add the correlationid to json data + data["correlation_id"] = predict->correlationid(); + // for each image in the request, add the image data // for (int i = 0; i < predict->images_size(); i++) { @@ -2344,6 +2347,11 @@ public: int32_t tokens_evaluated = result.result_json.value("tokens_evaluated", 0); reply.set_prompt_tokens(tokens_evaluated); + // Log Request Correlation Id + LOG_VERBOSE("correlation:", { + { "id", data["correlation_id"] } + }); + // Send the reply writer->Write(reply); @@ -2367,6 +2375,12 @@ public: std::string completion_text; task_result result = llama.queue_results.recv(task_id); if (!result.error && result.stop) { + + // Log Request Correlation Id + LOG_VERBOSE("correlation:", { + { "id", data["correlation_id"] } + }); + completion_text = result.result_json.value("content", ""); int32_t tokens_predicted = result.result_json.value("tokens_predicted", 0); int32_t tokens_evaluated = result.result_json.value("tokens_evaluated", 0); diff --git a/core/http/endpoints/openai/chat.go b/core/http/endpoints/openai/chat.go index b937120a..1ac1387e 
100644 --- a/core/http/endpoints/openai/chat.go +++ b/core/http/endpoints/openai/chat.go @@ -161,6 +161,12 @@ func ChatEndpoint(cl *config.BackendConfigLoader, ml *model.ModelLoader, startup textContentToReturn = "" id = uuid.New().String() created = int(time.Now().Unix()) + // Set CorrelationID + correlationID := c.Get("X-Correlation-ID") + if len(strings.TrimSpace(correlationID)) == 0 { + correlationID = id + } + c.Set("X-Correlation-ID", correlationID) modelFile, input, err := readRequest(c, cl, ml, startupOptions, true) if err != nil { @@ -444,6 +450,7 @@ func ChatEndpoint(cl *config.BackendConfigLoader, ml *model.ModelLoader, startup c.Set("Cache-Control", "no-cache") c.Set("Connection", "keep-alive") c.Set("Transfer-Encoding", "chunked") + c.Set("X-Correlation-ID", id) responses := make(chan schema.OpenAIResponse) diff --git a/core/http/endpoints/openai/completion.go b/core/http/endpoints/openai/completion.go index b087cc5f..e5de1b3f 100644 --- a/core/http/endpoints/openai/completion.go +++ b/core/http/endpoints/openai/completion.go @@ -57,6 +57,8 @@ func CompletionEndpoint(cl *config.BackendConfigLoader, ml *model.ModelLoader, a } return func(c *fiber.Ctx) error { + // Add Correlation + c.Set("X-Correlation-ID", id) modelFile, input, err := readRequest(c, cl, ml, appConfig, true) if err != nil { return fmt.Errorf("failed reading parameters from request:%w", err) diff --git a/core/http/endpoints/openai/request.go b/core/http/endpoints/openai/request.go index e24dd28f..d6182a39 100644 --- a/core/http/endpoints/openai/request.go +++ b/core/http/endpoints/openai/request.go @@ -6,6 +6,7 @@ import ( "fmt" "github.com/gofiber/fiber/v2" + "github.com/google/uuid" "github.com/mudler/LocalAI/core/config" fiberContext "github.com/mudler/LocalAI/core/http/ctx" "github.com/mudler/LocalAI/core/schema" @@ -15,6 +16,11 @@ import ( "github.com/rs/zerolog/log" ) +type correlationIDKeyType string + +// CorrelationIDKey to track request across process boundary +const CorrelationIDKey correlationIDKeyType = "correlationID" + func readRequest(c *fiber.Ctx, cl *config.BackendConfigLoader, ml *model.ModelLoader, o *config.ApplicationConfig, firstModel bool) (string, *schema.OpenAIRequest, error) { input := new(schema.OpenAIRequest) @@ -24,9 +30,14 @@ func readRequest(c *fiber.Ctx, cl *config.BackendConfigLoader, ml *model.ModelLo } received, _ := json.Marshal(input) + // Extract or generate the correlation ID + correlationID := c.Get("X-Correlation-ID", uuid.New().String()) ctx, cancel := context.WithCancel(o.Context) - input.Context = ctx + // Add the correlation ID to the new context + ctxWithCorrelationID := context.WithValue(ctx, CorrelationIDKey, correlationID) + + input.Context = ctxWithCorrelationID input.Cancel = cancel log.Debug().Msgf("Request received: %s", string(received)) From 1689740269ef97e9778d75e78ae4d844520a113c Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 29 Sep 2024 20:39:39 +0200 Subject: [PATCH 103/122] models(gallery): add replete-llm-v2.5-qwen-14b (#3688) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 847e004c..7701efd5 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -237,6 +237,23 @@ - filename: BigQwen2.5-52B-Instruct-Q4_K_M.gguf sha256: 9c939f08e366b51b07096eb2ecb5cc2a82894ac7baf639e446237ad39889c896 uri: huggingface://bartowski/BigQwen2.5-52B-Instruct-GGUF/BigQwen2.5-52B-Instruct-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: 
"replete-llm-v2.5-qwen-14b" + icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/ihnWXDEgV-ZKN_B036U1J.png + urls: + - https://huggingface.co/Replete-AI/Replete-LLM-V2.5-Qwen-14b + - https://huggingface.co/bartowski/Replete-LLM-V2.5-Qwen-14b-GGUF + description: | + Replete-LLM-V2.5-Qwen-14b is a continues finetuned version of Qwen2.5-14B. I noticed recently that the Qwen team did not learn from my methods of continuous finetuning, the great benefits, and no downsides of it. So I took it upon myself to merge the instruct model with the base model myself using the Ties merge method + + This version of the model shows higher performance than the original instruct and base models. + overrides: + parameters: + model: Replete-LLM-V2.5-Qwen-14b-Q4_K_M.gguf + files: + - filename: Replete-LLM-V2.5-Qwen-14b-Q4_K_M.gguf + sha256: 17d0792ff5e3062aecb965629f66e679ceb407e4542e8045993dcfe9e7e14d9d + uri: huggingface://bartowski/Replete-LLM-V2.5-Qwen-14b-GGUF/Replete-LLM-V2.5-Qwen-14b-Q4_K_M.gguf - &smollm ## SmolLM url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From ad62156d548adaade746e9d702301da2c793d0b9 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 29 Sep 2024 22:47:26 +0200 Subject: [PATCH 104/122] models(gallery): add replete-llm-v2.5-qwen-7b (#3689) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 7701efd5..2ffbd05b 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -254,6 +254,24 @@ - filename: Replete-LLM-V2.5-Qwen-14b-Q4_K_M.gguf sha256: 17d0792ff5e3062aecb965629f66e679ceb407e4542e8045993dcfe9e7e14d9d uri: huggingface://bartowski/Replete-LLM-V2.5-Qwen-14b-GGUF/Replete-LLM-V2.5-Qwen-14b-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "replete-llm-v2.5-qwen-7b" + icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/ihnWXDEgV-ZKN_B036U1J.png + urls: + - https://huggingface.co/Replete-AI/Replete-LLM-V2.5-Qwen-7b + - https://huggingface.co/bartowski/Replete-LLM-V2.5-Qwen-7b-GGUF + description: | + Replete-LLM-V2.5-Qwen-7b is a continues finetuned version of Qwen2.5-14B. I noticed recently that the Qwen team did not learn from my methods of continuous finetuning, the great benefits, and no downsides of it. So I took it upon myself to merge the instruct model with the base model myself using the Ties merge method + + This version of the model shows higher performance than the original instruct and base models. 
+ overrides: + parameters: + model: Replete-LLM-V2.5-Qwen-7b-Q4_K_M.gguf + files: + - filename: Replete-LLM-V2.5-Qwen-7b-Q4_K_M.gguf + sha256: 054d54972259c0398b4e0af3f408f608e1166837b1d7535d08fc440d1daf8639 + uri: huggingface://bartowski/Replete-LLM-V2.5-Qwen-7b-GGUF/Replete-LLM-V2.5-Qwen-7b-Q4_K_M.gguf + - &smollm ## SmolLM url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From 6dfee995754fb3853e02d69c370c670d636f4294 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Mon, 30 Sep 2024 09:09:18 +0200 Subject: [PATCH 105/122] chore: :arrow_up: Update ggerganov/llama.cpp to `c919d5db39c8a7fcb64737f008e4b105ee0acd20` (#3686) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index aa926f4c..8617363c 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=b5de3b74a595cbfefab7eeb5a567425c6a9690cf +CPPLLAMA_VERSION?=c919d5db39c8a7fcb64737f008e4b105ee0acd20 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 078942fc9f741a35a189f295d1b4fb4ed1e26400 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 30 Sep 2024 09:09:51 +0200 Subject: [PATCH 106/122] chore(deps): bump grpcio to 1.66.2 (#3690) Signed-off-by: Ettore Di Giacinto --- backend/python/autogptq/requirements.txt | 2 +- backend/python/bark/requirements.txt | 2 +- backend/python/common/template/requirements.txt | 2 +- backend/python/coqui/requirements.txt | 2 +- backend/python/diffusers/requirements.txt | 2 +- backend/python/exllama2/requirements.txt | 2 +- backend/python/mamba/requirements.txt | 2 +- backend/python/openvoice/requirements-intel.txt | 2 +- backend/python/openvoice/requirements.txt | 2 +- backend/python/parler-tts/requirements.txt | 2 +- backend/python/rerankers/requirements.txt | 2 +- backend/python/sentencetransformers/requirements.txt | 2 +- backend/python/transformers-musicgen/requirements.txt | 2 +- backend/python/transformers/requirements.txt | 2 +- backend/python/vall-e-x/requirements.txt | 2 +- backend/python/vllm/requirements.txt | 2 +- 16 files changed, 16 insertions(+), 16 deletions(-) diff --git a/backend/python/autogptq/requirements.txt b/backend/python/autogptq/requirements.txt index 150fcc1b..9cb6ce94 100644 --- a/backend/python/autogptq/requirements.txt +++ b/backend/python/autogptq/requirements.txt @@ -1,6 +1,6 @@ accelerate auto-gptq==0.7.1 -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi transformers \ No newline at end of file diff --git a/backend/python/bark/requirements.txt b/backend/python/bark/requirements.txt index 6404b98e..6e46924a 100644 --- a/backend/python/bark/requirements.txt +++ b/backend/python/bark/requirements.txt @@ -1,4 +1,4 @@ bark==0.1.5 -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi \ No newline at end of file diff --git a/backend/python/common/template/requirements.txt b/backend/python/common/template/requirements.txt index 21610c1c..540c0eb5 100644 --- a/backend/python/common/template/requirements.txt +++ b/backend/python/common/template/requirements.txt @@ -1,2 +1,2 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf \ No newline at end of file diff --git 
a/backend/python/coqui/requirements.txt b/backend/python/coqui/requirements.txt index 2a91f2b9..29484f7d 100644 --- a/backend/python/coqui/requirements.txt +++ b/backend/python/coqui/requirements.txt @@ -1,4 +1,4 @@ coqui-tts -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi \ No newline at end of file diff --git a/backend/python/diffusers/requirements.txt b/backend/python/diffusers/requirements.txt index 043c7aba..730e316f 100644 --- a/backend/python/diffusers/requirements.txt +++ b/backend/python/diffusers/requirements.txt @@ -1,5 +1,5 @@ setuptools -grpcio==1.66.1 +grpcio==1.66.2 pillow protobuf certifi diff --git a/backend/python/exllama2/requirements.txt b/backend/python/exllama2/requirements.txt index 6fb018a0..e3db2b2f 100644 --- a/backend/python/exllama2/requirements.txt +++ b/backend/python/exllama2/requirements.txt @@ -1,4 +1,4 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi wheel diff --git a/backend/python/mamba/requirements.txt b/backend/python/mamba/requirements.txt index 8e1b0195..83ae4279 100644 --- a/backend/python/mamba/requirements.txt +++ b/backend/python/mamba/requirements.txt @@ -1,3 +1,3 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi \ No newline at end of file diff --git a/backend/python/openvoice/requirements-intel.txt b/backend/python/openvoice/requirements-intel.txt index a9a4cc20..c568dab1 100644 --- a/backend/python/openvoice/requirements-intel.txt +++ b/backend/python/openvoice/requirements-intel.txt @@ -2,7 +2,7 @@ intel-extension-for-pytorch torch optimum[openvino] -grpcio==1.66.1 +grpcio==1.66.2 protobuf librosa==0.9.1 faster-whisper==1.0.3 diff --git a/backend/python/openvoice/requirements.txt b/backend/python/openvoice/requirements.txt index b38805be..6ee29ce4 100644 --- a/backend/python/openvoice/requirements.txt +++ b/backend/python/openvoice/requirements.txt @@ -1,4 +1,4 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf librosa faster-whisper diff --git a/backend/python/parler-tts/requirements.txt b/backend/python/parler-tts/requirements.txt index 0da3da13..d7f36feb 100644 --- a/backend/python/parler-tts/requirements.txt +++ b/backend/python/parler-tts/requirements.txt @@ -1,4 +1,4 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi llvmlite==0.43.0 \ No newline at end of file diff --git a/backend/python/rerankers/requirements.txt b/backend/python/rerankers/requirements.txt index 8e1b0195..83ae4279 100644 --- a/backend/python/rerankers/requirements.txt +++ b/backend/python/rerankers/requirements.txt @@ -1,3 +1,3 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements.txt b/backend/python/sentencetransformers/requirements.txt index b9cb6061..40a387f1 100644 --- a/backend/python/sentencetransformers/requirements.txt +++ b/backend/python/sentencetransformers/requirements.txt @@ -1,4 +1,4 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi datasets diff --git a/backend/python/transformers-musicgen/requirements.txt b/backend/python/transformers-musicgen/requirements.txt index fb1119a9..a3f66651 100644 --- a/backend/python/transformers-musicgen/requirements.txt +++ b/backend/python/transformers-musicgen/requirements.txt @@ -1,4 +1,4 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf scipy==1.14.0 certifi \ No newline at end of file diff --git a/backend/python/transformers/requirements.txt b/backend/python/transformers/requirements.txt index b19c59c0..084cc034 100644 --- a/backend/python/transformers/requirements.txt +++ b/backend/python/transformers/requirements.txt @@ -1,4 +1,4 
@@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file diff --git a/backend/python/vall-e-x/requirements.txt b/backend/python/vall-e-x/requirements.txt index 8e1b0195..83ae4279 100644 --- a/backend/python/vall-e-x/requirements.txt +++ b/backend/python/vall-e-x/requirements.txt @@ -1,3 +1,3 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi \ No newline at end of file diff --git a/backend/python/vllm/requirements.txt b/backend/python/vllm/requirements.txt index b9c192d5..8fb8a418 100644 --- a/backend/python/vllm/requirements.txt +++ b/backend/python/vllm/requirements.txt @@ -1,4 +1,4 @@ -grpcio==1.66.1 +grpcio==1.66.2 protobuf certifi setuptools \ No newline at end of file From 58662db48eaecd2d39d65a0c229a47032a6833d6 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 30 Sep 2024 17:11:54 +0200 Subject: [PATCH 107/122] models(gallery): add calme-2.2-qwen2.5-72b-i1 (#3691) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 2ffbd05b..0924e5cf 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -271,7 +271,30 @@ - filename: Replete-LLM-V2.5-Qwen-7b-Q4_K_M.gguf sha256: 054d54972259c0398b4e0af3f408f608e1166837b1d7535d08fc440d1daf8639 uri: huggingface://bartowski/Replete-LLM-V2.5-Qwen-7b-GGUF/Replete-LLM-V2.5-Qwen-7b-Q4_K_M.gguf +- !!merge <<: *qwen25 + name: "calme-2.2-qwen2.5-72b-i1" + icon: https://huggingface.co/MaziyarPanahi/calme-2.2-qwen2.5-72b/resolve/main/calme-2.webp + urls: + - https://huggingface.co/MaziyarPanahi/calme-2.2-qwen2.5-72b + - https://huggingface.co/mradermacher/calme-2.2-qwen2.5-72b-i1-GGUF + description: | + This model is a fine-tuned version of the powerful Qwen/Qwen2.5-72B-Instruct, pushing the boundaries of natural language understanding and generation even further. My goal was to create a versatile and robust model that excels across a wide range of benchmarks and real-world applications. + Use Cases + This model is suitable for a wide range of applications, including but not limited to: + + Advanced question-answering systems + Intelligent chatbots and virtual assistants + Content generation and summarization + Code generation and analysis + Complex problem-solving and decision support + overrides: + parameters: + model: calme-2.2-qwen2.5-72b.i1-Q4_K_M.gguf + files: + - filename: calme-2.2-qwen2.5-72b.i1-Q4_K_M.gguf + sha256: 5fdfa599724d7c78502c477ced1d294e92781b91d3265bd0748fbf15a6fefde6 + uri: huggingface://mradermacher/calme-2.2-qwen2.5-72b-i1-GGUF/calme-2.2-qwen2.5-72b.i1-Q4_K_M.gguf - &smollm ## SmolLM url: "github:mudler/LocalAI/gallery/chatml.yaml@master" From d747f2c89bc71cf4ca57539d68472d7b9e3bf0f7 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 30 Sep 2024 21:08:16 +0000 Subject: [PATCH 108/122] chore(deps): Bump openai from 1.47.1 to 1.50.2 in /examples/langchain-chroma (#3697) chore(deps): Bump openai in /examples/langchain-chroma Bumps [openai](https://github.com/openai/openai-python) from 1.47.1 to 1.50.2. 
- [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.47.1...v1.50.2) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 19929482..b6404437 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.3.0 -openai==1.47.1 +openai==1.50.2 chromadb==0.5.7 llama-index==0.11.12 \ No newline at end of file From 164a9e972fed51dae394e01ffd59a9a04b6ee44a Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 1 Oct 2024 01:37:30 +0000 Subject: [PATCH 109/122] chore(deps): Bump chromadb from 0.5.7 to 0.5.11 in /examples/langchain-chroma (#3696) chore(deps): Bump chromadb in /examples/langchain-chroma Bumps [chromadb](https://github.com/chroma-core/chroma) from 0.5.7 to 0.5.11. - [Release notes](https://github.com/chroma-core/chroma/releases) - [Changelog](https://github.com/chroma-core/chroma/blob/main/RELEASE_PROCESS.md) - [Commits](https://github.com/chroma-core/chroma/compare/0.5.7...0.5.11) --- updated-dependencies: - dependency-name: chromadb dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index b6404437..756a6bf3 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.3.0 openai==1.50.2 -chromadb==0.5.7 +chromadb==0.5.11 llama-index==0.11.12 \ No newline at end of file From 32de75c68326758eac7f714fc522eb65c36fde18 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 1 Oct 2024 03:13:37 +0000 Subject: [PATCH 110/122] chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/langchain-chroma (#3694) chore(deps): Bump langchain in /examples/langchain-chroma Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.3.0 to 0.3.1. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.3.0...langchain==0.3.1) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 756a6bf3..fda5f9d8 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ -langchain==0.3.0 +langchain==0.3.1 openai==1.50.2 chromadb==0.5.11 llama-index==0.11.12 \ No newline at end of file From f19277b8e2bc148193650a26927f183bc106c50a Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Tue, 1 Oct 2024 08:47:48 +0200 Subject: [PATCH 111/122] chore: :arrow_up: Update ggerganov/llama.cpp to `6f1d9d71f4c568778a7637ff6582e6f6ba5fb9d3` (#3708) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 8617363c..6c6dbd21 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=c919d5db39c8a7fcb64737f008e4b105ee0acd20 +CPPLLAMA_VERSION?=6f1d9d71f4c568778a7637ff6582e6f6ba5fb9d3 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 2908ff3f6b7a63fcd89e0cf5571c0409257209ac Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 1 Oct 2024 08:50:40 +0200 Subject: [PATCH 112/122] chore(deps): Bump securego/gosec from 2.21.0 to 2.21.4 (#3698) Bumps [securego/gosec](https://github.com/securego/gosec) from 2.21.0 to 2.21.4. - [Release notes](https://github.com/securego/gosec/releases) - [Changelog](https://github.com/securego/gosec/blob/master/.goreleaser.yml) - [Commits](https://github.com/securego/gosec/compare/v2.21.0...v2.21.4) --- updated-dependencies: - dependency-name: securego/gosec dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- .github/workflows/secscan.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/secscan.yaml b/.github/workflows/secscan.yaml index db9db586..3fd808e1 100644 --- a/.github/workflows/secscan.yaml +++ b/.github/workflows/secscan.yaml @@ -18,7 +18,7 @@ jobs: if: ${{ github.actor != 'dependabot[bot]' }} - name: Run Gosec Security Scanner if: ${{ github.actor != 'dependabot[bot]' }} - uses: securego/gosec@v2.21.0 + uses: securego/gosec@v2.21.4 with: # we let the report trigger content trigger a failure using the GitHub Security features. args: '-no-fail -fmt sarif -out results.sarif ./...' From 6bd6e2bdeb74e52a291052c4c8b808178ed40d90 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 1 Oct 2024 08:51:07 +0200 Subject: [PATCH 113/122] chore(deps): Bump openai from 1.47.1 to 1.50.2 in /examples/functions (#3699) Bumps [openai](https://github.com/openai/openai-python) from 1.47.1 to 1.50.2. 
- [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.47.1...v1.50.2) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/functions/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/functions/requirements.txt b/examples/functions/requirements.txt index c3ffad01..9ad014fd 100644 --- a/examples/functions/requirements.txt +++ b/examples/functions/requirements.txt @@ -1,2 +1,2 @@ langchain==0.3.0 -openai==1.47.1 +openai==1.50.2 From 44bdacac61a319992f3bf3a32f756a65862617ed Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 1 Oct 2024 08:51:29 +0200 Subject: [PATCH 114/122] chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/langchain/langchainpy-localai-example (#3704) chore(deps): Bump langchain Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.3.0 to 0.3.1. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.3.0...langchain==0.3.1) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 179abc2a..daa467c7 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -10,7 +10,7 @@ debugpy==1.8.2 frozenlist==1.4.1 greenlet==3.1.0 idna==3.10 -langchain==0.3.0 +langchain==0.3.1 langchain-community==0.2.16 marshmallow==3.22.0 marshmallow-enum==1.5.1 From 7d306c6431ddba153704e5513e716288c7d73d09 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 1 Oct 2024 10:39:55 +0200 Subject: [PATCH 115/122] chore(deps): Bump greenlet from 3.1.0 to 3.1.1 in /examples/langchain/langchainpy-localai-example (#3703) chore(deps): Bump greenlet Bumps [greenlet](https://github.com/python-greenlet/greenlet) from 3.1.0 to 3.1.1. - [Changelog](https://github.com/python-greenlet/greenlet/blob/master/CHANGES.rst) - [Commits](https://github.com/python-greenlet/greenlet/compare/3.1.0...3.1.1) --- updated-dependencies: - dependency-name: greenlet dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 examples/langchain/langchainpy-localai-example/requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt
index daa467c7..205c726c 100644
--- a/examples/langchain/langchainpy-localai-example/requirements.txt
+++ b/examples/langchain/langchainpy-localai-example/requirements.txt
@@ -8,7 +8,7 @@ colorama==0.4.6
 dataclasses-json==0.6.7
 debugpy==1.8.2
 frozenlist==1.4.1
-greenlet==3.1.0
+greenlet==3.1.1
 idna==3.10
 langchain==0.3.1
 langchain-community==0.2.16

From d4d2a76f8f4b1379c2c554c911ba64ec1bbda389 Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue, 1 Oct 2024 10:40:08 +0200
Subject: [PATCH 116/122] chore(deps): Bump langchain from 0.3.0 to 0.3.1 in
 /examples/functions (#3700)

Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.3.0 to 0.3.1.
- [Release notes](https://github.com/langchain-ai/langchain/releases)
- [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.3.0...langchain==0.3.1)

---
updated-dependencies:
- dependency-name: langchain
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 examples/functions/requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/functions/requirements.txt b/examples/functions/requirements.txt
index 9ad014fd..952f9d62 100644
--- a/examples/functions/requirements.txt
+++ b/examples/functions/requirements.txt
@@ -1,2 +1,2 @@
-langchain==0.3.0
+langchain==0.3.1
 openai==1.50.2

From 76d4e88e0c21b245e74e8ba6e15a5d937d1fdfb0 Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue, 1 Oct 2024 10:40:21 +0200
Subject: [PATCH 117/122] chore(deps): Bump langchain-community from 0.2.16 to
 0.3.1 in /examples/langchain/langchainpy-localai-example (#3702)

chore(deps): Bump langchain-community

Bumps [langchain-community](https://github.com/langchain-ai/langchain) from 0.2.16 to 0.3.1.
- [Release notes](https://github.com/langchain-ai/langchain/releases)
- [Commits](https://github.com/langchain-ai/langchain/compare/langchain-community==0.2.16...langchain-community==0.3.1)

---
updated-dependencies:
- dependency-name: langchain-community
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 examples/langchain/langchainpy-localai-example/requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt
index 205c726c..b5f3960e 100644
--- a/examples/langchain/langchainpy-localai-example/requirements.txt
+++ b/examples/langchain/langchainpy-localai-example/requirements.txt
@@ -11,7 +11,7 @@ frozenlist==1.4.1
 greenlet==3.1.1
 idna==3.10
 langchain==0.3.1
-langchain-community==0.2.16
+langchain-community==0.3.1
 marshmallow==3.22.0
 marshmallow-enum==1.5.1
 multidict==6.0.5

From 0a8f627cce98be2c4469309b5a54b45b97930b63 Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue, 1 Oct 2024 10:40:36 +0200
Subject: [PATCH 118/122] chore(deps): Bump gradio from 4.38.1 to 4.44.1 in
 /backend/python/openvoice (#3701)

chore(deps): Bump gradio in /backend/python/openvoice

Bumps [gradio](https://github.com/gradio-app/gradio) from 4.38.1 to 4.44.1.
- [Release notes](https://github.com/gradio-app/gradio/releases)
- [Changelog](https://github.com/gradio-app/gradio/blob/main/CHANGELOG.md)
- [Commits](https://github.com/gradio-app/gradio/compare/gradio@4.38.1...gradio@4.44.1)

---
updated-dependencies:
- dependency-name: gradio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 backend/python/openvoice/requirements-intel.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/backend/python/openvoice/requirements-intel.txt b/backend/python/openvoice/requirements-intel.txt
index c568dab1..687efe78 100644
--- a/backend/python/openvoice/requirements-intel.txt
+++ b/backend/python/openvoice/requirements-intel.txt
@@ -18,6 +18,6 @@ python-dotenv
 pypinyin==0.50.0
 cn2an==0.5.22
 jieba==0.42.1
-gradio==4.38.1
+gradio==4.44.1
 langid==1.1.6
 git+https://github.com/myshell-ai/MeloTTS.git

From 2649407f44cf7c1c822fb671c6501ec899d1fc6f Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue, 1 Oct 2024 10:40:49 +0200
Subject: [PATCH 119/122] chore(deps): Bump llama-index from 0.11.12 to
 0.11.14 in /examples/langchain-chroma (#3695)

chore(deps): Bump llama-index in /examples/langchain-chroma

Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.11.12 to 0.11.14.
- [Release notes](https://github.com/run-llama/llama_index/releases)
- [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md)
- [Commits](https://github.com/run-llama/llama_index/compare/v0.11.12...v0.11.14)

---
updated-dependencies:
- dependency-name: llama-index
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 examples/langchain-chroma/requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt
index fda5f9d8..d84311b3 100644
--- a/examples/langchain-chroma/requirements.txt
+++ b/examples/langchain-chroma/requirements.txt
@@ -1,4 +1,4 @@
 langchain==0.3.1
 openai==1.50.2
 chromadb==0.5.11
-llama-index==0.11.12
\ No newline at end of file
+llama-index==0.11.14
\ No newline at end of file

From 53f406dc35485df76450f65ba11ba548cb86f196 Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue, 1 Oct 2024 10:41:04 +0200
Subject: [PATCH 120/122] chore(deps): Bump aiohttp from 3.10.3 to 3.10.8 in
 /examples/langchain/langchainpy-localai-example (#3705)

chore(deps): Bump aiohttp

Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.10.3 to 3.10.8.
- [Release notes](https://github.com/aio-libs/aiohttp/releases)
- [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst)
- [Commits](https://github.com/aio-libs/aiohttp/compare/v3.10.3...v3.10.8)

---
updated-dependencies:
- dependency-name: aiohttp
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 examples/langchain/langchainpy-localai-example/requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt
index b5f3960e..53812966 100644
--- a/examples/langchain/langchainpy-localai-example/requirements.txt
+++ b/examples/langchain/langchainpy-localai-example/requirements.txt
@@ -1,4 +1,4 @@
-aiohttp==3.10.3
+aiohttp==3.10.8
 aiosignal==1.3.1
 async-timeout==4.0.3
 attrs==24.2.0

From a30058b80f1b23407188a689ec514385ccfa63f9 Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue, 1 Oct 2024 10:41:16 +0200
Subject: [PATCH 121/122] chore(deps): Bump yarl from 1.11.1 to 1.13.1 in
 /examples/langchain/langchainpy-localai-example (#3706)

chore(deps): Bump yarl

Bumps [yarl](https://github.com/aio-libs/yarl) from 1.11.1 to 1.13.1.
- [Release notes](https://github.com/aio-libs/yarl/releases)
- [Changelog](https://github.com/aio-libs/yarl/blob/master/CHANGES.rst)
- [Commits](https://github.com/aio-libs/yarl/compare/v1.11.1...v1.13.1)

---
updated-dependencies:
- dependency-name: yarl
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 examples/langchain/langchainpy-localai-example/requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt
index 53812966..1d48dee8 100644
--- a/examples/langchain/langchainpy-localai-example/requirements.txt
+++ b/examples/langchain/langchainpy-localai-example/requirements.txt
@@ -30,4 +30,4 @@ tqdm==4.66.5
 typing-inspect==0.9.0
 typing_extensions==4.12.2
 urllib3==2.2.3
-yarl==1.11.1
+yarl==1.13.1

From 139209353f74100e495471dfbc41f9900a2212fd Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue, 1 Oct 2024 10:41:30 +0200
Subject: [PATCH 122/122] chore(deps): Bump llama-index from 0.11.12 to
 0.11.14 in /examples/chainlit (#3707)

chore(deps): Bump llama-index in /examples/chainlit

Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.11.12 to 0.11.14.
- [Release notes](https://github.com/run-llama/llama_index/releases)
- [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md)
- [Commits](https://github.com/run-llama/llama_index/compare/v0.11.12...v0.11.14)

---
updated-dependencies:
- dependency-name: llama-index
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
 examples/chainlit/requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/chainlit/requirements.txt b/examples/chainlit/requirements.txt
index 92eb113e..ee6c63ac 100644
--- a/examples/chainlit/requirements.txt
+++ b/examples/chainlit/requirements.txt
@@ -1,4 +1,4 @@
-llama_index==0.11.12
+llama_index==0.11.14
 requests==2.32.3
 weaviate_client==4.8.1
 transformers