Mirror of https://github.com/mudler/LocalAI.git (synced 2025-02-22 10:00:47 +00:00)
feat(conda): conda environments (#1144)
* feat(autogptq): add a separate conda environment for autogptq (#1137)

  **Description** This PR relates to #1117.

  **Notes for Reviewers** Here we pin the versions of the dependencies, so the backend keeps working even if upstream packages are upgraded. I changed the order of the imports to satisfy pylint; the logic of the code is unchanged. I will investigate writing test cases for every backend: I can run the service in my environment, but there is currently no way to test it, so I am not confident about it. A README.md is added at the `grpc` root with the common commands for creating a `conda` environment; it can serve as the reference for documenting additional gRPC backends.

  Signed-off-by: GitHub <noreply@github.com>
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* [Extra backend] Add separate environment for ttsbark (#1141)

  **Description** This PR relates to #1117.

  **Notes for Reviewers** Same as the previous PR:
  * The code is changed only in the order of the imports, and some code comments are added.
  * Add a configuration for the `conda` environment.
  * Add a simple test case that checks whether the service can start up in the current `conda` environment (a minimal sketch follows below). It succeeds in VSCode, but it does not work out of the box from the terminal, so it is hard to say how useful the test case really is.

  **[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
  - [x] Yes, I signed my commits.

  Signed-off-by: GitHub <noreply@github.com>
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
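  For reference, this is roughly what that startup check boils down to. A minimal sketch only, assuming the generated `backend_pb2`/`backend_pb2_grpc` stubs are importable and that the backend listens on `localhost:50051`, as the servers in this PR do:

  ```
  import subprocess
  import time
  import unittest

  import grpc

  import backend_pb2
  import backend_pb2_grpc


  class TestBackendStartup(unittest.TestCase):
      def test_server_startup(self):
          # Launch the backend the same way the conda run.sh wrapper would.
          service = subprocess.Popen(["python3", "ttsbark.py", "--addr", "localhost:50051"])
          try:
              time.sleep(2)  # give the gRPC server a moment to bind the port
              with grpc.insecure_channel("localhost:50051") as channel:
                  stub = backend_pb2_grpc.BackendStub(channel)
                  reply = stub.Health(backend_pb2.HealthMessage())
              self.assertEqual(reply.message, b"OK")
          finally:
              service.terminate()
              service.wait()
  ```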
* feat(conda): add make target and entrypoints for the dockerfile

  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(conda): Add separate conda env for diffusers (#1145)

  **Description** This PR relates to #1117.

  **Notes for Reviewers**
  * Add the `conda` env `diffusers.yml`
  * Add a Makefile to create it automatically
  * Add `run.sh` to support running it as an extra backend
  * Also add it to the main Dockerfile
  * Add a make command to the root Makefile
  * Tested the server: it starts up under the env

  Signed-off-by: GitHub <noreply@github.com>
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(conda): Add separate env for vllm (#1148)

  **Description** This PR relates to #1117.

  **Notes for Reviewers**
  * The gRPC server can be started as normal
  * The test case can be triggered in VSCode
  * As in the other PRs of this kind: add `vllm.yml` and a Makefile, add `run.sh` to the main Dockerfile, and add the command to the main Makefile

  **[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
  - [x] Yes, I signed my commits.

  Signed-off-by: GitHub <noreply@github.com>
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(conda): Add separate env for huggingface (#1146)

  **Description** This PR relates to #1117.

  **Notes for Reviewers**
  * Add the conda env `huggingface.yml`
  * Change the import order, and also remove unused packages
  * Add `run.sh` and a make command to the main Dockerfile and Makefile
  * Add test cases. They can be triggered and succeed under the VSCode Python extension, but they hang when run with `python -m unittest test_huggingface.py` in the terminal:

  ```
  Running tests (unittest): /workspaces/LocalAI/extra/grpc/huggingface
  Running tests: /workspaces/LocalAI/extra/grpc/huggingface/test_huggingface.py::TestBackendServicer::test_embedding
    /workspaces/LocalAI/extra/grpc/huggingface/test_huggingface.py::TestBackendServicer::test_load_model
    /workspaces/LocalAI/extra/grpc/huggingface/test_huggingface.py::TestBackendServicer::test_server_startup
  ./test_huggingface.py::TestBackendServicer::test_embedding Passed
  ./test_huggingface.py::TestBackendServicer::test_load_model Passed
  ./test_huggingface.py::TestBackendServicer::test_server_startup Passed
  Total number of tests expected to run: 3
  Total number of tests run: 3
  Total number of tests passed: 3
  Total number of tests failed: 0
  Total number of tests failed with errors: 0
  Total number of tests skipped: 0
  Finished running tests!
  ```
  **[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
  - [x] Yes, I signed my commits.

  Signed-off-by: GitHub <noreply@github.com>
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(conda): Add the separate conda env for VALL-E X (#1147)

  **Description** This PR relates to #1117.

  **Notes for Reviewers**
  * The gRPC server cannot start up (one possible path workaround is sketched after this entry):

  ```
  (ttsvalle) @Aisuko ➜ /workspaces/LocalAI (feat/vall-e-x) $ /opt/conda/envs/ttsvalle/bin/python /workspaces/LocalAI/extra/grpc/vall-e-x/ttsvalle.py
  Traceback (most recent call last):
    File "/workspaces/LocalAI/extra/grpc/vall-e-x/ttsvalle.py", line 14, in <module>
      from utils.generation import SAMPLE_RATE, generate_audio, preload_models
  ModuleNotFoundError: No module named 'utils'
  ```

  The installation steps follow https://github.com/Plachtaa/VALL-E-X#-installation, under the `ttsvalle` conda env:

  ```
  git clone https://github.com/Plachtaa/VALL-E-X.git
  cd VALL-E-X
  pip install -r requirements.txt
  ```

  **[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
  - [x] Yes, I signed my commits.

  Signed-off-by: GitHub <noreply@github.com>
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
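  A note on the `ModuleNotFoundError` above: VALL-E-X is consumed as a plain git checkout rather than an installed package, so its top-level `utils` package is only importable when the checkout directory is on `sys.path` or is the working directory (which is what the Dockerfile arranges by copying `/usr/lib/vall-e-x/*` into the image). A hypothetical sketch of that kind of path fix, assuming the checkout lives at `/usr/lib/vall-e-x`:

  ```
  import os
  import sys

  # Hypothetical location of the VALL-E-X checkout; matches the path cloned in the Dockerfile.
  VALL_E_X_DIR = os.environ.get("VALL_E_X_DIR", "/usr/lib/vall-e-x")
  sys.path.insert(0, VALL_E_X_DIR)

  # With the checkout on sys.path, the import from the traceback resolves.
  from utils.generation import SAMPLE_RATE, generate_audio, preload_models
  ```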
* fix: set image type

  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(conda): Add separate conda env for exllama (#1149)

  Add a separate env for exllama.

  Signed-off-by: Aisuko <urakiny@gmail.com>
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Setup conda
* Set image_type arg
* ci: prepare only conda env in tests
* Dockerfile: comment manual pip calls
* conda: add conda to PATH
* fixes
* add shebang
* Fixups
* file perms
* debug
* Install new conda in the worker
* Disable GPU tests for now until the worker is back
* Rename workflows
* debug
* Fixup conda install
* fixup(wrapper): pass args

---------

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Aisuko <urakiny@gmail.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Aisuko <urakiny@gmail.com>
This commit is contained in:
parent: 9b17af18b3
commit: f347e51927
.github/workflows/image.yml (vendored): 136 changes

@@ -14,7 +14,7 @@ concurrency:
   cancel-in-progress: true
 
 jobs:
-  docker:
+  image-build:
     strategy:
       matrix:
         include:
@@ -29,98 +29,6 @@ jobs:
           tag-latest: 'false'
           tag-suffix: '-ffmpeg'
           ffmpeg: 'true'
-
-    runs-on: ubuntu-latest
-    steps:
-      - name: Checkout
-        uses: actions/checkout@v4
-      - name: Release space from worker
-        run: |
-          echo "Listing top largest packages"
-          pkgs=$(dpkg-query -Wf '${Installed-Size}\t${Package}\t${Status}\n' | awk '$NF == "installed"{print $1 "\t" $2}' | sort -nr)
-          head -n 30 <<< "${pkgs}"
-          echo
-          df -h
-          echo
-          sudo apt-get remove -y '^llvm-.*|^libllvm.*' || true
-          sudo apt-get remove --auto-remove android-sdk-platform-tools || true
-          sudo apt-get purge --auto-remove android-sdk-platform-tools || true
-          sudo rm -rf /usr/local/lib/android
-          sudo apt-get remove -y '^dotnet-.*|^aspnetcore-.*' || true
-          sudo rm -rf /usr/share/dotnet
-          sudo apt-get remove -y '^mono-.*' || true
-          sudo apt-get remove -y '^ghc-.*' || true
-          sudo apt-get remove -y '.*jdk.*|.*jre.*' || true
-          sudo apt-get remove -y 'php.*' || true
-          sudo apt-get remove -y hhvm powershell firefox monodoc-manual msbuild || true
-          sudo apt-get remove -y '^google-.*' || true
-          sudo apt-get remove -y azure-cli || true
-          sudo apt-get remove -y '^mongo.*-.*|^postgresql-.*|^mysql-.*|^mssql-.*' || true
-          sudo apt-get remove -y '^gfortran-.*' || true
-          sudo apt-get remove -y microsoft-edge-stable || true
-          sudo apt-get remove -y firefox || true
-          sudo apt-get remove -y powershell || true
-          sudo apt-get remove -y r-base-core || true
-          sudo apt-get autoremove -y
-          sudo apt-get clean
-          echo
-          echo "Listing top largest packages"
-          pkgs=$(dpkg-query -Wf '${Installed-Size}\t${Package}\t${Status}\n' | awk '$NF == "installed"{print $1 "\t" $2}' | sort -nr)
-          head -n 30 <<< "${pkgs}"
-          echo
-          sudo rm -rfv build || true
-          df -h
-      - name: Docker meta
-        id: meta
-        uses: docker/metadata-action@v5
-        with:
-          images: quay.io/go-skynet/local-ai
-          tags: |
-            type=ref,event=branch
-            type=semver,pattern={{raw}}
-            type=sha
-          flavor: |
-            latest=${{ matrix.tag-latest }}
-            suffix=${{ matrix.tag-suffix }}
-
-      - name: Set up QEMU
-        uses: docker/setup-qemu-action@master
-        with:
-          platforms: all
-
-      - name: Set up Docker Buildx
-        id: buildx
-        uses: docker/setup-buildx-action@master
-
-      - name: Login to DockerHub
-        if: github.event_name != 'pull_request'
-        uses: docker/login-action@v3
-        with:
-          registry: quay.io
-          username: ${{ secrets.LOCALAI_REGISTRY_USERNAME }}
-          password: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }}
-
-      - name: Build and push
-        uses: docker/build-push-action@v5
-        with:
-          builder: ${{ steps.buildx.outputs.name }}
-          build-args: |
-            BUILD_TYPE=${{ matrix.build-type }}
-            CUDA_MAJOR_VERSION=${{ matrix.cuda-major-version }}
-            CUDA_MINOR_VERSION=${{ matrix.cuda-minor-version }}
-            FFMPEG=${{ matrix.ffmpeg }}
-          context: .
-          file: ./Dockerfile
-          platforms: ${{ matrix.platforms }}
-          push: ${{ github.event_name != 'pull_request' }}
-          tags: ${{ steps.meta.outputs.tags }}
-          labels: ${{ steps.meta.outputs.labels }}
-
-
-  docker-gpu:
-    strategy:
-      matrix:
-        include:
         - build-type: 'cublas'
           cuda-major-version: 11
           cuda-minor-version: 7
@@ -162,7 +70,42 @@ jobs:
           && sudo apt-get install -y git
       - name: Checkout
         uses: actions/checkout@v4
+      # - name: Release space from worker
+      #   run: |
+      #     echo "Listing top largest packages"
+      #     pkgs=$(dpkg-query -Wf '${Installed-Size}\t${Package}\t${Status}\n' | awk '$NF == "installed"{print $1 "\t" $2}' | sort -nr)
+      #     head -n 30 <<< "${pkgs}"
+      #     echo
+      #     df -h
+      #     echo
+      #     sudo apt-get remove -y '^llvm-.*|^libllvm.*' || true
+      #     sudo apt-get remove --auto-remove android-sdk-platform-tools || true
+      #     sudo apt-get purge --auto-remove android-sdk-platform-tools || true
+      #     sudo rm -rf /usr/local/lib/android
+      #     sudo apt-get remove -y '^dotnet-.*|^aspnetcore-.*' || true
+      #     sudo rm -rf /usr/share/dotnet
+      #     sudo apt-get remove -y '^mono-.*' || true
+      #     sudo apt-get remove -y '^ghc-.*' || true
+      #     sudo apt-get remove -y '.*jdk.*|.*jre.*' || true
+      #     sudo apt-get remove -y 'php.*' || true
+      #     sudo apt-get remove -y hhvm powershell firefox monodoc-manual msbuild || true
+      #     sudo apt-get remove -y '^google-.*' || true
+      #     sudo apt-get remove -y azure-cli || true
+      #     sudo apt-get remove -y '^mongo.*-.*|^postgresql-.*|^mysql-.*|^mssql-.*' || true
+      #     sudo apt-get remove -y '^gfortran-.*' || true
+      #     sudo apt-get remove -y microsoft-edge-stable || true
+      #     sudo apt-get remove -y firefox || true
+      #     sudo apt-get remove -y powershell || true
+      #     sudo apt-get remove -y r-base-core || true
+      #     sudo apt-get autoremove -y
+      #     sudo apt-get clean
+      #     echo
+      #     echo "Listing top largest packages"
+      #     pkgs=$(dpkg-query -Wf '${Installed-Size}\t${Package}\t${Status}\n' | awk '$NF == "installed"{print $1 "\t" $2}' | sort -nr)
+      #     head -n 30 <<< "${pkgs}"
+      #     echo
+      #     sudo rm -rfv build || true
+      #     df -h
       - name: Docker meta
         id: meta
         uses: docker/metadata-action@v5
@@ -192,6 +135,7 @@ jobs:
           registry: quay.io
           username: ${{ secrets.LOCALAI_REGISTRY_USERNAME }}
           password: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }}
 
       - name: Build and push
         uses: docker/build-push-action@v5
         with:
@@ -207,7 +151,3 @@ jobs:
           push: ${{ github.event_name != 'pull_request' }}
           tags: ${{ steps.meta.outputs.tags }}
           labels: ${{ steps.meta.outputs.labels }}
-      - name: Release space from worker ♻
-        if: always()
-        run: |
-          docker system prune -f -a --volumes || true
.github/workflows/test.yml (vendored): 16 changes

@@ -14,7 +14,7 @@ concurrency:
   cancel-in-progress: true
 
 jobs:
-  ubuntu-latest:
+  tests-linux:
     runs-on: ubuntu-latest
     strategy:
       matrix:
@@ -67,11 +67,18 @@ jobs:
         run: |
          sudo apt-get update
          sudo apt-get install build-essential ffmpeg
+          curl https://repo.anaconda.com/pkgs/misc/gpgkeys/anaconda.asc | gpg --dearmor > conda.gpg && \
+          sudo install -o root -g root -m 644 conda.gpg /usr/share/keyrings/conda-archive-keyring.gpg && \
+          gpg --keyring /usr/share/keyrings/conda-archive-keyring.gpg --no-default-keyring --fingerprint 34161F5BF5EB1D4BFBBB8F0A8AEB4F8B29D82806 && \
+          sudo /bin/bash -c 'echo "deb [arch=amd64 signed-by=/usr/share/keyrings/conda-archive-keyring.gpg] https://repo.anaconda.com/pkgs/misc/debrepo/conda stable main" > /etc/apt/sources.list.d/conda.list' && \
+          sudo /bin/bash -c 'echo "deb [arch=amd64 signed-by=/usr/share/keyrings/conda-archive-keyring.gpg] https://repo.anaconda.com/pkgs/misc/debrepo/conda stable main" | tee -a /etc/apt/sources.list.d/conda.list' && \
+          sudo apt-get update && \
+          sudo apt-get install -y conda
          sudo apt-get install -y ca-certificates cmake curl patch
          sudo apt-get install -y libopencv-dev && sudo ln -s /usr/include/opencv4/opencv2 /usr/include/opencv2
-          sudo pip install -r extra/requirements.txt
 
+          sudo rm -rfv /usr/bin/conda || true
+          PATH=$PATH:/opt/conda/bin make -C extra/grpc/huggingface
 
          # Pre-build stable diffusion before we install a newever version of abseil (not compatible with stablediffusion-ncn)
          GO_TAGS="tts stablediffusion" GRPC_BACKENDS=backend-assets/grpc/stablediffusion make build
@@ -96,12 +103,11 @@ jobs:
          cd grpc && mkdir -p cmake/build && cd cmake/build && cmake -DgRPC_INSTALL=ON \
            -DgRPC_BUILD_TESTS=OFF \
            ../.. && sudo make -j12 install
 
       - name: Test
         run: |
          ESPEAK_DATA="/build/lib/Linux-$(uname -m)/piper_phonemize/lib/espeak-ng-data" GO_TAGS="tts stablediffusion" make test
 
-  macOS-latest:
+  tests-apple:
     runs-on: macOS-latest
     strategy:
       matrix:
Dockerfile: 29 changes

@@ -14,7 +14,7 @@ ARG TARGETARCH
 ARG TARGETVARIANT
 
 ENV BUILD_TYPE=${BUILD_TYPE}
-ENV EXTERNAL_GRPC_BACKENDS="huggingface-embeddings:/build/extra/grpc/huggingface/huggingface.py,autogptq:/build/extra/grpc/autogptq/autogptq.py,bark:/build/extra/grpc/bark/ttsbark.py,diffusers:/build/extra/grpc/diffusers/backend_diffusers.py,exllama:/build/extra/grpc/exllama/exllama.py,vall-e-x:/build/extra/grpc/vall-e-x/ttsvalle.py,vllm:/build/extra/grpc/vllm/backend_vllm.py"
+ENV EXTERNAL_GRPC_BACKENDS="huggingface-embeddings:/build/extra/grpc/huggingface/run.sh,autogptq:/build/extra/grpc/autogptq/run.sh,bark:/build/extra/grpc/bark/run.sh,diffusers:/build/extra/grpc/diffusers/run.sh,exllama:/build/extra/grpc/exllama/run.sh,vall-e-x:/build/extra/grpc/vall-e-x/run.sh,vllm:/build/extra/grpc/vllm/run.sh"
 ENV GALLERIES='[{"name":"model-gallery", "url":"github:go-skynet/model-gallery/index.yaml"}, {"url": "github:go-skynet/model-gallery/huggingface.yaml","name":"huggingface"}]'
 ARG GO_TAGS="stablediffusion tts"
 
@@ -77,17 +77,25 @@ RUN curl -L "https://github.com/gabime/spdlog/archive/refs/tags/v${SPDLOG_VERSIO
 # Extras requirements
 FROM requirements-core as requirements-extras
 
+RUN curl https://repo.anaconda.com/pkgs/misc/gpgkeys/anaconda.asc | gpg --dearmor > conda.gpg && \
+    install -o root -g root -m 644 conda.gpg /usr/share/keyrings/conda-archive-keyring.gpg && \
+    gpg --keyring /usr/share/keyrings/conda-archive-keyring.gpg --no-default-keyring --fingerprint 34161F5BF5EB1D4BFBBB8F0A8AEB4F8B29D82806 && \
+    echo "deb [arch=amd64 signed-by=/usr/share/keyrings/conda-archive-keyring.gpg] https://repo.anaconda.com/pkgs/misc/debrepo/conda stable main" > /etc/apt/sources.list.d/conda.list && \
+    echo "deb [arch=amd64 signed-by=/usr/share/keyrings/conda-archive-keyring.gpg] https://repo.anaconda.com/pkgs/misc/debrepo/conda stable main" | tee -a /etc/apt/sources.list.d/conda.list && \
+    apt-get update && \
+    apt-get install -y conda
+
 COPY extra/requirements.txt /build/extra/requirements.txt
 ENV PATH="/root/.cargo/bin:${PATH}"
 RUN pip install --upgrade pip
 RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
-RUN if [ "${TARGETARCH}" = "amd64" ]; then \
-    pip install git+https://github.com/suno-ai/bark.git diffusers invisible_watermark transformers accelerate safetensors;\
-    fi
-RUN if [ "${BUILD_TYPE}" = "cublas" ] && [ "${TARGETARCH}" = "amd64" ]; then \
-    pip install torch vllm && pip install auto-gptq https://github.com/jllllll/exllama/releases/download/0.0.10/exllama-0.0.10+cu${CUDA_MAJOR_VERSION}${CUDA_MINOR_VERSION}-cp39-cp39-linux_x86_64.whl;\
-    fi
-RUN pip install -r /build/extra/requirements.txt && rm -rf /build/extra/requirements.txt
+#RUN if [ "${TARGETARCH}" = "amd64" ]; then \
+#    pip install git+https://github.com/suno-ai/bark.git diffusers invisible_watermark transformers accelerate safetensors;\
+#    fi
+#RUN if [ "${BUILD_TYPE}" = "cublas" ] && [ "${TARGETARCH}" = "amd64" ]; then \
+#    pip install torch vllm && pip install auto-gptq https://github.com/jllllll/exllama/releases/download/0.0.10/exllama-0.0.10+cu${CUDA_MAJOR_VERSION}${CUDA_MINOR_VERSION}-cp39-cp39-linux_x86_64.whl;\
+#    fi
+#RUN pip install -r /build/extra/requirements.txt && rm -rf /build/extra/requirements.txt
 
 # Vall-e-X
 RUN git clone https://github.com/Plachtaa/VALL-E-X.git /usr/lib/vall-e-x && cd /usr/lib/vall-e-x && pip install -r requirements.txt
@@ -139,6 +147,7 @@ FROM requirements-${IMAGE_TYPE}
 ARG FFMPEG
 ARG BUILD_TYPE
 ARG TARGETARCH
+ARG IMAGE_TYPE=extras
 
 ENV BUILD_TYPE=${BUILD_TYPE}
 ENV REBUILD=false
@@ -169,6 +178,10 @@ COPY --from=builder /build/local-ai ./
 # do not let stablediffusion rebuild (requires an older version of absl)
 COPY --from=builder /build/backend-assets/grpc/stablediffusion ./backend-assets/grpc/stablediffusion
 
+RUN if [ "${IMAGE_TYPE}" = "extras" ]; then \
+    PATH=$PATH:/opt/conda/bin make prepare-extra-conda-environments \
+    ; fi
+
 # Copy VALLE-X as it's not a real "lib"
 RUN if [ -d /usr/lib/vall-e-x ]; then \
     cp -rfv /usr/lib/vall-e-x/* ./ ; \
Makefile: 26 changes

@@ -290,12 +290,12 @@ run: prepare ## run local-ai
 test-models/testmodel:
 	mkdir test-models
 	mkdir test-dir
-	wget https://huggingface.co/nnakasato/ggml-model-test/resolve/main/ggml-model-q4.bin -O test-models/testmodel
-	wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin -O test-models/whisper-en
-	wget https://huggingface.co/mudler/all-MiniLM-L6-v2/resolve/main/ggml-model-q4_0.bin -O test-models/bert
-	wget https://cdn.openai.com/whisper/draft-20220913a/micro-machines.wav -O test-dir/audio.wav
-	wget https://huggingface.co/mudler/rwkv-4-raven-1.5B-ggml/resolve/main/RWKV-4-Raven-1B5-v11-Eng99%2525-Other1%2525-20230425-ctx4096_Q4_0.bin -O test-models/rwkv
-	wget https://raw.githubusercontent.com/saharNooby/rwkv.cpp/5eb8f09c146ea8124633ab041d9ea0b1f1db4459/rwkv/20B_tokenizer.json -O test-models/rwkv.tokenizer.json
+	wget -q https://huggingface.co/nnakasato/ggml-model-test/resolve/main/ggml-model-q4.bin -O test-models/testmodel
+	wget -q https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin -O test-models/whisper-en
+	wget -q https://huggingface.co/mudler/all-MiniLM-L6-v2/resolve/main/ggml-model-q4_0.bin -O test-models/bert
+	wget -q https://cdn.openai.com/whisper/draft-20220913a/micro-machines.wav -O test-dir/audio.wav
+	wget -q https://huggingface.co/mudler/rwkv-4-raven-1.5B-ggml/resolve/main/RWKV-4-Raven-1B5-v11-Eng99%2525-Other1%2525-20230425-ctx4096_Q4_0.bin -O test-models/rwkv
+	wget -q https://raw.githubusercontent.com/saharNooby/rwkv.cpp/5eb8f09c146ea8124633ab041d9ea0b1f1db4459/rwkv/20B_tokenizer.json -O test-models/rwkv.tokenizer.json
 	cp tests/models_fixtures/* test-models
 
 prepare-test: grpcs
@@ -306,8 +306,8 @@ test: prepare test-models/testmodel grpcs
 	@echo 'Running tests'
 	export GO_TAGS="tts stablediffusion"
 	$(MAKE) prepare-test
-	HUGGINGFACE_GRPC=$(abspath ./)/extra/grpc/huggingface/huggingface.py TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models \
-	$(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="!gpt4all && !llama && !llama-gguf" --flake-attempts 5 -v -r ./api ./pkg
+	HUGGINGFACE_GRPC=$(abspath ./)/extra/grpc/huggingface/run.sh TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models \
+	$(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="!gpt4all && !llama && !llama-gguf" --fail-fast -v -r ./api ./pkg
 	$(MAKE) test-gpt4all
 	$(MAKE) test-llama
 	$(MAKE) test-llama-gguf
@@ -387,6 +387,16 @@ protogen-python:
 
 ## GRPC
 
+prepare-extra-conda-environments:
+	$(MAKE) -C extra/grpc/autogptq
+	$(MAKE) -C extra/grpc/bark
+	$(MAKE) -C extra/grpc/diffusers
+	$(MAKE) -C extra/grpc/vllm
+	$(MAKE) -C extra/grpc/huggingface
+	$(MAKE) -C extra/grpc/vall-e-x
+	$(MAKE) -C extra/grpc/exllama
+
+
 backend-assets/grpc:
 	mkdir -p backend-assets/grpc
 
extra/grpc/README.md (new file): 38 lines

@@ -0,0 +1,38 @@
+# Common commands about conda environment
+
+## Create a new empty conda environment
+
+```
+conda create --name <env-name> python=<your version> -y
+
+conda create --name autogptq python=3.11 -y
+```
+
+## To activate the environment
+
+As of conda 4.4
+```
+conda activate autogptq
+```
+
+The conda version older than 4.4
+
+```
+source activate autogptq
+```
+
+## Install the packages to your environment
+
+Sometimes you need to install the packages from the conda-forge channel
+
+By using `conda`
+```
+conda install <your-package-name>
+
+conda install -c conda-forge <your package-name>
+```
+
+Or by using `pip`
+```
+pip install <your-package-name>
+```
extra/grpc/autogptq/Makefile (new file): 5 lines

@@ -0,0 +1,5 @@
+.PONY: autogptq
+autogptq:
+	@echo "Creating virtual environment..."
+	@conda env create --name autogptq --file autogptq.yml
+	@echo "Virtual environment created."
extra/grpc/autogptq/README.md (new file): 5 lines

@@ -0,0 +1,5 @@
+# Creating a separate environment for the autogptq project
+
+```
+make autogptq
+```
extra/grpc/autogptq/autogptq.py (changed)

@@ -1,15 +1,15 @@
 #!/usr/bin/env python3
-import grpc
 from concurrent import futures
-import time
-import backend_pb2
-import backend_pb2_grpc
 import argparse
 import signal
 import sys
 import os
-from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
-from pathlib import Path
+import time
+
+import grpc
+import backend_pb2
+import backend_pb2_grpc
+from auto_gptq import AutoGPTQForCausalLM
 from transformers import AutoTokenizer
 from transformers import TextGenerationPipeline
 
extra/grpc/autogptq/autogptq.yml (new file): 86 lines

@@ -0,0 +1,86 @@
+name: autogptq
+channels:
+  - defaults
+dependencies:
+  - _libgcc_mutex=0.1=main
+  - _openmp_mutex=5.1=1_gnu
+  - bzip2=1.0.8=h7b6447c_0
+  - ca-certificates=2023.08.22=h06a4308_0
+  - ld_impl_linux-64=2.38=h1181459_1
+  - libffi=3.4.4=h6a678d5_0
+  - libgcc-ng=11.2.0=h1234567_1
+  - libgomp=11.2.0=h1234567_1
+  - libstdcxx-ng=11.2.0=h1234567_1
+  - libuuid=1.41.5=h5eee18b_0
+  - ncurses=6.4=h6a678d5_0
+  - openssl=3.0.11=h7f8727e_2
+  - pip=23.2.1=py311h06a4308_0
+  - python=3.11.5=h955ad1f_0
+  - readline=8.2=h5eee18b_0
+  - setuptools=68.0.0=py311h06a4308_0
+  - sqlite=3.41.2=h5eee18b_0
+  - tk=8.6.12=h1ccaba5_0
+  - wheel=0.41.2=py311h06a4308_0
+  - xz=5.4.2=h5eee18b_0
+  - zlib=1.2.13=h5eee18b_0
+  - pip:
+    - accelerate==0.23.0
+    - aiohttp==3.8.5
+    - aiosignal==1.3.1
+    - async-timeout==4.0.3
+    - attrs==23.1.0
+    - auto-gptq==0.4.2
+    - certifi==2023.7.22
+    - charset-normalizer==3.3.0
+    - datasets==2.14.5
+    - dill==0.3.7
+    - filelock==3.12.4
+    - frozenlist==1.4.0
+    - fsspec==2023.6.0
+    - grpcio==1.59.0
+    - huggingface-hub==0.16.4
+    - idna==3.4
+    - jinja2==3.1.2
+    - markupsafe==2.1.3
+    - mpmath==1.3.0
+    - multidict==6.0.4
+    - multiprocess==0.70.15
+    - networkx==3.1
+    - numpy==1.26.0
+    - nvidia-cublas-cu12==12.1.3.1
+    - nvidia-cuda-cupti-cu12==12.1.105
+    - nvidia-cuda-nvrtc-cu12==12.1.105
+    - nvidia-cuda-runtime-cu12==12.1.105
+    - nvidia-cudnn-cu12==8.9.2.26
+    - nvidia-cufft-cu12==11.0.2.54
+    - nvidia-curand-cu12==10.3.2.106
+    - nvidia-cusolver-cu12==11.4.5.107
+    - nvidia-cusparse-cu12==12.1.0.106
+    - nvidia-nccl-cu12==2.18.1
+    - nvidia-nvjitlink-cu12==12.2.140
+    - nvidia-nvtx-cu12==12.1.105
+    - packaging==23.2
+    - pandas==2.1.1
+    - peft==0.5.0
+    - protobuf==4.24.4
+    - psutil==5.9.5
+    - pyarrow==13.0.0
+    - python-dateutil==2.8.2
+    - pytz==2023.3.post1
+    - pyyaml==6.0.1
+    - regex==2023.10.3
+    - requests==2.31.0
+    - rouge==1.0.1
+    - safetensors==0.3.3
+    - six==1.16.0
+    - sympy==1.12
+    - tokenizers==0.14.0
+    - torch==2.1.0
+    - tqdm==4.66.1
+    - transformers==4.34.0
+    - triton==2.1.0
+    - typing-extensions==4.8.0
+    - tzdata==2023.3
+    - urllib3==2.0.6
+    - xxhash==3.4.1
+    - yarl==1.9.2
extra/grpc/autogptq/run.sh (new executable file): 14 lines

@@ -0,0 +1,14 @@
+#!/bin/bash
+
+##
+## A bash script wrapper that runs the autogptq server with conda
+
+export PATH=$PATH:/opt/conda/bin
+
+# Activate conda environment
+source activate autogptq
+
+# get the directory where the bash script is located
+DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
+
+python $DIR/autogptq.py $@
extra/grpc/bark/Makefile (new file): 5 lines

@@ -0,0 +1,5 @@
+.PONY: ttsbark
+ttsbark:
+	@echo "Creating virtual environment..."
+	@conda env create --name ttsbark --file ttsbark.yml
+	@echo "Virtual environment created."
extra/grpc/bark/README.md (new file): 16 lines

@@ -0,0 +1,16 @@
+# Creating a separate environment for ttsbark project
+
+```
+make ttsbark
+```
+
+# Testing the gRPC server
+
+```
+<The path of your python interpreter> -m unittest test_ttsbark.py
+```
+
+For example
+```
+/opt/conda/envs/bark/bin/python -m unittest extra/grpc/bark/test_ttsbark.py
+``````
extra/grpc/bark/run.sh (new executable file): 14 lines

@@ -0,0 +1,14 @@
+#!/bin/bash
+
+##
+## A bash script wrapper that runs the ttsbark server with conda
+
+export PATH=$PATH:/opt/conda/bin
+
+# Activate conda environment
+source activate ttsbark
+
+# get the directory where the bash script is located
+DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
+
+python $DIR/ttsbark.py $@
extra/grpc/bark/test_ttsbark.py (new file): 32 lines

@@ -0,0 +1,32 @@
+import unittest
+import subprocess
+import time
+import backend_pb2
+import backend_pb2_grpc
+
+import grpc
+
+class TestBackendServicer(unittest.TestCase):
+    """
+    TestBackendServicer is the class that tests the gRPC service
+    """
+    def setUp(self):
+        self.service = subprocess.Popen(["python3", "ttsbark.py", "--addr", "localhost:50051"])
+
+    def tearDown(self) -> None:
+        self.service.terminate()
+        self.service.wait()
+
+    def test_server_startup(self):
+        time.sleep(2)
+        try:
+            self.setUp()
+            with grpc.insecure_channel("localhost:50051") as channel:
+                stub = backend_pb2_grpc.BackendStub(channel)
+                response = stub.Health(backend_pb2.HealthMessage())
+                self.assertEqual(response.message, b'OK')
+        except Exception as err:
+            print(err)
+            self.fail("Server failed to start")
+        finally:
+            self.tearDown()
extra/grpc/bark/ttsbark.py (changed)

@@ -1,18 +1,23 @@
+"""
+This is the extra gRPC server of LocalAI
+"""
+
 #!/usr/bin/env python3
-import grpc
 from concurrent import futures
 import time
-import backend_pb2
-import backend_pb2_grpc
 import argparse
 import signal
 import sys
 import os
-from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
-from pathlib import Path
-from bark import SAMPLE_RATE, generate_audio, preload_models
 from scipy.io.wavfile import write as write_wav
+
+import backend_pb2
+import backend_pb2_grpc
+from bark import SAMPLE_RATE, generate_audio, preload_models
+
+import grpc
+
 
 _ONE_DAY_IN_SECONDS = 60 * 60 * 24
 
 # If MAX_WORKERS are specified in the environment use it, otherwise default to 1
@@ -20,6 +25,9 @@ MAX_WORKERS = int(os.environ.get('PYTHON_GRPC_MAX_WORKERS', '1'))
 
 # Implement the BackendServicer class with the service methods
 class BackendServicer(backend_pb2_grpc.BackendServicer):
+    """
+    BackendServicer is the class that implements the gRPC service
+    """
     def Health(self, request, context):
         return backend_pb2.Reply(message=bytes("OK", 'utf-8'))
     def LoadModel(self, request, context):
extra/grpc/bark/ttsbark.yml (new file): 96 lines

@@ -0,0 +1,96 @@
+name: bark
+channels:
+  - defaults
+dependencies:
+  - _libgcc_mutex=0.1=main
+  - _openmp_mutex=5.1=1_gnu
+  - bzip2=1.0.8=h7b6447c_0
+  - ca-certificates=2023.08.22=h06a4308_0
+  - ld_impl_linux-64=2.38=h1181459_1
+  - libffi=3.4.4=h6a678d5_0
+  - libgcc-ng=11.2.0=h1234567_1
+  - libgomp=11.2.0=h1234567_1
+  - libstdcxx-ng=11.2.0=h1234567_1
+  - libuuid=1.41.5=h5eee18b_0
+  - ncurses=6.4=h6a678d5_0
+  - openssl=3.0.11=h7f8727e_2
+  - pip=23.2.1=py311h06a4308_0
+  - python=3.11.5=h955ad1f_0
+  - readline=8.2=h5eee18b_0
+  - setuptools=68.0.0=py311h06a4308_0
+  - sqlite=3.41.2=h5eee18b_0
+  - tk=8.6.12=h1ccaba5_0
+  - wheel=0.41.2=py311h06a4308_0
+  - xz=5.4.2=h5eee18b_0
+  - zlib=1.2.13=h5eee18b_0
+  - pip:
+    - accelerate==0.23.0
+    - aiohttp==3.8.5
+    - aiosignal==1.3.1
+    - async-timeout==4.0.3
+    - attrs==23.1.0
+    - bark==0.1.5
+    - boto3==1.28.61
+    - botocore==1.31.61
+    - certifi==2023.7.22
+    - charset-normalizer==3.3.0
+    - datasets==2.14.5
+    - dill==0.3.7
+    - einops==0.7.0
+    - encodec==0.1.1
+    - filelock==3.12.4
+    - frozenlist==1.4.0
+    - fsspec==2023.6.0
+    - funcy==2.0
+    - grpcio==1.59.0
+    - huggingface-hub==0.16.4
+    - idna==3.4
+    - jinja2==3.1.2
+    - jmespath==1.0.1
+    - markupsafe==2.1.3
+    - mpmath==1.3.0
+    - multidict==6.0.4
+    - multiprocess==0.70.15
+    - networkx==3.1
+    - numpy==1.26.0
+    - nvidia-cublas-cu12==12.1.3.1
+    - nvidia-cuda-cupti-cu12==12.1.105
+    - nvidia-cuda-nvrtc-cu12==12.1.105
+    - nvidia-cuda-runtime-cu12==12.1.105
+    - nvidia-cudnn-cu12==8.9.2.26
+    - nvidia-cufft-cu12==11.0.2.54
+    - nvidia-curand-cu12==10.3.2.106
+    - nvidia-cusolver-cu12==11.4.5.107
+    - nvidia-cusparse-cu12==12.1.0.106
+    - nvidia-nccl-cu12==2.18.1
+    - nvidia-nvjitlink-cu12==12.2.140
+    - nvidia-nvtx-cu12==12.1.105
+    - packaging==23.2
+    - pandas==2.1.1
+    - peft==0.5.0
+    - protobuf==4.24.4
+    - psutil==5.9.5
+    - pyarrow==13.0.0
+    - python-dateutil==2.8.2
+    - pytz==2023.3.post1
+    - pyyaml==6.0.1
+    - regex==2023.10.3
+    - requests==2.31.0
+    - rouge==1.0.1
+    - s3transfer==0.7.0
+    - safetensors==0.3.3
+    - scipy==1.11.3
+    - six==1.16.0
+    - sympy==1.12
+    - tokenizers==0.14.0
+    - torch==2.1.0
+    - torchaudio==2.1.0
+    - tqdm==4.66.1
+    - transformers==4.34.0
+    - triton==2.1.0
+    - typing-extensions==4.8.0
+    - tzdata==2023.3
+    - urllib3==1.26.17
+    - xxhash==3.4.1
+    - yarl==1.9.2
+prefix: /opt/conda/envs/bark
extra/grpc/diffusers/Makefile (new file): 11 lines

@@ -0,0 +1,11 @@
+.PONY: diffusers
+diffusers:
+	@echo "Creating virtual environment..."
+	@conda env create --name diffusers --file diffusers.yml
+	@echo "Virtual environment created."
+
+.PONY: run
+run:
+	@echo "Running diffusers..."
+	bash run.sh
+	@echo "Diffusers run."
extra/grpc/diffusers/README.md (new file): 5 lines

@@ -0,0 +1,5 @@
+# Creating a separate environment for the diffusers project
+
+```
+make diffusers
+```
extra/grpc/diffusers/backend_diffusers.py (changed)

@@ -1,27 +1,32 @@
 #!/usr/bin/env python3
-import grpc
 from concurrent import futures
-import time
-import backend_pb2
-import backend_pb2_grpc
 import argparse
+from collections import defaultdict
+from enum import Enum
 import signal
 import sys
+import time
 import os
 
-# import diffusers
-import torch
-from torch import autocast
-from diffusers import StableDiffusionXLPipeline, StableDiffusionDepth2ImgPipeline, DPMSolverMultistepScheduler, StableDiffusionPipeline, DiffusionPipeline, EulerAncestralDiscreteScheduler
-from diffusers.pipelines.stable_diffusion import safety_checker
-from compel import Compel
 from PIL import Image
-from io import BytesIO
+import torch
+
+import backend_pb2
+import backend_pb2_grpc
+
+import grpc
+
+from diffusers import StableDiffusionXLPipeline, StableDiffusionDepth2ImgPipeline, DPMSolverMultistepScheduler, StableDiffusionPipeline, DiffusionPipeline, EulerAncestralDiscreteScheduler
 from diffusers import StableDiffusionImg2ImgPipeline
+from diffusers.pipelines.stable_diffusion import safety_checker
+
+from compel import Compel
+
 from transformers import CLIPTextModel
-from enum import Enum
-from collections import defaultdict
 from safetensors.torch import load_file
 
 
 _ONE_DAY_IN_SECONDS = 60 * 60 * 24
 COMPEL=os.environ.get("COMPEL", "1") == "1"
 CLIPSKIP=os.environ.get("CLIPSKIP", "1") == "1"
extra/grpc/diffusers/diffusers.yml (new file): 74 lines

@@ -0,0 +1,74 @@
+name: diffusers
+channels:
+  - defaults
+dependencies:
+  - _libgcc_mutex=0.1=main
+  - _openmp_mutex=5.1=1_gnu
+  - bzip2=1.0.8=h7b6447c_0
+  - ca-certificates=2023.08.22=h06a4308_0
+  - ld_impl_linux-64=2.38=h1181459_1
+  - libffi=3.4.4=h6a678d5_0
+  - libgcc-ng=11.2.0=h1234567_1
+  - libgomp=11.2.0=h1234567_1
+  - libstdcxx-ng=11.2.0=h1234567_1
+  - libuuid=1.41.5=h5eee18b_0
+  - ncurses=6.4=h6a678d5_0
+  - openssl=3.0.11=h7f8727e_2
+  - pip=23.2.1=py311h06a4308_0
+  - python=3.11.5=h955ad1f_0
+  - readline=8.2=h5eee18b_0
+  - setuptools=68.0.0=py311h06a4308_0
+  - sqlite=3.41.2=h5eee18b_0
+  - tk=8.6.12=h1ccaba5_0
+  - tzdata=2023c=h04d1e81_0
+  - wheel=0.41.2=py311h06a4308_0
+  - xz=5.4.2=h5eee18b_0
+  - zlib=1.2.13=h5eee18b_0
+  - pip:
+    - accelerate==0.23.0
+    - certifi==2023.7.22
+    - charset-normalizer==3.3.0
+    - compel==2.0.2
+    - diffusers==0.21.4
+    - filelock==3.12.4
+    - fsspec==2023.9.2
+    - grpcio==1.59.0
+    - huggingface-hub==0.17.3
+    - idna==3.4
+    - importlib-metadata==6.8.0
+    - jinja2==3.1.2
+    - markupsafe==2.1.3
+    - mpmath==1.3.0
+    - networkx==3.1
+    - numpy==1.26.0
+    - nvidia-cublas-cu12==12.1.3.1
+    - nvidia-cuda-cupti-cu12==12.1.105
+    - nvidia-cuda-nvrtc-cu12==12.1.105
+    - nvidia-cuda-runtime-cu12==12.1.105
+    - nvidia-cudnn-cu12==8.9.2.26
+    - nvidia-cufft-cu12==11.0.2.54
+    - nvidia-curand-cu12==10.3.2.106
+    - nvidia-cusolver-cu12==11.4.5.107
+    - nvidia-cusparse-cu12==12.1.0.106
+    - nvidia-nccl-cu12==2.18.1
+    - nvidia-nvjitlink-cu12==12.2.140
+    - nvidia-nvtx-cu12==12.1.105
+    - packaging==23.2
+    - pillow==10.0.1
+    - protobuf==4.24.4
+    - psutil==5.9.5
+    - pyparsing==3.1.1
+    - pyyaml==6.0.1
+    - regex==2023.10.3
+    - requests==2.31.0
+    - safetensors==0.4.0
+    - sympy==1.12
+    - tokenizers==0.14.1
+    - torch==2.1.0
+    - tqdm==4.66.1
+    - transformers==4.34.0
+    - triton==2.1.0
+    - typing-extensions==4.8.0
+    - urllib3==2.0.6
+    - zipp==3.17.0
+prefix: /opt/conda/envs/diffusers
extra/grpc/diffusers/run.sh (new executable file): 14 lines

@@ -0,0 +1,14 @@
+#!/bin/bash
+
+##
+## A bash script wrapper that runs the diffusers server with conda
+
+export PATH=$PATH:/opt/conda/bin
+
+# Activate conda environment
+source activate diffusers
+
+# get the directory where the bash script is located
+DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
+
+python $DIR/backend_diffusers.py $@
extra/grpc/exllama/Makefile (new file): 11 lines

@@ -0,0 +1,11 @@
+.PONY: exllama
+exllama:
+	@echo "Creating virtual environment..."
+	@conda env create --name exllama --file exllama.yml
+	@echo "Virtual environment created."
+
+.PONY: run
+run:
+	@echo "Running exllama..."
+	bash run.sh
+	@echo "exllama run."
extra/grpc/exllama/README.md (new file): 5 lines

@@ -0,0 +1,5 @@
+# Creating a separate environment for the exllama project
+
+```
+make exllama
+```
extra/grpc/exllama/exllama.yml (new file): 55 lines

@@ -0,0 +1,55 @@
+name: exllama
+channels:
+  - defaults
+dependencies:
+  - _libgcc_mutex=0.1=main
+  - _openmp_mutex=5.1=1_gnu
+  - bzip2=1.0.8=h7b6447c_0
+  - ca-certificates=2023.08.22=h06a4308_0
+  - ld_impl_linux-64=2.38=h1181459_1
+  - libffi=3.4.4=h6a678d5_0
+  - libgcc-ng=11.2.0=h1234567_1
+  - libgomp=11.2.0=h1234567_1
+  - libstdcxx-ng=11.2.0=h1234567_1
+  - libuuid=1.41.5=h5eee18b_0
+  - ncurses=6.4=h6a678d5_0
+  - openssl=3.0.11=h7f8727e_2
+  - pip=23.2.1=py311h06a4308_0
+  - python=3.11.5=h955ad1f_0
+  - readline=8.2=h5eee18b_0
+  - setuptools=68.0.0=py311h06a4308_0
+  - sqlite=3.41.2=h5eee18b_0
+  - tk=8.6.12=h1ccaba5_0
+  - tzdata=2023c=h04d1e81_0
+  - wheel=0.41.2=py311h06a4308_0
+  - xz=5.4.2=h5eee18b_0
+  - zlib=1.2.13=h5eee18b_0
+  - pip:
+    - filelock==3.12.4
+    - fsspec==2023.9.2
+    - grpcio==1.59.0
+    - jinja2==3.1.2
+    - markupsafe==2.1.3
+    - mpmath==1.3.0
+    - networkx==3.1
+    - ninja==1.11.1
+    - nvidia-cublas-cu12==12.1.3.1
+    - nvidia-cuda-cupti-cu12==12.1.105
+    - nvidia-cuda-nvrtc-cu12==12.1.105
+    - nvidia-cuda-runtime-cu12==12.1.105
+    - nvidia-cudnn-cu12==8.9.2.26
+    - nvidia-cufft-cu12==11.0.2.54
+    - nvidia-curand-cu12==10.3.2.106
+    - nvidia-cusolver-cu12==11.4.5.107
+    - nvidia-cusparse-cu12==12.1.0.106
+    - nvidia-nccl-cu12==2.18.1
+    - nvidia-nvjitlink-cu12==12.2.140
+    - nvidia-nvtx-cu12==12.1.105
+    - protobuf==4.24.4
+    - safetensors==0.3.2
+    - sentencepiece==0.1.99
+    - sympy==1.12
+    - torch==2.1.0
+    - triton==2.1.0
+    - typing-extensions==4.8.0
+prefix: /opt/conda/envs/exllama
extra/grpc/exllama/run.sh (new executable file): 14 lines

@@ -0,0 +1,14 @@
+#!/bin/bash
+
+##
+## A bash script wrapper that runs the exllama server with conda
+
+export PATH=$PATH:/opt/conda/bin
+
+# Activate conda environment
+source activate exllama
+
+# get the directory where the bash script is located
+DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
+
+python $DIR/exllama.py $@
extra/grpc/huggingface/Makefile (new file): 18 lines

@@ -0,0 +1,18 @@
+.PONY: huggingface
+huggingface:
+	@echo "Creating virtual environment..."
+	@conda env create --name huggingface --file huggingface.yml
+	@echo "Virtual environment created."
+
+.PONY: run
+run:
+	@echo "Running huggingface..."
+	bash run.sh
+	@echo "huggingface run."
+
+# It is not working well by using command line. It only6 works with IDE like VSCode.
+.PONY: test
+test:
+	@echo "Testing huggingface..."
+	bash test.sh
+	@echo "huggingface tested."
extra/grpc/huggingface/README.md (new file): 5 lines

@@ -0,0 +1,5 @@
+# Creating a separate environment for the huggingface project
+
+```
+make huggingface
+```
@ -1,13 +1,20 @@
|
|||||||
|
"""
|
||||||
|
Extra gRPC server for HuggingFace SentenceTransformer models.
|
||||||
|
"""
|
||||||
#!/usr/bin/env python3
|
#!/usr/bin/env python3
|
||||||
import grpc
|
|
||||||
from concurrent import futures
|
from concurrent import futures
|
||||||
import time
|
|
||||||
import backend_pb2
|
|
||||||
import backend_pb2_grpc
|
|
||||||
import argparse
|
import argparse
|
||||||
import signal
|
import signal
|
||||||
import sys
|
import sys
|
||||||
import os
|
import os
|
||||||
|
|
||||||
|
import time
|
||||||
|
import backend_pb2
|
||||||
|
import backend_pb2_grpc
|
||||||
|
|
||||||
|
import grpc
|
||||||
|
|
||||||
from sentence_transformers import SentenceTransformer
|
from sentence_transformers import SentenceTransformer
|
||||||
|
|
||||||
_ONE_DAY_IN_SECONDS = 60 * 60 * 24
|
_ONE_DAY_IN_SECONDS = 60 * 60 * 24
|
||||||
@@ -17,18 +24,56 @@ MAX_WORKERS = int(os.environ.get('PYTHON_GRPC_MAX_WORKERS', '1'))

# Implement the BackendServicer class with the service methods
class BackendServicer(backend_pb2_grpc.BackendServicer):
    """
    A gRPC servicer for the backend service.

    This class implements the gRPC methods for the backend service, including Health, LoadModel, and Embedding.
    """
    def Health(self, request, context):
        """
        A gRPC method that returns the health status of the backend service.

        Args:
            request: A HealthRequest object that contains the request parameters.
            context: A grpc.ServicerContext object that provides information about the RPC.

        Returns:
            A Reply object that contains the health status of the backend service.
        """
        return backend_pb2.Reply(message=bytes("OK", 'utf-8'))

    def LoadModel(self, request, context):
        """
        A gRPC method that loads a model into memory.

        Args:
            request: A LoadModelRequest object that contains the request parameters.
            context: A grpc.ServicerContext object that provides information about the RPC.

        Returns:
            A Result object that contains the result of the LoadModel operation.
        """
        model_name = request.Model
        try:
            self.model = SentenceTransformer(model_name)
        except Exception as err:
            return backend_pb2.Result(success=False, message=f"Unexpected {err=}, {type(err)=}")

        # Implement your logic here for the LoadModel service
        # Replace this with your desired response
        return backend_pb2.Result(message="Model loaded successfully", success=True)

    def Embedding(self, request, context):
        """
        A gRPC method that calculates embeddings for a given sentence.

        Args:
            request: An EmbeddingRequest object that contains the request parameters.
            context: A grpc.ServicerContext object that provides information about the RPC.

        Returns:
            An EmbeddingResult object that contains the calculated embeddings.
        """
        # Implement your logic here for the Embedding service
        # Replace this with your desired response
        print("Calculated embeddings for: " + request.Embeddings, file=sys.stderr)
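The remainder of the Embedding body falls outside this hunk and is not shown in the diff. As a rough sketch only (not the committed code), the SentenceTransformer call that would complete it might look like the fragment below, assuming the proto defines an `EmbeddingResult` message with an `embeddings` field (the test file later in this diff reads `embedding_response.embeddings`):

```python
    def Embedding(self, request, context):
        ...
        # Sketch: encode the sentence carried in the request and return it as an
        # EmbeddingResult (message/field names follow what test_huggingface.py reads).
        sentence = request.Embeddings
        embeddings = self.model.encode(sentence)
        return backend_pb2.EmbeddingResult(embeddings=embeddings)
```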
extra/grpc/huggingface/huggingface.yml (Normal file, 77 lines)
@@ -0,0 +1,77 @@
name: huggingface
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=5.1=1_gnu
  - bzip2=1.0.8=h7b6447c_0
  - ca-certificates=2023.08.22=h06a4308_0
  - ld_impl_linux-64=2.38=h1181459_1
  - libffi=3.4.4=h6a678d5_0
  - libgcc-ng=11.2.0=h1234567_1
  - libgomp=11.2.0=h1234567_1
  - libstdcxx-ng=11.2.0=h1234567_1
  - libuuid=1.41.5=h5eee18b_0
  - ncurses=6.4=h6a678d5_0
  - openssl=3.0.11=h7f8727e_2
  - pip=23.2.1=py311h06a4308_0
  - python=3.11.5=h955ad1f_0
  - readline=8.2=h5eee18b_0
  - setuptools=68.0.0=py311h06a4308_0
  - sqlite=3.41.2=h5eee18b_0
  - tk=8.6.12=h1ccaba5_0
  - tzdata=2023c=h04d1e81_0
  - wheel=0.41.2=py311h06a4308_0
  - xz=5.4.2=h5eee18b_0
  - zlib=1.2.13=h5eee18b_0
  - pip:
    - certifi==2023.7.22
    - charset-normalizer==3.3.0
    - click==8.1.7
    - filelock==3.12.4
    - fsspec==2023.9.2
    - grpcio==1.59.0
    - huggingface-hub==0.17.3
    - idna==3.4
    - install==1.3.5
    - jinja2==3.1.2
    - joblib==1.3.2
    - markupsafe==2.1.3
    - mpmath==1.3.0
    - networkx==3.1
    - nltk==3.8.1
    - numpy==1.26.0
    - nvidia-cublas-cu12==12.1.3.1
    - nvidia-cuda-cupti-cu12==12.1.105
    - nvidia-cuda-nvrtc-cu12==12.1.105
    - nvidia-cuda-runtime-cu12==12.1.105
    - nvidia-cudnn-cu12==8.9.2.26
    - nvidia-cufft-cu12==11.0.2.54
    - nvidia-curand-cu12==10.3.2.106
    - nvidia-cusolver-cu12==11.4.5.107
    - nvidia-cusparse-cu12==12.1.0.106
    - nvidia-nccl-cu12==2.18.1
    - nvidia-nvjitlink-cu12==12.2.140
    - nvidia-nvtx-cu12==12.1.105
    - packaging==23.2
    - pillow==10.0.1
    - protobuf==4.24.4
    - pyyaml==6.0.1
    - regex==2023.10.3
    - requests==2.31.0
    - safetensors==0.4.0
    - scikit-learn==1.3.1
    - scipy==1.11.3
    - sentence-transformers==2.2.2
    - sentencepiece==0.1.99
    - sympy==1.12
    - threadpoolctl==3.2.0
    - tokenizers==0.14.1
    - torch==2.1.0
    - torchvision==0.16.0
    - tqdm==4.66.1
    - transformers==4.34.0
    - triton==2.1.0
    - typing-extensions==4.8.0
    - urllib3==2.0.6
prefix: /opt/conda/envs/huggingface
extra/grpc/huggingface/run.sh (Executable file, 14 lines)
@@ -0,0 +1,14 @@
#!/bin/bash

##
## A bash script wrapper that runs the huggingface server with conda

export PATH=$PATH:/opt/conda/bin

# Activate conda environment
source activate huggingface

# get the directory where the bash script is located
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

python $DIR/huggingface.py $@
extra/grpc/huggingface/test.sh (Normal file, 11 lines)
@@ -0,0 +1,11 @@
#!/bin/bash
##
## A bash script wrapper that runs the huggingface tests with conda

# Activate conda environment
source activate huggingface

# get the directory where the bash script is located
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

python -m unittest $DIR/test_huggingface.py
extra/grpc/huggingface/test_huggingface.py (Normal file, 81 lines)
@@ -0,0 +1,81 @@
"""
A test script to test the gRPC service
"""
import unittest
import subprocess
import time
import backend_pb2
import backend_pb2_grpc

import grpc


class TestBackendServicer(unittest.TestCase):
    """
    TestBackendServicer is the class that tests the gRPC service
    """
    def setUp(self):
        """
        This method sets up the gRPC service by starting the server
        """
        self.service = subprocess.Popen(["python3", "huggingface.py", "--addr", "localhost:50051"])

    def tearDown(self) -> None:
        """
        This method tears down the gRPC service by terminating the server
        """
        self.service.terminate()
        self.service.wait()

    def test_server_startup(self):
        """
        This method tests if the server starts up successfully
        """
        time.sleep(2)
        try:
            self.setUp()
            with grpc.insecure_channel("localhost:50051") as channel:
                stub = backend_pb2_grpc.BackendStub(channel)
                response = stub.Health(backend_pb2.HealthMessage())
                self.assertEqual(response.message, b'OK')
        except Exception as err:
            print(err)
            self.fail("Server failed to start")
        finally:
            self.tearDown()

    def test_load_model(self):
        """
        This method tests if the model is loaded successfully
        """
        try:
            self.setUp()
            with grpc.insecure_channel("localhost:50051") as channel:
                stub = backend_pb2_grpc.BackendStub(channel)
                response = stub.LoadModel(backend_pb2.ModelOptions(Model="bert-base-nli-mean-tokens"))
                self.assertTrue(response.success)
                self.assertEqual(response.message, "Model loaded successfully")
        except Exception as err:
            print(err)
            self.fail("LoadModel service failed")
        finally:
            self.tearDown()

    def test_embedding(self):
        """
        This method tests if the embeddings are generated successfully
        """
        try:
            self.setUp()
            with grpc.insecure_channel("localhost:50051") as channel:
                stub = backend_pb2_grpc.BackendStub(channel)
                response = stub.LoadModel(backend_pb2.ModelOptions(Model="bert-base-nli-mean-tokens"))
                self.assertTrue(response.success)
                embedding_request = backend_pb2.PredictOptions(Embeddings="This is a test sentence.")
                embedding_response = stub.Embedding(embedding_request)
                self.assertIsNotNone(embedding_response.embeddings)
        except Exception as err:
            print(err)
            self.fail("Embedding service failed")
        finally:
            self.tearDown()
extra/grpc/vall-e-x/Makefile (Normal file, 11 lines)
@@ -0,0 +1,11 @@
.PHONY: ttsvalle
ttsvalle:
	@echo "Creating virtual environment..."
	@conda env create --name ttsvalle --file ttsvalle.yml
	@echo "Virtual environment created."

.PHONY: run
run:
	@echo "Running ttsvalle..."
	bash run.sh
	@echo "ttsvalle run."
extra/grpc/vall-e-x/README.md (Normal file, 5 lines)
@@ -0,0 +1,5 @@
# Creating a separate environment for the ttsvalle project

```
make ttsvalle
```
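As a rough illustration only, a client can drive this backend the same way as the other extra backends (a sketch, assuming the generated `backend_pb2`/`backend_pb2_grpc` stubs are importable and the server listens on `localhost:50051`; the model path is a placeholder, and the exact `TTSRequest` fields are not shown in this diff):

```python
import grpc

import backend_pb2
import backend_pb2_grpc

with grpc.insecure_channel("localhost:50051") as channel:
    stub = backend_pb2_grpc.BackendStub(channel)
    # The Health RPC is expected to answer b"OK".
    print(stub.Health(backend_pb2.HealthMessage()).message)
    # LoadModel reads request.Model in the servicer below; the value here is a placeholder.
    result = stub.LoadModel(backend_pb2.ModelOptions(Model="/models/vall-e-x"))
    print(result.success, result.message)
    # Audio synthesis then goes through the TTS RPC (backend_pb2.TTSRequest);
    # its fields are not part of this hunk, so they are omitted here.
```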
extra/grpc/vall-e-x/run.sh (Executable file, 13 lines)
@@ -0,0 +1,13 @@
#!/bin/bash

##
## A bash script wrapper that runs the ttsvalle server with conda
export PATH=$PATH:/opt/conda/bin

# Activate conda environment
source activate ttsvalle

# get the directory where the bash script is located
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

python $DIR/ttvalle.py $@
extra/grpc/vall-e-x/ttsvalle.py
@@ -1,14 +1,15 @@
#!/usr/bin/env python3
from concurrent import futures
import argparse
import signal
import sys
import os
import time

import backend_pb2
import backend_pb2_grpc

import grpc

from utils.generation import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav
@@ -21,9 +22,34 @@ MAX_WORKERS = int(os.environ.get('PYTHON_GRPC_MAX_WORKERS', '1'))

# Implement the BackendServicer class with the service methods
class BackendServicer(backend_pb2_grpc.BackendServicer):
    """
    gRPC servicer for backend services.
    """
    def Health(self, request, context):
        """
        Health check service.

        Args:
            request: A backend_pb2.HealthRequest instance.
            context: A grpc.ServicerContext instance.

        Returns:
            A backend_pb2.Reply instance with message "OK".
        """
        return backend_pb2.Reply(message=bytes("OK", 'utf-8'))

    def LoadModel(self, request, context):
        """
        Load model service.

        Args:
            request: A backend_pb2.LoadModelRequest instance.
            context: A grpc.ServicerContext instance.

        Returns:
            A backend_pb2.Result instance with message "Model loaded successfully" and success=True if successful.
            A backend_pb2.Result instance with success=False and error message if unsuccessful.
        """
        model_name = request.Model
        try:
            print("Preparing models, please wait", file=sys.stderr)
@@ -49,6 +75,17 @@ class BackendServicer(backend_pb2_grpc.BackendServicer):
        return backend_pb2.Result(message="Model loaded successfully", success=True)

    def TTS(self, request, context):
        """
        Text-to-speech service.

        Args:
            request: A backend_pb2.TTSRequest instance.
            context: A grpc.ServicerContext instance.

        Returns:
            A backend_pb2.Result instance with success=True if successful.
            A backend_pb2.Result instance with success=False and error message if unsuccessful.
        """
        model = request.model
        print(request, file=sys.stderr)
        try:
extra/grpc/vall-e-x/ttsvalle.yml (Normal file, 101 lines)
@@ -0,0 +1,101 @@
name: ttsvalle
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=5.1=1_gnu
  - bzip2=1.0.8=h7b6447c_0
  - ca-certificates=2023.08.22=h06a4308_0
  - ld_impl_linux-64=2.38=h1181459_1
  - libffi=3.4.4=h6a678d5_0
  - libgcc-ng=11.2.0=h1234567_1
  - libgomp=11.2.0=h1234567_1
  - libstdcxx-ng=11.2.0=h1234567_1
  - libuuid=1.41.5=h5eee18b_0
  - ncurses=6.4=h6a678d5_0
  - openssl=3.0.11=h7f8727e_2
  - pip=23.2.1=py310h06a4308_0
  - python=3.10.13=h955ad1f_0
  - readline=8.2=h5eee18b_0
  - setuptools=68.0.0=py310h06a4308_0
  - sqlite=3.41.2=h5eee18b_0
  - tk=8.6.12=h1ccaba5_0
  - tzdata=2023c=h04d1e81_0
  - wheel=0.41.2=py310h06a4308_0
  - xz=5.4.2=h5eee18b_0
  - zlib=1.2.13=h5eee18b_0
  - pip:
    - aiofiles==23.2.1
    - altair==5.1.2
    - annotated-types==0.6.0
    - anyio==3.7.1
    - click==8.1.7
    - cn2an==0.5.22
    - cython==3.0.3
    - einops==0.7.0
    - encodec==0.1.1
    - eng-to-ipa==0.0.2
    - fastapi==0.103.2
    - ffmpeg-python==0.2.0
    - ffmpy==0.3.1
    - fsspec==2023.9.2
    - future==0.18.3
    - gradio==3.47.1
    - gradio-client==0.6.0
    - grpcio==1.59.0
    - h11==0.14.0
    - httpcore==0.18.0
    - httpx==0.25.0
    - huggingface-hub==0.17.3
    - importlib-resources==6.1.0
    - inflect==7.0.0
    - jieba==0.42.1
    - langid==1.1.6
    - llvmlite==0.41.0
    - more-itertools==10.1.0
    - nltk==3.8.1
    - numba==0.58.0
    - numpy==1.25.2
    - nvidia-cublas-cu12==12.1.3.1
    - nvidia-cuda-cupti-cu12==12.1.105
    - nvidia-cuda-nvrtc-cu12==12.1.105
    - nvidia-cuda-runtime-cu12==12.1.105
    - nvidia-cudnn-cu12==8.9.2.26
    - nvidia-cufft-cu12==11.0.2.54
    - nvidia-curand-cu12==10.3.2.106
    - nvidia-cusolver-cu12==11.4.5.107
    - nvidia-cusparse-cu12==12.1.0.106
    - nvidia-nccl-cu12==2.18.1
    - nvidia-nvjitlink-cu12==12.2.140
    - nvidia-nvtx-cu12==12.1.105
    - openai-whisper==20230306
    - orjson==3.9.7
    - proces==0.1.7
    - protobuf==4.24.4
    - pydantic==2.4.2
    - pydantic-core==2.10.1
    - pydub==0.25.1
    - pyopenjtalk-prebuilt==0.3.0
    - pypinyin==0.49.0
    - python-multipart==0.0.6
    - regex==2023.10.3
    - safetensors==0.4.0
    - semantic-version==2.10.0
    - soundfile==0.12.1
    - starlette==0.27.0
    - sudachidict-core==20230927
    - sudachipy==0.6.7
    - tokenizers==0.14.1
    - toolz==0.12.0
    - torch==2.1.0
    - torchaudio==2.1.0
    - torchvision==0.16.0
    - tqdm==4.66.1
    - transformers==4.34.0
    - triton==2.1.0
    - unidecode==1.3.7
    - uvicorn==0.23.2
    - vocos==0.0.3
    - websockets==11.0.3
    - wget==3.2
prefix: /opt/conda/envs/ttsvalle
extra/grpc/vllm/Makefile (Normal file, 11 lines)
@@ -0,0 +1,11 @@
.PHONY: vllm
vllm:
	@echo "Creating virtual environment..."
	@conda env create --name vllm --file vllm.yml
	@echo "Virtual environment created."

.PHONY: run
run:
	@echo "Running vllm..."
	bash run.sh
	@echo "vllm run."
extra/grpc/vllm/README.md (Normal file, 5 lines)
@@ -0,0 +1,5 @@
# Creating a separate environment for the vllm project

```
make vllm
```
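For orientation, a minimal client sketch (assumptions: the generated `backend_pb2`/`backend_pb2_grpc` stubs are importable, the server started by `run.sh` listens on `localhost:50051` as in `test_backend_vllm.py`, and both the model name and the `Prompt` field name are illustrative placeholders not confirmed by this diff):

```python
import grpc

import backend_pb2
import backend_pb2_grpc

with grpc.insecure_channel("localhost:50051") as channel:
    stub = backend_pb2_grpc.BackendStub(channel)
    # The Health RPC is expected to answer b"OK".
    print(stub.Health(backend_pb2.HealthMessage()).message)
    # Quantization is only forwarded to vLLM when non-empty (see LoadModel below).
    stub.LoadModel(backend_pb2.ModelOptions(Model="facebook/opt-125m"))
    # Predict falls back to TopP=0.9 when TopP is 0; the Prompt field name is an assumption.
    reply = stub.Predict(backend_pb2.PredictOptions(Prompt="What is LocalAI?", TopP=0.9))
    print(reply.message.decode("utf-8"))
```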
extra/grpc/vllm/backend_vllm.py
@@ -1,15 +1,15 @@
#!/usr/bin/env python3
from concurrent import futures
import time
import argparse
import signal
import sys
import os

import backend_pb2
import backend_pb2_grpc

import grpc
from vllm import LLM, SamplingParams

_ONE_DAY_IN_SECONDS = 60 * 60 * 24
@@ -19,7 +19,20 @@ MAX_WORKERS = int(os.environ.get('PYTHON_GRPC_MAX_WORKERS', '1'))

# Implement the BackendServicer class with the service methods
class BackendServicer(backend_pb2_grpc.BackendServicer):
    """
    A gRPC servicer that implements the Backend service defined in backend.proto.
    """
    def generate(self, prompt, max_new_tokens):
        """
        Generates text based on the given prompt and maximum number of new tokens.

        Args:
            prompt (str): The prompt to generate text from.
            max_new_tokens (int): The maximum number of new tokens to generate.

        Returns:
            str: The generated text.
        """
        self.generator.end_beam_search()

        # Tokenizing the input
@@ -41,9 +54,31 @@ class BackendServicer(backend_pb2_grpc.BackendServicer):
            if token.item() == self.generator.tokenizer.eos_token_id:
                break
        return decoded_text

    def Health(self, request, context):
        """
        Returns a health check message.

        Args:
            request: The health check request.
            context: The gRPC context.

        Returns:
            backend_pb2.Reply: The health check reply.
        """
        return backend_pb2.Reply(message=bytes("OK", 'utf-8'))

    def LoadModel(self, request, context):
        """
        Loads a language model.

        Args:
            request: The load model request.
            context: The gRPC context.

        Returns:
            backend_pb2.Result: The load model result.
        """
        try:
            if request.Quantization != "":
                self.llm = LLM(model=request.Model, quantization=request.Quantization)
@@ -54,6 +89,16 @@ class BackendServicer(backend_pb2_grpc.BackendServicer):
        return backend_pb2.Result(message="Model loaded successfully", success=True)

    def Predict(self, request, context):
        """
        Generates text based on the given prompt and sampling parameters.

        Args:
            request: The predict request.
            context: The gRPC context.

        Returns:
            backend_pb2.Result: The predict result.
        """
        if request.TopP == 0:
            request.TopP = 0.9

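The code elided between this hunk and the next builds the sampling parameters and runs generation; the diff does not show it. A rough sketch only of that step, assuming the standard vLLM API (`SamplingParams`, `LLM.generate`) and an illustrative `Prompt` field name:

```python
        # Sketch: map request options onto vLLM sampling parameters and generate text.
        # request.Prompt and max_tokens=200 are assumptions, not the committed code.
        sampling_params = SamplingParams(top_p=request.TopP, max_tokens=200)
        outputs = self.llm.generate([request.Prompt], sampling_params)
        generated_text = outputs[0].outputs[0].text
```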
@@ -68,6 +113,16 @@ class BackendServicer(backend_pb2_grpc.BackendServicer):
        return backend_pb2.Result(message=bytes(generated_text, encoding='utf-8'))

    def PredictStream(self, request, context):
        """
        Generates text based on the given prompt and sampling parameters, and streams the results.

        Args:
            request: The predict stream request.
            context: The gRPC context.

        Returns:
            backend_pb2.Result: The predict stream result.
        """
        # Implement PredictStream RPC
        #for reply in some_data_generator():
        #    yield reply
extra/grpc/vllm/run.sh (Executable file, 14 lines)
@@ -0,0 +1,14 @@
#!/bin/bash

##
## A bash script wrapper that runs the vllm server with conda

export PATH=$PATH:/opt/conda/bin

# Activate conda environment
source activate vllm

# get the directory where the bash script is located
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

python $DIR/backend_vllm.py $@
extra/grpc/vllm/test_backend_vllm.py (Normal file, 41 lines)
@@ -0,0 +1,41 @@
import unittest
import subprocess
import time
import backend_pb2
import backend_pb2_grpc

import grpc


class TestBackendServicer(unittest.TestCase):
    """
    TestBackendServicer is the class that tests the gRPC service.

    This class contains methods to test the startup and shutdown of the gRPC service.
    """
    def setUp(self):
        self.service = subprocess.Popen(["python", "backend_vllm.py", "--addr", "localhost:50051"])

    def tearDown(self) -> None:
        self.service.terminate()
        self.service.wait()

    def test_server_startup(self):
        time.sleep(2)
        try:
            self.setUp()
            with grpc.insecure_channel("localhost:50051") as channel:
                stub = backend_pb2_grpc.BackendStub(channel)
                response = stub.Health(backend_pb2.HealthMessage())
                self.assertEqual(response.message, b'OK')
        except Exception as err:
            print(err)
            self.fail("Server failed to start")
        finally:
            self.tearDown()
extra/grpc/vllm/vllm.yml (Normal file, 99 lines)
@@ -0,0 +1,99 @@
name: vllm
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=5.1=1_gnu
  - bzip2=1.0.8=h7b6447c_0
  - ca-certificates=2023.08.22=h06a4308_0
  - ld_impl_linux-64=2.38=h1181459_1
  - libffi=3.4.4=h6a678d5_0
  - libgcc-ng=11.2.0=h1234567_1
  - libgomp=11.2.0=h1234567_1
  - libstdcxx-ng=11.2.0=h1234567_1
  - libuuid=1.41.5=h5eee18b_0
  - ncurses=6.4=h6a678d5_0
  - openssl=3.0.11=h7f8727e_2
  - pip=23.2.1=py311h06a4308_0
  - python=3.11.5=h955ad1f_0
  - readline=8.2=h5eee18b_0
  - setuptools=68.0.0=py311h06a4308_0
  - sqlite=3.41.2=h5eee18b_0
  - tk=8.6.12=h1ccaba5_0
  - wheel=0.41.2=py311h06a4308_0
  - xz=5.4.2=h5eee18b_0
  - zlib=1.2.13=h5eee18b_0
  - pip:
    - aiosignal==1.3.1
    - anyio==3.7.1
    - attrs==23.1.0
    - certifi==2023.7.22
    - charset-normalizer==3.3.0
    - click==8.1.7
    - cmake==3.27.6
    - fastapi==0.103.2
    - filelock==3.12.4
    - frozenlist==1.4.0
    - fsspec==2023.9.2
    - grpcio==1.59.0
    - h11==0.14.0
    - httptools==0.6.0
    - huggingface-hub==0.17.3
    - idna==3.4
    - jinja2==3.1.2
    - jsonschema==4.19.1
    - jsonschema-specifications==2023.7.1
    - lit==17.0.2
    - markupsafe==2.1.3
    - mpmath==1.3.0
    - msgpack==1.0.7
    - networkx==3.1
    - ninja==1.11.1
    - numpy==1.26.0
    - nvidia-cublas-cu11==11.10.3.66
    - nvidia-cuda-cupti-cu11==11.7.101
    - nvidia-cuda-nvrtc-cu11==11.7.99
    - nvidia-cuda-runtime-cu11==11.7.99
    - nvidia-cudnn-cu11==8.5.0.96
    - nvidia-cufft-cu11==10.9.0.58
    - nvidia-curand-cu11==10.2.10.91
    - nvidia-cusolver-cu11==11.4.0.1
    - nvidia-cusparse-cu11==11.7.4.91
    - nvidia-nccl-cu11==2.14.3
    - nvidia-nvtx-cu11==11.7.91
    - packaging==23.2
    - pandas==2.1.1
    - protobuf==4.24.4
    - psutil==5.9.5
    - pyarrow==13.0.0
    - pydantic==1.10.13
    - python-dateutil==2.8.2
    - python-dotenv==1.0.0
    - pytz==2023.3.post1
    - pyyaml==6.0.1
    - ray==2.7.0
    - referencing==0.30.2
    - regex==2023.10.3
    - requests==2.31.0
    - rpds-py==0.10.4
    - safetensors==0.4.0
    - sentencepiece==0.1.99
    - six==1.16.0
    - sniffio==1.3.0
    - starlette==0.27.0
    - sympy==1.12
    - tokenizers==0.14.1
    - torch==2.0.1
    - tqdm==4.66.1
    - transformers==4.34.0
    - triton==2.0.0
    - typing-extensions==4.8.0
    - tzdata==2023.3
    - urllib3==2.0.6
    - uvicorn==0.23.2
    - uvloop==0.17.0
    - vllm==0.2.0
    - watchfiles==0.20.0
    - websockets==11.0.3
    - xformers==0.0.22
prefix: /opt/conda/envs/vllm