Ettore Di Giacinto 35290e146b
fix(grammar): respect JSONmode and grammar from user input (#1935)
* fix(grammar): Fix JSON mode and custom grammar

* tests(aio): add jsonmode test

* tests(aio): add functioncall test

* fix(aio): use hermes-2-pro-mistral as llm for CPU profile

* add phi-2-orange
2024-03-31 13:04:09 +02:00
..

AIO CPU size

Use this image with CPU-only.

Please keep using only C++ backends so the base image is as small as possible (without CUDA, cuDNN, python, etc).