LocalAI/pkg
mintyleaf 2bc4b56a79
feat: stream tokens usage (#4415)
* Use pb.Reply instead of []byte (previously obtained via Reply.GetMessage()) in the llama gRPC backend, so the proper usage data reaches reply streaming mode in the last [DONE] frame (see the sketch below)

* Fix a 'hang' on an empty message at the start of the stream

The empty-message marker trick appears to have been unnecessary

---------

Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-12-18 09:48:50 +01:00
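The commit swaps the streaming callback's payload from the raw message bytes to the full reply, so the usage counters carried only by the final [DONE] frame are no longer dropped. Below is a minimal Go sketch of that shape; the Reply stand-in and its field names (Message, Tokens, PromptTokens) are assumptions for illustration, not the exact LocalAI proto definition:

```go
package main

import "fmt"

// Reply is a stand-in for pb.Reply: a streamed frame that carries
// usage counters alongside the message bytes. Field names here are
// assumed for illustration.
type Reply struct {
	Message      []byte
	Tokens       int32 // completion tokens, populated on the final frame
	PromptTokens int32 // prompt tokens, populated on the final frame
}

// streamCallback receives the whole *Reply instead of just the
// []byte from reply.GetMessage(), so the last frame's usage data
// is visible to the caller. Returning true keeps the stream going.
func streamCallback(reply *Reply) bool {
	if len(reply.Message) > 0 {
		fmt.Print(string(reply.Message))
	}
	// Only the final [DONE] frame carries non-zero usage counters.
	if reply.Tokens > 0 || reply.PromptTokens > 0 {
		fmt.Printf("\nusage: prompt=%d completion=%d\n",
			reply.PromptTokens, reply.Tokens)
	}
	return true
}

func main() {
	// Simulated stream: two content frames, then the usage-bearing
	// final frame with an empty message.
	frames := []*Reply{
		{Message: []byte("Hello")},
		{Message: []byte(", world")},
		{Tokens: 7, PromptTokens: 12},
	}
	for _, f := range frames {
		streamCallback(f)
	}
}
```

With the previous func([]byte) signature there was no channel for the final frame's counters; passing the whole reply also removes the need for the empty-message special case, which is what the second bullet drops.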
Directory        Last commit                                                                              Date
assets           chore: fix go.mod module (#2635)                                                         2024-06-23 08:24:36 +00:00
concurrency      chore: update jobresult_test.go (#4124)                                                  2024-11-12 08:52:18 +01:00
downloader       test: preliminary tests and merge fix for authv2 (#3584)                                 2024-09-24 09:32:48 +02:00
functions        feat(openai): add json_schema format type and strict mode (#3193)                        2024-08-07 15:27:02 -04:00
grpc             feat: stream tokens usage (#4415)                                                        2024-12-18 09:48:50 +01:00
langchain        feat(llama.cpp): do not specify backends to autoload and add llama.cpp variants (#2232)  2024-05-04 17:56:12 +02:00
library          rf: centralize base64 image handling (#2595)                                             2024-06-24 08:34:36 +02:00
model            feat(template): read jinja templates from gguf files (#4332)                             2024-12-08 13:50:33 +01:00
oci              chore: fix go.mod module (#2635)                                                         2024-06-23 08:24:36 +00:00
stablediffusion  feat: support upscaled image generation with esrgan (#509)                               2023-06-05 17:21:38 +02:00
startup          chore(tests): fix examples url                                                           2024-10-30 10:57:21 +01:00
store            chore: fix go.mod module (#2635)                                                         2024-06-23 08:24:36 +00:00
templates        feat(template): read jinja templates from gguf files (#4332)                             2024-12-08 13:50:33 +01:00
tinydream        feat: add tiny dream stable diffusion support (#1283)                                    2023-12-24 19:27:24 +00:00
utils            feat(tts): Implement naive response_format for tts endpoint (#4035)                      2024-11-02 19:13:35 +00:00
xsync            chore: fix go.mod module (#2635)                                                         2024-06-23 08:24:36 +00:00
xsysinfo         feat(default): use number of physical cores as default (#2483)                           2024-06-04 15:23:29 +02:00