LocalAI/pkg
Latest commit: 0d6c3a7d57 by mintyleaf, 2024-11-28 14:47:56 +01:00

feat: include tokens usage for streamed output (#4282)

Use pb.Reply instead of a raw []byte from Reply.GetMessage() in the llama gRPC backend, so that the proper usage data reaches the reply in streaming mode at the last [DONE] frame.

Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
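To make the change concrete, here is a minimal, self-contained Go sketch of the idea. The `Reply` stand-in and its field names (`Tokens`, `PromptTokens`) are assumptions for illustration, not LocalAI's exact proto types (the real definitions live in the pkg/grpc proto package); the point is that forwarding the full reply, rather than only the `GetMessage()` bytes, preserves the usage counters that only arrive on the final frame before `[DONE]`.

```go
package main

import "fmt"

// Reply is a stand-in for the generated pb.Reply message; the field
// names here are assumptions, not LocalAI's exact proto definition.
type Reply struct {
	Message      []byte
	Tokens       int32
	PromptTokens int32
}

func (r *Reply) GetMessage() []byte     { return r.Message }
func (r *Reply) GetTokens() int32       { return r.Tokens }
func (r *Reply) GetPromptTokens() int32 { return r.PromptTokens }

// Usage mirrors the OpenAI-style usage object reported at the end of a stream.
type Usage struct {
	PromptTokens, CompletionTokens int
}

func main() {
	var usage Usage

	// Before #4282 the stream callback received only []byte (the message
	// text), so the counters carried by the final frame were lost. Taking
	// the whole *Reply keeps them visible to the caller:
	onReply := func(reply *Reply) {
		fmt.Print(string(reply.GetMessage()))
		if reply.GetTokens() > 0 { // usage typically arrives with the last frame
			usage.CompletionTokens = int(reply.GetTokens())
			usage.PromptTokens = int(reply.GetPromptTokens())
		}
	}

	// Simulated stream: content frames, then a final frame carrying usage.
	frames := []*Reply{
		{Message: []byte("Hello, ")},
		{Message: []byte("world!")},
		{Tokens: 12, PromptTokens: 5}, // final frame before [DONE]
	}
	for _, f := range frames {
		onReply(f)
	}
	fmt.Printf("\nusage: prompt=%d completion=%d\n", usage.PromptTokens, usage.CompletionTokens)
}
```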
| Directory | Last commit | Date |
|---|---|---|
| assets | chore: fix go.mod module (#2635) | 2024-06-23 08:24:36 +00:00 |
| concurrency | chore: update jobresult_test.go (#4124) | 2024-11-12 08:52:18 +01:00 |
| downloader | test: preliminary tests and merge fix for authv2 (#3584) | 2024-09-24 09:32:48 +02:00 |
| functions | feat(openai): add json_schema format type and strict mode (#3193) | 2024-08-07 15:27:02 -04:00 |
| grpc | feat: include tokens usage for streamed output (#4282) | 2024-11-28 14:47:56 +01:00 |
| langchain | feat(llama.cpp): do not specify backends to autoload and add llama.cpp variants (#2232) | 2024-05-04 17:56:12 +02:00 |
| library | rf: centralize base64 image handling (#2595) | 2024-06-24 08:34:36 +02:00 |
| model | feat(backends): Drop bert.cpp (#4272) | 2024-11-27 16:34:28 +01:00 |
| oci | chore: fix go.mod module (#2635) | 2024-06-23 08:24:36 +00:00 |
| stablediffusion | feat: support upscaled image generation with esrgan (#509) | 2023-06-05 17:21:38 +02:00 |
| startup | chore(tests): fix examples url | 2024-10-30 10:57:21 +01:00 |
| store | chore: fix go.mod module (#2635) | 2024-06-23 08:24:36 +00:00 |
| templates | feat(templates): use a single template for multimodals messages (#3892) | 2024-10-22 09:34:05 +02:00 |
| tinydream | feat: add tiny dream stable diffusion support (#1283) | 2023-12-24 19:27:24 +00:00 |
| utils | feat(tts): Implement naive response_format for tts endpoint (#4035) | 2024-11-02 19:13:35 +00:00 |
| xsync | chore: fix go.mod module (#2635) | 2024-06-23 08:24:36 +00:00 |
| xsysinfo | feat(default): use number of physical cores as default (#2483) | 2024-06-04 15:23:29 +02:00 |