LocalAI/core/backend
mintyleaf 0d6c3a7d57
feat: include tokens usage for streamed output (#4282)
Use the full pb.Reply instead of the []byte returned by Reply.GetMessage() in the llama gRPC backend, so that the usage data carried in the reply is available in streaming mode at the final [DONE] frame.

Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-11-28 14:47:56 +01:00
backend_suite_test.go feat: extract output with regexes from LLMs (#3491) 2024-09-13 13:27:36 +02:00
embeddings.go chore(refactor): drop unnecessary code in loader (#4096) 2024-11-08 21:54:25 +01:00
image.go chore(refactor): drop unnecessary code in loader (#4096) 2024-11-08 21:54:25 +01:00
llm_test.go feat: extract output with regexes from LLMs (#3491) 2024-09-13 13:27:36 +02:00
llm.go feat: include tokens usage for streamed output (#4282) 2024-11-28 14:47:56 +01:00
options.go chore(refactor): drop unnecessary code in loader (#4096) 2024-11-08 21:54:25 +01:00
rerank.go chore(refactor): drop unnecessary code in loader (#4096) 2024-11-08 21:54:25 +01:00
soundgeneration.go chore(refactor): drop unnecessary code in loader (#4096) 2024-11-08 21:54:25 +01:00
stores.go chore(refactor): drop unnecessary code in loader (#4096) 2024-11-08 21:54:25 +01:00
token_metrics.go chore(refactor): drop unnecessary code in loader (#4096) 2024-11-08 21:54:25 +01:00
tokenize.go chore(refactor): drop unnecessary code in loader (#4096) 2024-11-08 21:54:25 +01:00
transcript.go chore(refactor): drop unnecessary code in loader (#4096) 2024-11-08 21:54:25 +01:00
tts.go chore(refactor): drop unnecessary code in loader (#4096) 2024-11-08 21:54:25 +01:00