whisper.cpp

Mirror of https://github.com/ggerganov/whisper.cpp.git, synced 2024-12-22 05:57:48 +00:00

Author	SHA1	Message	Date
Radoslav Gerganov	6cc3b022ee	llama : offload to RPC in addition to other backends (llama/7640) * llama : offload to RPC in addition to other backends * - fix copy_tensor being called on the src buffer instead of the dst buffer - always initialize views in the view_src buffer - add RPC backend to Makefile build - add endpoint to all RPC object names * add rpc-server to Makefile * Update llama.cpp Co-authored-by: slaren <slarengh@gmail.com> --------- Co-authored-by: slaren <slarengh@gmail.com>	2024-06-16 18:19:48 +03:00
Radoslav Gerganov	39b0640b09	rpc : resource management rework (llama/7562) * rpc : resource management rework * address review comments	2024-06-16 18:19:48 +03:00
Radoslav Gerganov	caeeb32b41	rpc : track allocated buffers (llama/7411) * rpc : track allocated buffers ref: #7407 * rpc : pack rpc_tensor tightly	2024-06-16 18:19:48 +03:00
Radoslav Gerganov	77d708fabb	rpc : set SO_REUSEADDR for the server socket (llama/7320) ref: #7293	2024-06-16 18:19:48 +03:00
Radoslav Gerganov	7bd69349bf	rpc : add command line arg for specifying backend memory ref: #7293	2024-06-16 18:19:48 +03:00
Radoslav Gerganov	c451080c8b	ggml : add RPC backend (llama/6829) * ggml : add RPC backend The RPC backend proxies all operations to a remote server which runs a regular backend (CPU, CUDA, Metal, etc). * set TCP_NODELAY * add CI workflows * Address review comments * fix warning * implement llama_max_devices() for RPC * Address review comments * Address review comments * wrap sockfd into a struct * implement get_alignment and get_max_size * add get_device_memory * fix warning * win32 support * add README * readme : trim trailing whitespace * Address review comments * win32 fix * Address review comments * fix compile warnings on macos	2024-05-14 19:16:29 +03:00

6 Commits