metal : use autoreleasepool to avoid memory leaks (llama/5437)

There appears to be a known memory leak when using the
`MLTCommandBuffer`. It is suggested to use `@autoreleasepool` in
[1,2]

[1] https://developer.apple.com/forums/thread/662721
[2] https://forums.developer.apple.com/forums/thread/120931

This change-set wraps the `ggml_metal_graph_compute` in a
`@autoreleasepool`.

This commit addresses https://github.com/ggerganov/llama.cpp/issues/5436
This commit is contained in:
Ian Bull 2024-02-10 02:53:28 -08:00 committed by Georgi Gerganov
parent 1d3270cc8f
commit 47dfe9d4db
No known key found for this signature in database
GPG Key ID: 449E073F9DC10735

View File

@ -696,6 +696,7 @@ static bool ggml_metal_graph_compute(
struct ggml_metal_context * ctx, struct ggml_metal_context * ctx,
struct ggml_cgraph * gf) { struct ggml_cgraph * gf) {
@autoreleasepool {
MTLComputePassDescriptor * edesc = MTLComputePassDescriptor.computePassDescriptor; MTLComputePassDescriptor * edesc = MTLComputePassDescriptor.computePassDescriptor;
edesc.dispatchType = MTLDispatchTypeSerial; edesc.dispatchType = MTLDispatchTypeSerial;
@ -2281,6 +2282,7 @@ static bool ggml_metal_graph_compute(
[[MTLCaptureManager sharedCaptureManager] stopCapture]; [[MTLCaptureManager sharedCaptureManager] stopCapture];
} }
}
return true; return true;
} }