whisper.cpp/ggml
Changyeon Kim 307712a903 ggml: Add POOL2D OP for GPU acceleration to the Vulkan backend in the MobileVLM model. (llama/9763)
* ggml: Add POOL2D OP for GPU acceleration to the Vulkan backend.

- The MobileVLM model now supports GPU-accelerated inference through the Vulkan backend.
- A GGML_OP_POOL_2D (pooling) shader has been added.
- CLIP model encoding time dropped from 2.8 s on the CPU to 0.7 s on the GPU.
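
For readers unfamiliar with the operation, the following is a minimal CPU reference sketch of 2D pooling in C. It is not the Vulkan shader from this commit and not ggml code: the names `pool_kind`, `pool_out_size`, and `pool2d` are hypothetical, the output-size formula is the usual floor convention, and dividing the average by the full kernel area (so padded cells count as zeros) is a choice made for this sketch.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical enum for this sketch; not the ggml type. */
enum pool_kind { POOL_MAX, POOL_AVG };

/* Output length along one axis for input length `in`, kernel `k`,
 * stride `s`, and padding `p` (floor convention, assumed here). */
static int pool_out_size(int in, int k, int s, int p) {
    assert(k > 0 && s > 0 && p >= 0);
    return (in + 2 * p - k) / s + 1;
}

/* Pools one h x w channel stored row-major in `src` into `dst`.
 * `dst` must hold pool_out_size(h, k1, s1, p1) * pool_out_size(w, k0, s0, p0)
 * floats. Index 0 refers to the width axis, index 1 to the height axis. */
static void pool2d(const float *src, int h, int w, float *dst,
                   enum pool_kind kind,
                   int k0, int k1, int s0, int s1, int p0, int p1) {
    const int ow = pool_out_size(w, k0, s0, p0);
    const int oh = pool_out_size(h, k1, s1, p1);

    for (int oy = 0; oy < oh; ++oy) {
        for (int ox = 0; ox < ow; ++ox) {
            float acc = (kind == POOL_MAX) ? -FLT_MAX : 0.0f;

            for (int ky = 0; ky < k1; ++ky) {
                const int iy = oy * s1 - p1 + ky;
                if (iy < 0 || iy >= h) continue;      /* row lies in padding */

                for (int kx = 0; kx < k0; ++kx) {
                    const int ix = ox * s0 - p0 + kx;
                    if (ix < 0 || ix >= w) continue;  /* column lies in padding */

                    const float v = src[(size_t)iy * w + ix];
                    if (kind == POOL_MAX) {
                        if (v > acc) acc = v;
                    } else {
                        acc += v;
                    }
                }
            }

            /* Sketch choice: average over the full kernel area. */
            if (kind == POOL_AVG) acc /= (float)(k0 * k1);
            dst[(size_t)oy * ow + ox] = acc;
        }
    }
}
```

Each output element depends only on its own input window, so a GPU version can typically assign one invocation per output element; whether the shader in this commit is organized that way is not stated here.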

Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>

* [fix] Correct the order of the parameters.

Fix casting to int.

Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>

---------

Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>
2024-11-15 15:21:04 +02:00
cmake whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
include ggml : add AMX backend (llama/8998) 2024-11-01 10:19:05 +02:00
src ggml: Add POOL2D OP for GPU acceleration to the Vulkan backend in the MobileVLM model. (llama/9763) 2024-11-15 15:21:04 +02:00
.gitignore whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00
CMakeLists.txt add amx kernel for gemm (llama/8998) 2024-11-01 10:19:05 +02:00
ggml_vk_generate_shaders.py whisper : reorganize source code + improve CMake (#2256) 2024-06-26 19:34:09 +03:00