Georgi Gerganov e54329da7b ggml : full ALiBi support (llama/7192)
* ggml : full ALiBi support

* ggml : update ggml_soft_max_ext() CUDA, SYCL

* ggml : ggml_flash_attn_ext() support ALiBi (CPU)

* ggml : ggml_flash_attn_ext() support ALiBi (Metal)

* ggml : fix warning

* ggml : ggml_flash_attn_ext() support ALiBi (CUDA)

ggml-ci

* ggml : fix assert message

* vulkan : add dev notes

* ggml : require mask when using ALiBi

ggml-ci

* convert : fix convert for refact models
2024-05-13 11:02:26 +03:00