Commit Graph

  • b6f3fa4059 stream.wasm : add HEAPU8 to exported runtime methods (#3130) [master] Daniel Bevenius 2025-05-08 16:58:34 +02:00
  • cb2bd11ee8 sync : ggml Georgi Gerganov 2025-05-07 17:45:14 +03:00
  • 09e6b66025 cuda : remove nrows_x in mul_mat_q_process_tile (llama/13325) R0CKSTAR 2025-05-07 15:48:23 +08:00
  • d41cf26a0f CUDA: mix virt/real CUDA archs for GGML_NATIVE=OFF (llama/13135) Johannes Gäßler 2025-05-06 23:35:51 +02:00
  • 3c67195be9 SYCL: Disable reorder optimize by default and stop setting tensor extras when optimize is disabled (llama/13254) Akarshan Biswas 2025-05-06 20:27:06 +05:30
  • f9f78a773f CUDA: fix bad asserts for partial offload (llama/13337) Johannes Gäßler 2025-05-06 13:58:51 +02:00
  • be55e25cac CUDA: fix --split-mode row for MMQ (llama/13323) Johannes Gäßler 2025-05-06 08:36:46 +02:00
  • 2ffdda99e8 CUDA: fix logic for clearing padding with -ngl 0 (llama/13320) Johannes Gäßler 2025-05-05 22:32:13 +02:00
  • 9bbedc51cc SYCL: Disable mul_mat kernels for noncontiguous tensor b (llama/13308) Akarshan Biswas 2025-05-05 13:39:10 +05:30
  • 1e1fa27add rpc : use backend registry, support dl backends (llama/13304) Diego Devesa 2025-05-04 21:25:43 +02:00
  • e1bdd148c5 ggml : activate s390x simd for Q3_K (llama/13301) Aaron Teo 2025-05-05 01:49:12 +08:00
  • 7fa8bb303f CUDA: fix race condition in MMQ stream-k fixup (llama/13299) Johannes Gäßler 2025-05-04 14:16:39 +02:00
  • 7564f5e6f1 CUDA: fix race condition in MMQ ids_dst (llama/13294) Johannes Gäßler 2025-05-04 13:58:38 +02:00
  • 22ba2e27ce vulkan: Additional type support for unary, binary, and copy (llama/13266) Jeff Bolz 2025-05-04 00:17:16 -05:00
  • 0676b2dab2 ci : add bindings-java jar artifact to release (#3126) Daniel Bevenius 2025-05-07 16:26:54 +02:00
  • 4a512cb153 cli : avoid std::exchange Georgi Gerganov 2025-05-07 13:22:47 +03:00
  • 76171ce199 sync : ggml Georgi Gerganov 2025-05-07 13:17:48 +03:00
  • 5eac2a3fbb vulkan : fix lint (llama/0) Georgi Gerganov 2025-05-02 20:57:07 +03:00
  • 42938398f9 ggml : Enable MMA for BF16 in llamafile_sgemm (llama/13148) shalinib-ibm 2025-05-02 22:23:12 +05:30
  • a8fe90ae15 rpc : avoid uninitialized memory in serialize_tensor (llama/13210) Justin Santa Barbara 2025-05-01 17:32:11 -04:00
  • c5a5a2da5b ggml: Don't assert fail when tensor data changes (llama/13222) Jesse Gross 2025-05-01 13:46:10 -07:00
  • 8316bfd82b build : fix build info on windows (llama/13239) Diego Devesa 2025-05-01 21:48:08 +02:00
  • fd1cb9fc12 vulkan: Add bfloat16 support (llama/12554) Jeff Bolz 2025-05-01 13:49:39 -05:00
  • 17f6b8225e vulkan: Handle src1 batch dimension in non-contiguous mat-vec-mul shader (llama/13191) Jeff Bolz 2025-05-01 13:19:31 -05:00
  • 6374ea32ca vulkan : kernels for depthwise 2D convolution (CONV_2D_DW) (ggml/1204) Acly 2025-05-02 18:02:34 +02:00
  • 3a66f9f248 ci : zip windows artifacts for release uploading (#3124) Daniel Bevenius 2025-05-07 13:12:08 +02:00
  • 0055356fbc cli : avoid std::exchange [sync-ggml-25-05-07] Georgi Gerganov 2025-05-07 13:22:47 +03:00
  • eeaa1cd035 sync : ggml Georgi Gerganov 2025-05-07 13:17:48 +03:00
  • a652c8bf72 vulkan : fix lint (llama/0) Georgi Gerganov 2025-05-02 20:57:07 +03:00
  • 0630539c8a ggml : Enable MMA for BF16 in llamafile_sgemm (llama/13148) shalinib-ibm 2025-05-02 22:23:12 +05:30
  • a7988d76db rpc : avoid uninitialized memory in serialize_tensor (llama/13210) Justin Santa Barbara 2025-05-01 17:32:11 -04:00
  • 37ac0264ef ggml: Don't assert fail when tensor data changes (llama/13222) Jesse Gross 2025-05-01 13:46:10 -07:00
  • 5a9ccde7da build : fix build info on windows (llama/13239) Diego Devesa 2025-05-01 21:48:08 +02:00
  • cde0e50536 vulkan: Add bfloat16 support (llama/12554) Jeff Bolz 2025-05-01 13:49:39 -05:00
  • df458380d6 vulkan: Handle src1 batch dimension in non-contiguous mat-vec-mul shader (llama/13191) Jeff Bolz 2025-05-01 13:19:31 -05:00
  • 87b88ed01c vulkan : kernels for depthwise 2D convolution (CONV_2D_DW) (ggml/1204) Acly 2025-05-02 18:02:34 +02:00
  • 9b584b0cc0 ci : add zip extension to xcframework artifact name (#3120) Daniel Bevenius 2025-05-07 12:02:29 +02:00
  • 09846f4e12 whisper: remove MSVC warnings pragmas (#3090) Daniel Bevenius 2025-05-05 13:09:35 +02:00
  • bcf1ed0163 server: update abort mechanism to handle HTTP connection closure (#3112) Sacha Arbonel 2025-05-05 07:16:54 +02:00
  • 934d4b3083 cli : support "-" for stdout like stdin (#3050) Daniel Tang 2025-05-05 01:15:39 -04:00
  • 988dcd4b5b docs : Update cli documentation (#3102) Arpit Jain 2025-05-02 20:18:33 +08:00
  • 9f540ad8cb cmake : removed stdc++fs (#3097) Jared Tweed 2025-05-02 02:41:35 -07:00
  • 1fa17bc752 server : update httplib.h to version 0.20.0 (#3101) Sacha Arbonel 2025-05-02 06:09:41 +02:00
  • 366082d072 ruby : refine HTTP cache feature (#3109) KITAITI Makoto 2025-05-01 23:04:53 +09:00
  • 0778b6ff5f talk-llama : sync llama.cpp Georgi Gerganov 2025-05-01 10:43:30 +03:00
  • 5cd59c9396 sync : ggml Georgi Gerganov 2025-05-01 10:42:48 +03:00
  • d052e64d42 CUDA: batched+noncont MMQ, refactor bs>1 MoE code (llama/13199) Johannes Gäßler 2025-04-30 23:12:59 +02:00
  • 780750a108 vulkan: use uint array index to avoid glslang bug (llama/13193) Jeff Bolz 2025-04-30 07:38:37 -05:00
  • 919c78e618 ggml : fix ppc64le build (llama/13176) shalinib-ibm 2025-04-30 16:47:08 +05:30
  • dc288f84cd feat(ggml-cpu): enable z17 compile (llama/13182) Aaron Teo 2025-04-30 17:47:35 +08:00
  • 1543a3600c CUDA: fix non-cont. inputs for batched mat mul (llama/13155) Johannes Gäßler 2025-04-29 16:00:27 +02:00
  • 4872355f6e fix(rpc): Improve input validation and error handling (llama/13069) Ville Vesilehto 2025-04-28 21:00:20 +03:00
  • 1a76e97c28 SYCL: Add all missing unary kernels (llama/13074) Akarshan Biswas 2025-04-28 15:03:25 +05:30
  • 7017c1d37d musa: fix typo in cc control (llama/13144) R0CKSTAR 2025-04-28 15:33:28 +08:00
  • 670bf02662 CUDA: fix q_nope_absorbed prec for DS 2 Lite f16 (llama/13137) Johannes Gäßler 2025-04-28 09:29:26 +02:00
  • 9fff2f751c musa: fix build warning (llama/13129) R0CKSTAR 2025-04-27 19:22:49 +08:00
  • 46392f733f ggml: move fp16/bf16 conversion optimizations to CPU backend + export conversion APIs (llama/13107) SXX 2025-04-26 22:05:31 +08:00
  • eeb259909e change the reorder tensor from init to execute OP (llama/13003) Neo Zhang Jianyu 2025-04-25 17:37:51 +08:00
  • fe21ddf0dc rpc : do not wait for response when sending RPC_CMD_SET_TENSOR (llama/12943) Radoslav Gerganov 2025-04-25 10:08:08 +03:00
  • 33bdbfbb33 ggml : fix ggml_gallocr_ptr type (ggml/1205) Diego Devesa 2025-04-30 15:20:40 +02:00
  • 0f49edf0f3 whisper : add check that target name exists (#3103) Daniel Bevenius 2025-05-01 10:05:24 +02:00
  • 25efcfe3ed server : add --no-gpu option to print usage output (#3098) Daniel Bevenius 2025-05-01 08:15:12 +02:00
  • edbd4cb7f5 ruby : ignore "Downloading" output in test_log_suppress (#3106) Daniel Bevenius 2025-05-01 08:12:48 +02:00
  • 3ae9b8416a make : fix samples glob pattern (#3100) Georgi Gerganov 2025-04-30 14:21:51 +03:00
  • 10acc21fa3 make : fix samples glob pattern [gg/make-fix-glob] Georgi Gerganov 2025-04-30 14:20:50 +03:00
  • 55d73a13f5 ggml : suppress Windows compiler warnings (#3075) Daniel Bevenius 2025-04-29 15:47:55 +02:00
  • 2e30e6df59 whisper : fix grammar advance stack warning (#3087) Daniel Bevenius 2025-04-28 19:11:38 +02:00
  • f0171f0616 examples : expose language detection probabilities to server example (#3044) Sacha Arbonel 2025-04-28 18:25:45 +02:00
  • b7db9e7aac whisper : remove empty .gitmodules file [no ci] (#3085) Daniel Bevenius 2025-04-28 15:52:05 +02:00
  • f3c42399a3 talk-llama : sync llama.cpp (#3084) Georgi Gerganov 2025-04-28 16:40:23 +03:00
  • 28dcdff4c5 ci : disable publishing of java binding [no ci] (#3086) Daniel Bevenius 2025-04-28 15:38:52 +02:00
  • 50218b935d build : Add Moore Threads GPU support and update GitHub workflow for MUSA build (#3069) R0CKSTAR 2025-04-28 16:06:41 +08:00
  • f9b2dfdd8c examples : fix deprecated FFmpeg functions (#3073) Pedro 2025-04-28 01:16:50 -03:00
  • 50fda73f4c ruby : add encoder begin callback related methods (#3076) KITAITI Makoto 2025-04-26 04:33:11 +09:00
  • 1c20f46887 ci : enable bindings java job (#3070) Daniel Bevenius 2025-04-25 14:56:06 +02:00
  • adaea088bc ruby : add cmake option (#0) Georgi Gerganov 2025-04-24 20:38:43 +03:00
  • 6c0d843f9d cuda : fix unused variable compile warning (#0) Georgi Gerganov 2025-04-24 18:59:06 +03:00
  • efb800557f sync : ggml Georgi Gerganov 2025-04-24 18:41:48 +03:00
  • 337becefb9 opencl : remove obsolete files (skip) (ggml/1200) Georgi Gerganov 2025-04-24 18:41:17 +03:00
  • 11ae30c19e sync : ggml Georgi Gerganov 2025-04-24 18:41:36 +03:00
  • 88c3cecd43 opencl: split ggml-opencl.cl into multiple files and cleanup (llama/12886) lhez 2025-04-24 17:46:49 +03:00
  • fe4acb33e3 ggml : fix trailing whitespaces (llama/0) Georgi Gerganov 2025-04-24 17:22:27 +03:00
  • fd5a3e1bc6 CUDA: use switch statements in constexpr functions (llama/13095) Johannes Gäßler 2025-04-24 15:57:10 +02:00
  • 01e1600edd metal : fix floating-point range of attention scores in FA kernels (llama/13090) Georgi Gerganov 2025-04-24 10:38:30 +03:00
  • cf3eb291ab vulkan: matmul gcn tuning (llama/13016) Eve 2025-04-24 07:18:33 +00:00
  • 3d54b68ea7 CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (llama/13014) Johannes Gäßler 2025-04-22 21:27:40 +02:00
  • 11218294db ggml : add SSE 4.2 and x64 base variant for CPUs without AVX (llama/12871) Diego Devesa 2025-04-21 18:13:51 +02:00
  • 33c89ade7d SYCL: Add non-contiguous support in ROPE (llama/12993) Akarshan Biswas 2025-04-21 19:13:30 +05:30
  • 27a56e7243 vulkan: support noncontiguous rms_norm (llama/13031) Jeff Bolz 2025-04-20 03:50:02 -05:00
  • f4ca3e2f9c metal: add neg operator (llama/13029) Jeffrey Morgan 2025-04-19 22:28:40 -07:00
  • 0287a5c51b SYCL: Refactor and enable FP16 in binary broadcast OPs (llama/12975) Akarshan Biswas 2025-04-18 19:27:56 +05:30
  • 24d29c55df rpc : add RPC_CMD_HELLO (llama/12955) Radoslav Gerganov 2025-04-18 10:13:42 +03:00
  • 36019c35a3 graph : make FA compatible with MLA + add initial Metal kernels (llama/12953) Georgi Gerganov 2025-04-17 18:16:36 +03:00
  • 4e936e2afa ggml: Re-enable CUDA graphs in presence of CONT and DUP nodes (llama/12970) Alan Gray 2025-04-17 14:19:42 +01:00
  • 314ce5981e CANN: Add support for async operator submission (llama/12864) hipudding 2025-04-17 20:34:16 +08:00
  • cb7642b0f5 opencl: fix incorrect local_size index in profiling log (llama/12868) kimminsu 2025-04-17 06:25:57 +09:00
  • 7db8f278f0 vulkan: enable coopmat2 FA gqa and split_k optimizations more often (llama/12931) Jeff Bolz 2025-04-16 13:37:25 -05:00
  • be42a19eab CANN: Add 310P operator support check (llama/12962) Chenguang Li 2025-04-16 16:21:05 +08:00
  • b8755670ca metal : add FA-vec kernels for head size 96 (llama/12952) Georgi Gerganov 2025-04-15 14:45:05 +03:00
  • 483eecae62 CANN: Add x86 build ci (llama/12950) hipudding 2025-04-15 19:08:55 +08:00