Commit fa966b9

Vulkan k-quant mmq and ggml-backend offload functionality (llama/6155)

* Fix Vulkan no kv offload incoherence
* Add k-quant mul mat mat shaders
* Rework working buffer allocation, reduces vram use noticeably
  Clean up cpu assist code, replaced with ggml-backend offload function
* Default to all dedicated GPUs
* Add fallback for integrated GPUs if no dedicated GPUs are found
* Add debug info which device is allocating memory
* Fix Intel dequant issue
  Fix validation issue
* Fix Vulkan GGML_OP_GET_ROWS implementation
* Clean up merge artifacts
* Remove Vulkan warning
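
The device-selection items above (default to all dedicated GPUs, fall back to integrated GPUs if none are found) can be illustrated with a small standalone sketch. This is not the code from ggml-vulkan.cpp: `DeviceKind`, `DeviceInfo`, and `pick_default_devices` are hypothetical names, the device categories only loosely mirror Vulkan's physical device types, and choosing just the first integrated GPU in the fallback is an assumption the commit message does not confirm.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical device categories (assumption: loosely modeled on
// Vulkan's discrete / integrated physical device types).
enum class DeviceKind { Discrete, Integrated, Other };

// Hypothetical per-device record; the real backend queries Vulkan instead.
struct DeviceInfo {
    std::string name;
    DeviceKind  kind;
};

// Returns the indices of the devices to use by default:
//   1. every dedicated (discrete) GPU, if at least one exists;
//   2. otherwise the first integrated GPU (assumption: only one is picked);
//   3. otherwise an empty list.
std::vector<std::size_t> pick_default_devices(const std::vector<DeviceInfo>& devices) {
    std::vector<std::size_t> picked;

    // Pass 1: collect all dedicated GPUs.
    for (std::size_t i = 0; i < devices.size(); ++i) {
        if (devices[i].kind == DeviceKind::Discrete) {
            picked.push_back(i);
        }
    }
    if (!picked.empty()) {
        return picked;
    }

    // Pass 2: no dedicated GPU found, fall back to an integrated one.
    for (std::size_t i = 0; i < devices.size(); ++i) {
        if (devices[i].kind == DeviceKind::Integrated) {
            picked.push_back(i);
            break;
        }
    }
    return picked;
}
```

With one integrated and two dedicated GPUs in the list, the sketch returns only the two dedicated indices. With only an integrated GPU present, it returns that single index.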

1 parent b83a9fc, commit fa966b9

3 files changed, +328 -353 lines changed

ggml-vulkan.cpp
ggml-vulkan.h
ggml.c
