Vulkan k-quant mmq and ggml-backend offload functionality (llama/6155)
* Fix Vulkan no kv offload incoherence
* Add k-quant mul mat mat shaders
* Rework working buffer allocation, reduces vram use noticeably
Clean up cpu assist code, replaced with ggml-backend offload function
* Default to all dedicated GPUs
* Add fallback for integrated GPUs if no dedicated GPUs are found
* Add debug info showing which device is allocating memory
* Fix Intel dequant issue
Fix validation issue
* Fix Vulkan GGML_OP_GET_ROWS implementation
* Clean up merge artifacts
* Remove Vulkan warning