Pull requests: ggml-org/llama.cpp
Pull requests list
Fix Ernie4.5 MoE without shared experts
python
python script changes
#14746
opened Jul 17, 2025 by
pwilkin
[ROCm] Fix HIP version check for HIPBLAS V2 API compatibility
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#14744
opened Jul 17, 2025 by
danielholanda
metal: SSM_SCAN performance
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#14743
opened Jul 17, 2025 by
gabe-l-hart
Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#14741
opened Jul 17, 2025 by
ORippler
examples : predicted output for text generation
examples
#14739
opened Jul 17, 2025 by
iamlemec
Improve Mistral models integration with llama.cpp
python
python script changes
#14737
opened Jul 17, 2025 by
juliendenize
•
Draft
Documentation: Update build.md's Vulkan section
documentation
Improvements or additions to documentation
#14736
opened Jul 17, 2025 by
rspOverflow
CUDA: skip masked out KQ slices in mma FA kernel
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
#14735
opened Jul 17, 2025 by
JohannesGaessler
use max work group size for device to replace the magic number
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#14732
opened Jul 17, 2025 by
NeoZhangJianyu
feat: Add optional prompt processing progress streaming
examples
server
#14731
opened Jul 17, 2025 by
baonudesifeizhai
mtmd : Support jinja in libmtmd (Only for QwenVL and Qwen Omni)
examples
#14730
opened Jul 17, 2025 by
alielmorsy
server: add prompt processing progress streaming for /completion endpoint #14685
examples
server
#14728
opened Jul 16, 2025 by
baonudesifeizhai
CUDA: set_rows + cpy.cu refactor
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#14712
opened Jul 16, 2025 by
am17an
vulkan: Add logging for bf16 features to ggml_vk_print_gpu_info (#13274)
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#14707
opened Jul 16, 2025 by
Peter0x44
Fix KleidiAI compilation errors with -DGGML_NATIVE=OFF (issue #14464)
ggml
changes relating to the ggml tensor library for machine learning
#14700
opened Jul 15, 2025 by
baonudesifeizhai
Adding a simple-function-call example - hopefully not doing anything wrong
examples
#14682
opened Jul 14, 2025 by
klogdotwebsitenotdotcom
kleidiai: add support for get_rows
ggml
changes relating to the ggml tensor library for machine learning
#14676
opened Jul 14, 2025 by
chaxu01
bug fix: handle saving/loading null layers in recurrent memory
#14675
opened Jul 14, 2025 by
l3utterfly
Add Pad Reflect 1D CUDA support
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#14659
opened Jul 13, 2025 by
YavorGIvanov
webui : add a preset feature to the settings
examples
server
#14649
opened Jul 12, 2025 by
gabriellarson
Add CUDA non-contiguous Unary Ops support
build
Compilation issues
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
#14639
opened Jul 11, 2025 by
YavorGIvanov
OpenCL: add mul_mat_f16_f32_image kernel
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
#14635
opened Jul 11, 2025 by
rmatif
Add EXAONE 4.0 model architecture
python
python script changes
#14630
opened Jul 11, 2025 by
lgai-exaone