Pull requests: ggml-org/llama.cpp

Pull requests list

powerpc: add FP16 MMA path for Q4/Q8 matmul [ggml]
#19709 opened Feb 18, 2026 by shalinib-ibm
Q5_K - Block Interleaving Implementation for x86 SIMD (AVX512/AVX2) [ggml]
#19707 opened Feb 18, 2026 by Manogna-Sree
Q6_K - Block Interleaving Implementation for x86 SIMD (AVX512/AVX2) [ggml]
#19706 opened Feb 18, 2026 by Manogna-Sree
ggml-webgpu: Add unary op (SQR, SQRT, SIN, COS) support. [documentation, ggml]
#19700 opened Feb 18, 2026 by yomaytk
Add Mistral Voxtral Mini 4B Realtime 2602 4B streaming ASR support [demo, examples, model, python]
#19698 opened Feb 17, 2026 by Acceldium
New option GGML_CUDA_FORCE_CUBLAS_COMPUTE_32F to use fp32 as compute type in cuBLAS [documentation, ggml, Nvidia GPU]
#19697 opened Feb 17, 2026 by wallentri88
server : fix V-L embedding model support
#19694 opened Feb 17, 2026 by oliveagle
test(server): add multi-image and no-image vision API tests [examples, python, server]
#19691 opened Feb 17, 2026 by jorgeutd
3 tasks done
model : Add tokenizer from LFM2.5-Audio-1.5B [model, python]
#19687 opened Feb 17, 2026 by tdakhran
CUDA: fix kernel selection logic for tile FA [ggml, Nvidia GPU]
#19686 opened Feb 17, 2026 by JohannesGaessler
Add Pylint workflow for Python code analysis [devops]
#19671 opened Feb 16, 2026 by kerrrang9214-tech Draft
Allow partial success of seq_rm for hybrid memory
#19670 opened Feb 16, 2026 by Nekotekina
Add Kimi Linear to unified delta net [model]
#19668 opened Feb 16, 2026 by ymcki
models : dedup qwen35 graphs [model]
#19660 opened Feb 16, 2026 by ggerganov
1 of 2 tasks
avx2: compute ksigns instead of loading from table [ggml]
#19657 opened Feb 16, 2026 by dfriehs
common : fix Step-3.5-Flash format detection and thinking support [testing]
#19635 opened Feb 15, 2026 by jesseposner
Vulkan Scalar Flash Attention Refactor [ggml, Vulkan]
#19625 opened Feb 14, 2026 by 0cc4m
metal: use mul_mv_ext for large n on non-simdgroup_mm GPUs [Apple Metal, ggml]
#19600 opened Feb 13, 2026 by ai-janitor
3 of 4 tasks