ggml-org / llama.cpp Public

Fork 18.4k
Star 111k

Pull requests: ggml-org/llama.cpp

1,008 Open 10,725 Closed

Pull requests list

fix: pass ctx (not &ctx) to HAP_power_set in ggml-hexagon HMX power-up calls

#23367 opened May 19, 2026 by mvanhorn

fix: guard cpy_scalar_transpose against strided dst in ggml-cuda cpy dispatch

#23366 opened May 19, 2026 by mvanhorn

Add trace (level 5) to -lv help text and emit 'T' prefix for trace log lines

#23365 opened May 19, 2026 by mvanhorn

fix: restore inter-iteration __syncthreads in ggml-cuda ssm_scan_f32 to prevent smemB/smemC race

#23364 opened May 19, 2026 by mvanhorn

Align openvino and cann Dockerfiles with other backends by adding GGML_BACKEND_DL and GGML_CPU_ALL_VARIANTS flags

#23363 opened May 19, 2026 by mvanhorn

Use password input type for WebUI API key field to prevent browser autofill in clear text

#23362 opened May 19, 2026 by mvanhorn

ggml: fix SCHED_DEBUG output when logging with timestamps

#23360 opened May 19, 2026 by yiding

ggml-cpu: Optimized Arm NEON cpu q1_0 dot (with plain/DP/I8MM)

#23358 opened May 19, 2026 by pl752 Contributor

ui: fix stop/continue during an agentic loop

#23356 opened May 19, 2026 by ServeurpersoCom Contributor

Add keepalive messages to HTTP streaming connections

#23355 opened May 19, 2026 by komadori82 • Draft

metal : optimize pad

#23354 opened May 19, 2026 by ggerganov Member

cmake/ui : refactor the build

#23352 opened May 19, 2026 by aldehir Contributor • Draft

ngram-map : use quality-weighted scoring and EMA acceptance tracking

#23350 opened May 19, 2026 by neerajdad123-byte

1 task done

ggml-cuda: tune RDNA3 Q6_K MMVQ nwarps

#23349 opened May 19, 2026 by ravel7524

Support for DeepseekV32ForCausalLM with generic DeepSeek Sparse Attention (DSA) implementation
Labels: ggml, model, python, testing

#23346 opened May 19, 2026 by fairydreaming Collaborator

mtmd : fix DeepSeek-OCR image processing, server issue and tests
Labels: examples, python, testing

#23345 opened May 19, 2026 by sfallah Contributor

vocab : add domyn-small Metaspace BPE pre-tokenizer type

#23343 opened May 19, 2026 by martin-cimmino

server : add --slot-context for per-slot context size configuration
Labels: examples, server

#23340 opened May 19, 2026 by AlexRednic • Draft

tests : move save-load-state from examples to tests
Labels: devops, examples, testing

#23336 opened May 19, 2026 by ggerganov Member • Draft

1 task done

CUDA: add STQ1_0 dequantization kernel
Labels: examples, ggml, Nvidia GPU, python

#23332 opened May 19, 2026 by antrc2

ui: Refactor isMobile as reactive value in viewport store
Labels: examples, server/ui

#23330 opened May 19, 2026 by allozaur Contributor

hunyuan-vl : merge HunyuanOCR into HunyuanVL and fix OCR vision precision
Labels: examples, model, python, server

#23329 opened May 19, 2026 by wendadawen

server: expose speculative decoding counters in Prometheus metrics
Labels: examples, server

#23328 opened May 19, 2026 by boxcee • Draft

opencl: refactor backend initilization
Labels: ggml, OpenCL

#23318 opened May 19, 2026 by lhez Contributor

OpenCL: OP_GATED_DELTA_NET
Labels: ggml, OpenCL

#23312 opened May 19, 2026 by ymcki Contributor
