What is the CVSS score of CVE-2025-52566?

CVE-2025-52566 has a CVSS 3.1 base score of 8.6 (High). CVSS vector: CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H. EPSS exploitation probability: 0.1%.

Which versions of llama.cpp are affected by CVE-2025-52566?

All llama.cpp versions prior to b5721 are affected. CVE-2025-52566 is a high-severity heap overflow vulnerability in llama.cpp that could allow attackers to crash or potentially execute code through malicious text input to LLM inference systems. A patch is available in version b5721.

Is there a public PoC exploit available for CVE-2025-52566?

Yes, a public proof-of-concept exploit is available for CVE-2025-52566. Prioritise patching immediately. EPSS: 0.1%.

How to fix or mitigate CVE-2025-52566?

Within 24 hours: Inventory all systems running llama.cpp and assess exposure in production environments.

Within 7 days: Apply vendor patch version b5721 or later to all affected systems, prioritizing production inference endpoints.

Within 30 days: Conduct post-patch validation testing and document patching completion in vulnerability management systems.

Python CVE-2025-52566

EUVD-2025-19074 HIGH

Buffer Overflow (CWE-119)

2025-06-24 security-advisories@github.com

Buffer Overflow Heap Overflow Integer Overflow Python Llama.Cpp Suse

8.6 (CVSS 3.1, GitHub Advisory)

Severity by source

GitHub Advisory (primary): 8.6 HIGH, AV:L/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H
Ubuntu: MEDIUM (qualitative)
SUSE: HIGH (qualitative)

Primary rating from GitHub Advisory.

CVSS Vector (GitHub Advisory)

CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H

Attack Vector: Local
Attack Complexity: Low
Privileges Required: None
User Interaction: Required
Scope: Changed
Confidentiality: High
Integrity: High
Availability: High
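As a worked check, the 8.6 base score can be recomputed from the vector AV:L/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H using the CVSS 3.1 base equations for Scope: Changed. The following is a minimal sketch; the metric weights come from the CVSS 3.1 specification, while every function and constant name is hypothetical and chosen only for illustration. Scores are returned in tenths (86 means 8.6) to avoid floating-point comparisons.

```cpp
// Sketch: recompute a CVSS 3.1 base score (Scope: Changed) at compile time.
// Names are illustrative only; weights are the published CVSS 3.1 values.

constexpr double kAttackVectorLocal      = 0.55;  // AV:L
constexpr double kAttackComplexityLow    = 0.77;  // AC:L
constexpr double kPrivilegesNone         = 0.85;  // PR:N
constexpr double kUserInteractionRequired = 0.62; // UI:R
constexpr double kImpactHigh             = 0.56;  // C:H, I:H, A:H

// x raised to a small non-negative integer power.
constexpr double ipow(double x, int n) {
    return n == 0 ? 1.0 : x * ipow(x, n - 1);
}

// Impact Sub-Score: ISS = 1 - (1 - C)(1 - I)(1 - A).
constexpr double impact_sub_score(double c, double i, double a) {
    return 1.0 - (1.0 - c) * (1.0 - i) * (1.0 - a);
}

// Impact when Scope is Changed.
constexpr double impact_scope_changed(double iss) {
    return 7.52 * (iss - 0.029) - 3.25 * ipow(iss - 0.02, 15);
}

// Exploitability = 8.22 * AV * AC * PR * UI.
constexpr double exploitability(double av, double ac, double pr, double ui) {
    return 8.22 * av * ac * pr * ui;
}

// CVSS 3.1 Roundup: smallest one-decimal value >= x, returned in tenths.
constexpr int roundup_tenths(double x) {
    const long long scaled = static_cast<long long>(x * 100000.0 + 0.5);
    return scaled % 10000 == 0 ? static_cast<int>(scaled / 10000)
                               : static_cast<int>(scaled / 10000) + 1;
}

// Base score for Scope: Changed, in tenths.
constexpr int base_score_tenths_scope_changed(double impact, double expl) {
    if (impact <= 0.0) return 0;
    const double raw = 1.08 * (impact + expl);
    return roundup_tenths(raw < 10.0 ? raw : 10.0);
}

constexpr double kImpact = impact_scope_changed(
    impact_sub_score(kImpactHigh, kImpactHigh, kImpactHigh));
constexpr double kExploitability = exploitability(
    kAttackVectorLocal, kAttackComplexityLow,
    kPrivilegesNone, kUserInteractionRequired);
constexpr int kBaseScoreTenths =
    base_score_tenths_scope_changed(kImpact, kExploitability);
```

With these weights the impact is roughly 6.05 and the exploitability roughly 1.83, so 1.08 × (6.05 + 1.83) ≈ 8.51, which rounds up to 8.6.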

Lifecycle Timeline

Mar 15, 2026 - 22:36 (euvd): EUVD ID assigned, EUVD-2025-19074
Mar 15, 2026 - 22:36 (vuln.today): Analysis generated
Mar 15, 2026 - 22:36 (nvd): Patch released, patch available
Aug 27, 2025 - 14:01 (vuln.today): PoC detected, public exploit code
Jun 24, 2025 - 04:15 (nvd): CVE published, HIGH 8.6

Description (GitHub Advisory)

llama.cpp provides inference of several LLM models in C/C++. Prior to version b5721, a signed vs. unsigned integer overflow in llama.cpp's tokenizer implementation (llama_vocab::tokenize, src/llama-vocab.cpp:3036) causes unintended behavior in the size comparison that guards token copying. This allows a heap overflow in the llama.cpp inference engine via carefully manipulated text input during tokenization. This issue has been patched in version b5721.

Analysis (AI)

CVE-2025-52566 is a signed vs. unsigned integer overflow vulnerability in llama.cpp's tokenizer (llama_vocab::tokenize function) that enables heap buffer overflow during text tokenization. This affects all versions of llama.cpp prior to b5721, and attackers can trigger the vulnerability with specially crafted text input during the inference process, potentially achieving code execution with high confidentiality, integrity, and availability impact. The vulnerability requires local access and user interaction but has a high CVSS score of 8.6; KEV status and active exploitation data are not currently available, but the patch exists in version b5721.
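Since every build prior to b5721 is affected, an inventory script needs a way to compare build tags. The sketch below assumes tags have the form `b` followed by a decimal build number that increases over time; the function names and the decision to treat malformed tags as unpatched are illustrative choices, not part of llama.cpp.

```cpp
// Sketch: decide whether a llama.cpp build tag is at or past the patched build.
// Assumes tags look like "b5721"; helper names are hypothetical.

constexpr int kFirstPatchedBuild = 5721;

// Returns the numeric part of a tag such as "b5721", or -1 if malformed.
constexpr int parse_build_tag(const char* tag) {
    if (tag == nullptr || tag[0] != 'b' || tag[1] == '\0') return -1;
    int n = 0;
    for (int i = 1; tag[i] != '\0'; ++i) {
        // Reject non-digits and cap at 9 digits so `n` cannot overflow.
        if (tag[i] < '0' || tag[i] > '9' || i > 9) return -1;
        n = n * 10 + (tag[i] - '0');
    }
    return n;
}

// Malformed or unknown tags are conservatively reported as unpatched.
constexpr bool is_patched_build(const char* tag) {
    return parse_build_tag(tag) >= kFirstPatchedBuild;
}
```

For example, `is_patched_build("b5720")` is false and `is_patched_build("b5721")` is true.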

Technical Context (AI)

llama.cpp is a C/C++ implementation for inference of Large Language Models (LLMs), designed to run efficiently on consumer hardware. The vulnerability exists in src/llama-vocab.cpp at line 3036 within the tokenizer implementation. The root cause is a signed vs. unsigned integer overflow (classified under CWE-119: Improper Restriction of Operations within the Bounds of a Memory Buffer) occurring during token copying size comparisons. When tokenizing user-supplied text, the vulnerable code miscalculates the size comparison because of a mismatch between signed and unsigned integer types, allowing writes beyond allocated heap memory. The tokenizer fails to properly validate the destination buffer's capacity before copying token data into it. The vulnerable code path is triggered during normal LLM inference when processing text tokens, making it reachable from any application using llama.cpp for inference tasks.
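The class of bug described here can be illustrated with a simplified capacity check. This is a sketch of the general pattern only, not the code from src/llama-vocab.cpp: `fits_flawed` and `fits_checked` are hypothetical names, and the example assumes a platform where converting an out-of-range `size_t` to `int` wraps to a negative value, as it does on common compilers.

```cpp
#include <climits>
#include <cstddef>

// Flawed pattern: the unsigned count is narrowed to a signed int before the
// comparison. A count above INT_MAX typically wraps to a negative number,
// so "capacity < count" is false and the copy is wrongly allowed.
constexpr bool fits_flawed(int capacity, std::size_t count) {
    return !(capacity < static_cast<int>(count));
}

// Safer pattern: reject negative capacities first, then compare in the
// unsigned domain so that no narrowing conversion takes place.
constexpr bool fits_checked(int capacity, std::size_t count) {
    return capacity >= 0 && count <= static_cast<std::size_t>(capacity);
}

// A count one past INT_MAX, the smallest value that triggers the wrap.
constexpr std::size_t kHugeCount = static_cast<std::size_t>(INT_MAX) + 1u;
```

For ordinary sizes the two checks agree. They diverge only for counts above `INT_MAX`, where `fits_flawed(16, kHugeCount)` accepts a copy into a 16-element buffer and `fits_checked(16, kHugeCount)` rejects it.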

Remediation (AI)

Immediate remediation: Upgrade llama.cpp to version b5721 or later, which contains the patch for the signed vs. unsigned integer overflow.

For development environments: Pull the latest code from the llama.cpp repository (tag b5721 or later) and rebuild.

For production deployments:
(1) Update to the patched version immediately.
(2) If immediate patching is not feasible, implement input validation/sanitization at the application layer before passing text to the llama.cpp tokenizer (e.g., limit input size, reject unusual character sequences, validate token counts).
(3) Reduce attack surface by restricting who can submit inference requests to llama.cpp services.
(4) Run llama.cpp processes with minimal privileges and in sandboxed/containerized environments to limit the impact of potential code execution.

Long-term: Monitor the official llama.cpp repository (github.com/ggerganov/llama.cpp) for security updates and establish a patching cadence for AI/ML dependencies.
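The application-layer input limit suggested as a stopgap could look roughly like the sketch below. The 1 MiB cap and the names `kMaxPromptBytes` and `prompt_size_ok` are assumptions made for illustration; a real deployment would choose a limit based on its model's context size. A guard like this reduces exposure but does not replace upgrading to b5721 or later.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical upper bound on prompt size in bytes (1 MiB).
constexpr std::size_t kMaxPromptBytes = 1u << 20;

// Returns true if a prompt of `n_bytes` may be forwarded to the tokenizer.
// Lengths that do not fit in a signed int are always rejected, because
// signed length and count parameters are where this bug class arises.
constexpr bool prompt_size_ok(std::size_t n_bytes,
                              std::size_t max_bytes = kMaxPromptBytes) {
    return n_bytes <= max_bytes
        && n_bytes <= static_cast<std::size_t>(INT_MAX);
}
```

A request handler would call `prompt_size_ok(text.size())` before tokenization and return an error to the client when it fails.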