What is the CVSS score of CVE-2024-23496?

CVE-2024-23496 has a CVSS 3.1 base score of 8.8 (High), with vector CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H. Its EPSS exploitation probability is about 0.15%.

Which versions of llama.cpp are affected by CVE-2024-23496?

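The CVE description identifies llama.cpp commit 18c2e17 as affected; no released version range is listed. llama.cpp, a widely deployed open-source LLM inference framework, contains a high-severity remote code execution vulnerability (CVE-2024-23496, CVSS 8.8) triggered by processing malicious model files.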

Is there a public PoC exploit available for CVE-2024-23496?

Yes, a public proof-of-concept exploit is available for CVE-2024-23496. Prioritise patching immediately, even though the EPSS score is low (about 0.15%).

How to fix or mitigate CVE-2024-23496?

Within 24 hours: Identify all llama.cpp deployments and model sources across the organization; restrict model loading to pre-approved sources only and disable automatic model downloads.

Within 7 days: Implement mandatory cryptographic validation (checksums, digital signatures) for all model files before loading; isolate llama.cpp services to dedicated network segments with minimal outbound access; enforce run-time application whitelisting.

Within 30 days: Establish a model supply chain governance process requiring vendor attestation; continuously monitor the official llama.cpp advisory for patch availability; conduct risk assessment of previously loaded models.
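The checksum part of the 7-day step could look roughly like the following. This is a minimal sketch, not an established tool: the APPROVED_MODELS allowlist, the file name in it, and the function names are all hypothetical, and a real deployment would load digests from a signed manifest rather than hard-coding them.

```python
import hashlib
import hmac
from pathlib import Path

# Hypothetical allowlist: file name -> expected SHA-256 hex digest.
# The entry below is a placeholder, not the digest of any real model.
APPROVED_MODELS = {
    "example-model.gguf": "0" * 64,
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so multi-gigabyte models are never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_approved(path: Path, allowlist: dict[str, str] = APPROVED_MODELS) -> bool:
    """Return True only if the file name is allowlisted and its digest matches."""
    expected = allowlist.get(path.name)
    if expected is None:
        return False  # unknown files are rejected by default
    return hmac.compare_digest(sha256_of(path), expected.lower())
```

The check is meant to run before the file is handed to llama.cpp at all, so a tampered or unknown model never reaches the GGUF parser. A digest match only shows the file is the one that was approved; it says nothing about whether the approved file was safe in the first place.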

llama.cpp CVE-2024-23496

HIGH

Integer Overflow or Wraparound (CWE-190)

2024-02-26 talos-cna@cisco.com

Tags: RCE, Integer Overflow, Buffer Overflow, Llama Cpp

8.8 (CVSS 3.1 · NVD)

Severity by source

NVD (primary): 8.8 HIGH, AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

Primary rating from NVD · only source for this CVE.

CVSS Vector (NVD)

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

Attack Vector: Network
Attack Complexity: Low
Privileges Required: None
User Interaction: Required
Scope: Unchanged
Confidentiality: High
Integrity: High
Availability: High
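As a worked check, the 8.8 base score can be reproduced from the vector above with the CVSS v3.1 base-score equations for Scope: Unchanged. The sketch below is illustrative only: the function names are made up here, and it deliberately rejects Scope: Changed vectors, which use different Privileges Required weights and a different impact formula.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal number >= value, float-safe."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(vector: str) -> float:
    """Compute the base score for a CVSS:3.1 vector with S:U."""
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts.get("S") != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score_scope_unchanged("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"))  # 8.8
```

For this CVE the impact sub-score is about 5.87 and the exploitability sub-score about 2.84; their sum, 8.708, rounds up to 8.8. The User Interaction: Required metric is what keeps the score below the 9.8 that the same vector would get with UI:N.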

Description (CVE.org)

A heap-based buffer overflow vulnerability exists in the GGUF library gguf_fread_str functionality of llama.cpp Commit 18c2e17. A specially crafted .gguf file can lead to code execution. An attacker can provide a malicious file to trigger this vulnerability.

Analysis (AI)

Remote code execution in llama.cpp (commit 18c2e17) occurs when the GGUF library's gguf_fread_str function parses a maliciously crafted .gguf model file, triggering a heap-based buffer overflow rooted in integer overflow handling (CWE-190). Any user or service loading an untrusted GGUF model into a vulnerable llama.cpp build can be compromised, with publicly available exploit code increasing accessibility despite a low EPSS score of 0.15%.

Technical Context (AI)

llama.cpp is a widely deployed C/C++ inference engine for LLaMA-family large language models, used in local LLM runners, chat frontends, and embedded AI tooling. The GGUF (GPT-Generated Unified Format) is its binary model serialization format, and gguf_fread_str is the helper that reads length-prefixed strings from .gguf files. CWE-190 (Integer Overflow or Wraparound) indicates that an attacker-controlled length field is processed without sufficient bounds checking, producing an undersized heap allocation followed by an oversized copy - a classic heap buffer overflow primitive that can corrupt adjacent heap metadata or function pointers and yield arbitrary code execution. The affected CPE cpe:2.3:a:ggml:llama.cpp confirms the ggml-maintained upstream project as the impacted component.
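The wraparound described above can be modelled in a few lines. This is a sketch in Python, not the llama.cpp source: it assumes the allocation size is computed as length + 1 in unsigned 64-bit arithmetic (room for a trailing terminator), which the advisory text here does not spell out. The MAX_STRING_BYTES cap and both function names are hypothetical.

```python
import io
import struct

U64_MASK = (1 << 64) - 1
MAX_STRING_BYTES = 1 << 20  # hypothetical cap chosen for this sketch


def wrapped_alloc_size(length: int) -> int:
    """Model `length + 1` the way unsigned 64-bit C arithmetic would compute it."""
    return (length + 1) & U64_MASK


def read_length_prefixed_string(stream, remaining: int) -> bytes:
    """Read a little-endian uint64 length, validate it, then read that many bytes.

    `remaining` is the number of bytes left in the file, including the 8-byte prefix.
    """
    header = stream.read(8)
    if len(header) != 8:
        raise ValueError("truncated length prefix")
    (length,) = struct.unpack("<Q", header)
    # Reject lengths that cannot fit in the file or exceed the sketch's cap
    # before any buffer is sized from them.
    if length > MAX_STRING_BYTES or length > remaining - 8:
        raise ValueError(f"implausible string length: {length}")
    data = stream.read(length)
    if len(data) != length:
        raise ValueError("truncated string body")
    return data


# A length of 2**64 - 1 makes the modelled allocation size wrap to zero.
print(wrapped_alloc_size(U64_MASK))  # 0
```

The point of the bounded reader is ordering: the attacker-controlled length is compared against the bytes actually available and a fixed ceiling before it is used to size anything. In C the same check also has to guard the + 1 itself, since the addition is where the wrap occurs.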

Remediation (AI)

An upstream fix (PR/commit) is reported as available, but a released patched version is not independently confirmed from the provided data. Consult the Talos Intelligence advisory and the ggerganov/llama.cpp GitHub repository for the fixing commit that follows 18c2e17, and rebuild against that commit or a later release. As a compensating control, only load .gguf files from trusted, signature-verified sources and reject models obtained from untrusted hubs or user uploads, which limits attack surface at the cost of model availability. Where third-party models must be supported, run llama.cpp inside a sandbox (seccomp, container with no network, or a separate low-privilege user) so that successful exploitation does not yield the calling application's privileges, accepting the operational complexity that sandboxing adds. Disabling or pre-screening GGUF file ingestion at upload boundaries (size/length sanity checks) provides partial mitigation but is not a substitute for patching the parser.
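A pre-screen at an upload boundary might start with the fixed-size file header. The sketch below assumes the GGUF v2-or-later header layout (4-byte magic "GGUF", uint32 version, uint64 tensor count, uint64 metadata key/value count, all little-endian); the limits and the function name are hypothetical. It does not walk the metadata strings, so it cannot by itself catch a malicious string length deeper in the file.

```python
import struct

GGUF_MAGIC = b"GGUF"
# Assumed layout: magic, version, tensor count, metadata KV count (little-endian).
HEADER = struct.Struct("<4sIQQ")

# Hypothetical ceilings for this sketch; tune to the models actually expected.
MAX_TENSORS = 1_000_000
MAX_METADATA_KVS = 1_000_000


def prescreen_gguf_header(blob: bytes, file_size: int) -> list[str]:
    """Return a list of problems found in the header; an empty list means it passed."""
    if len(blob) < HEADER.size:
        return ["file is shorter than a GGUF header"]
    magic, version, tensors, kvs = HEADER.unpack_from(blob)
    problems = []
    if magic != GGUF_MAGIC:
        problems.append("bad magic bytes")
    if version < 2:
        problems.append("version below 2 is outside the layout this sketch assumes")
    # Every tensor or KV entry needs at least one byte, so a count larger
    # than the file itself cannot be genuine.
    if tensors > MAX_TENSORS or tensors > file_size:
        problems.append(f"implausible tensor count: {tensors}")
    if kvs > MAX_METADATA_KVS or kvs > file_size:
        problems.append(f"implausible metadata count: {kvs}")
    return problems
```

A caller would read the first 24 bytes of an upload, pass them in with the total file size, and reject the file if the returned list is non-empty. This is the partial mitigation the text describes: it filters obviously malformed files cheaply but leaves the vulnerable parser in place.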

CVE-2024-23496 vulnerability details – vuln.today
