llama.cpp CVE-2024-21836

HIGH 8.8 (CVSS 3.1, NVD)
Integer Overflow or Wraparound (CWE-190)
2024-02-26 · talos-cna@cisco.com

Severity by source

NVD PRIMARY
8.8 HIGH
AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

Primary rating from NVD · only source for this CVE.

CVSS Vector (NVD)

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
Attack Vector: Network
Attack Complexity: Low
Privileges Required: None
User Interaction: Required
Scope: Unchanged
Confidentiality: High
Integrity: High
Availability: High
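
To show how the 8.8 base score follows from the vector above, here is a minimal sketch of the CVSS 3.1 base-score arithmetic. It assumes Scope: Unchanged only (which is what this vector uses), and the function and table names are illustrative rather than part of any official tool.

```python
import math

# CVSS 3.1 metric weights (Privileges Required values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the CVSS 3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a Scope: Unchanged CVSS 3.1 vector string."""
    # Skip the leading "CVSS:3.1" element, then split each "METRIC:VALUE" pair.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise NotImplementedError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"))
```

The User Interaction: Required metric is what keeps this score below the 9.8 that the same vector would receive with UI:N.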

Description (CVE.org)

A heap-based buffer overflow vulnerability exists in the GGUF library header.n_tensors functionality of llama.cpp Commit 18c2e17. A specially crafted .gguf file can lead to code execution. An attacker can provide a malicious file to trigger this vulnerability.

Analysis (AI)

A heap-based buffer overflow in llama.cpp's GGUF header parser (commit 18c2e17) enables code execution when a victim loads a maliciously crafted .gguf model file. The root cause is an integer overflow (CWE-190): the attacker-controlled n_tensors field wraps the allocation size computation, so the parser allocates an undersized heap buffer and then writes past its end, giving the attacker control over the corrupted memory. Publicly available exploit code exists, though EPSS remains low at 0.15% (35th percentile), and the CVE is not listed in the CISA KEV catalog, meaning no exploitation in the wild has been identified there.

Technical Context (AI)

llama.cpp is the widely used C/C++ inference engine for LLaMA-family large language models, distributed by the ggml project (CPE cpe:2.3:a:ggml:llama.cpp). The GGUF file format is its native serialization for model weights and metadata, replacing the older GGML format. The vulnerability resides in header parsing, where the n_tensors field (an attacker-controlled count) feeds into heap allocation arithmetic. CWE-190 (Integer Overflow or Wraparound) describes the root cause: an oversized n_tensors value wraps during the size computation and produces an undersized buffer, which the subsequent tensor metadata writes then overflow, corrupting adjacent heap chunks.
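
The wraparound described above can be illustrated with a short sketch. It models a 64-bit size_t in Python and uses a hypothetical per-tensor record size of 88 bytes; the function names and the record size are assumptions made for illustration, not values taken from the llama.cpp source.

```python
SIZE_MAX = (1 << 64) - 1  # assumes a 64-bit size_t

# Hypothetical size in bytes of one tensor metadata record (illustrative only).
TENSOR_INFO_SIZE = 88


def wrapped_alloc_size(n_tensors: int, elem_size: int = TENSOR_INFO_SIZE) -> int:
    """Byte count an unchecked C multiplication would request (modulo 2**64)."""
    return (n_tensors * elem_size) & SIZE_MAX


def checked_alloc_size(n_tensors: int, elem_size: int = TENSOR_INFO_SIZE) -> int:
    """Return the true byte count, or raise if it would not fit in size_t."""
    if n_tensors < 0 or elem_size <= 0:
        raise ValueError("count must be non-negative and element size positive")
    # Divide instead of multiplying so the check itself cannot wrap in C.
    if n_tensors > SIZE_MAX // elem_size:
        raise OverflowError("n_tensors * elem_size overflows size_t")
    return n_tensors * elem_size


if __name__ == "__main__":
    crafted = (1 << 61) + 1  # attacker-chosen tensor count
    # The true size is 11 * 2**64 + 88 bytes, but only the low 64 bits survive.
    print(wrapped_alloc_size(crafted))
    try:
        checked_alloc_size(crafted)
    except OverflowError as exc:
        print("rejected:", exc)
```

With these assumed values the unchecked computation requests room for a single record while the parser goes on to write metadata for far more, which is the undersized-buffer condition. A division-based bound check before allocating is one common way to reject such counts.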

Remediation (AI)

An upstream fix is available (PR/commit), but a released patched version could not be independently confirmed from the provided data. Users should therefore update llama.cpp to a current main-branch build later than commit 18c2e17 and rebuild any downstream wrappers (Python bindings, server binaries) that statically link the GGUF parser. Consult the Cisco Talos advisory (source: talos-cna@cisco.com) for the exact remediation commit. Until updated, treat .gguf files as untrusted input: load model files only from sources you can verify cryptographically, check each file's checksum or signature against the value the publisher provides, and isolate inference workloads in containers or sandboxed user accounts with no access to credentials or sensitive data. Avoid auto-loading models supplied through web uploads or chat attachments; this removes a convenient delivery vector at the cost of some workflow friction.
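
As one way to apply the checksum advice above, the following sketch compares a model file's SHA-256 digest with a value obtained from the publisher before the file is handed to any loader. The function names are hypothetical, and the expected digest is assumed to arrive over a channel you already trust.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large model files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_sha256_hex: str) -> bool:
    """Return True only if the file's digest equals the publisher's digest."""
    actual = sha256_of_file(path)
    expected = expected_sha256_hex.strip().lower()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected)
```

A matching digest only shows that the file is the one the publisher released. It does not make an unpatched parser safe against a file that was malicious at the source, so it complements updating and sandboxing rather than replacing them.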

CVE-2024-21836 vulnerability details – vuln.today
