vLLM CVE-2026-54235

MEDIUM
Improper Validation of Specified Type of Input (CWE-1287)
2026-06-17 https://github.com/vllm-project/vllm GHSA-7h4p-rffg-7823
6.9
CVSS 4.0 · Vendor: https://github.com/vllm-project/vllm

Severity by source

Vendor (https://github.com/vllm-project/vllm), PRIMARY: 6.9 MEDIUM
CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N/E:X/CR:X/IR:X/AR:X/MAV:X/MAC:X/MAT:X/MPR:X/MUI:X/MVC:X/MVI:X/MVA:X/MSC:X/MSI:X/MSA:X/S:X/AU:X/R:X/V:X/RE:X/U:X

vuln.today AI: 7.5 HIGH
Network-reachable API requires no confirmed authentication; single crafted parameter crashes the worker (A:H); no confidentiality or integrity impact identified.
CVSS 3.1: AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
CVSS 4.0: AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:H/SC:N/SI:N/SA:N

Red Hat: 6.5 MEDIUM (qualitative)

Primary rating from Vendor (https://github.com/vllm-project/vllm).

CVSS Vector (Vendor: https://github.com/vllm-project/vllm)

CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N/E:X/CR:X/IR:X/AR:X/MAV:X/MAC:X/MAT:X/MPR:X/MUI:X/MVC:X/MVI:X/MVA:X/MSC:X/MSI:X/MSA:X/S:X/AU:X/R:X/V:X/RE:X/U:X
Attack Vector: Network
Attack Complexity: Low
Privileges Required: None
User Interaction: None
Scope: X

Lifecycle Timeline

  • Jun 22, 2026 - 23:22 (NVD): CVSS changed to 6.9 (MEDIUM)
  • Jun 18, 2026 - 01:43 (vuln.today): Source Code Evidence Fetched
  • Jun 18, 2026 - 01:43 (vuln.today): Analysis Generated

Blast Radius

ecosystem impact
  • 4 pypi packages depend on vllm (3 direct, 1 indirect)

Ecosystem-wide dependent count for version 0.23.0.

Description (CVE.org)

Summary

All temperature validation gates use comparison operators (<, >), which silently evaluate to False for NaN and for positive Infinity in Python's IEEE 754 float semantics. Both values pass every guard and propagate to GPU sampling kernels, where they produce undefined behavior or CUDA errors that can crash the inference worker. Note: -Infinity is correctly caught.

Root Cause

sampling_params.py:384:

python
if 0 < self.temperature < _MAX_TEMP:  # NaN → False; +Inf → False
    ...

sampling_params.py:462:

python
if self.temperature < 0.0:  # NaN → False; +Inf → False
    raise VLLMValidationError(...)

No math.isnan() or math.isinf() check exists anywhere in sampling_params.py.

Python semantics (verified): float('nan') < 0.0 → False, float('inf') < 0.0 → False.
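The comparison behavior described in this section can be reproduced in plain Python. The following is a minimal sketch, not vLLM code: the helper names are hypothetical, and `_MAX_TEMP` is given a placeholder value because the advisory does not state the real constant.

```python
import math

_MAX_TEMP = 2.0  # placeholder value; the real constant is not given in the advisory


def in_range_guard(temperature: float) -> bool:
    """Mirror of the chained comparison quoted from sampling_params.py:384."""
    return 0 < temperature < _MAX_TEMP


def negative_guard(temperature: float) -> bool:
    """Mirror of the rejection test quoted from sampling_params.py:462."""
    return temperature < 0.0


# Print, for each non-finite value, what each guard sees and what isfinite reports.
for value in (float("nan"), float("inf"), float("-inf")):
    print(value, in_range_guard(value), negative_guard(value), math.isfinite(value))
```

Only -Infinity makes `negative_guard` return True; NaN and +Infinity make both guards return False, while `math.isfinite` reports False for all three values.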

Impact

The inference worker crashes when a GPU kernel executes on NaN/Inf softmax input, degrading service for all concurrent users.

Remediation

Add a math.isfinite(self.temperature) check in _verify_args() and reject non-finite float values with a 400 error.
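As an illustration of the recommended check, here is a minimal standalone sketch. It is not the merged patch: `validate_temperature` is a hypothetical helper, and `ValueError` stands in for vLLM's own validation exception, whose mapping to an HTTP 400 response is assumed to happen elsewhere.

```python
import math


def validate_temperature(temperature: float) -> float:
    """Hypothetical stand-in for the temperature check in _verify_args()."""
    # Test finiteness first: ordering comparisons alone never reject NaN or +Inf.
    if not math.isfinite(temperature):
        raise ValueError(f"temperature must be a finite number, got {temperature!r}")
    if temperature < 0.0:
        raise ValueError(f"temperature must be non-negative, got {temperature!r}")
    return temperature
```

Placing the finiteness test ahead of the range tests means the later comparisons only ever run on ordinary floats.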

Fix

A fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/45116

Analysis (AI)

Temperature parameter validation in vLLM (pip/vllm ≤ 0.23.0) can be bypassed by supplying NaN or positive Infinity as the temperature value, because Python's IEEE 754 float comparison operators silently return False for these inputs, allowing the values to propagate unchecked into GPU CUDA sampling kernels. The invalid inputs trigger undefined behavior or fatal CUDA errors that crash the inference worker process, dropping all in-flight requests and degrading service for every concurrent user sharing that worker. …

Attack Chain (AI-derived)

Hypothetical attack flow derived from CVE metadata

  • Access: Reach vLLM inference API endpoint
  • Delivery: Submit request with temperature=NaN or temperature=+Inf
  • Exploit: Bypass all comparison guards in _verify_args()
  • Execution: Non-finite value forwarded to CUDA sampling kernel
  • Persist: Fatal CUDA error crashes inference worker
  • Impact: All concurrent user sessions terminated

Vulnerability Assessment (AI)

Exploitation: Requires the ability to submit an inference request to the vLLM API and supply an arbitrary floating-point value for the temperature parameter. …

Risk Assessment: No official CVSS score or vector was provided in any source, so all metric assessments in this analysis are independently inferred and should be treated accordingly. …

Exploit Scenario: An attacker with network access to an exposed vLLM inference API endpoint submits a standard generation request with the temperature field set to NaN or positive Infinity (expressible in JSON via a non-compliant NaN literal or through a Python client that serializes float('nan') or float('inf')). The value bypasses all comparison-based guards in SamplingParams._verify_args(), is forwarded directly to the CUDA softmax sampling kernel, and produces a fatal CUDA error that terminates the inference worker process, immediately dropping all in-flight requests from every concurrent user. …

Remediation: An upstream fix is available via GitHub PR #45116 (https://github.com/vllm-project/vllm/pull/45116) and the associated commit d598d239737cfa37bcfcb98886ec3f3557fc7198, which adds math.isfinite() guards for both temperature and repetition_penalty in _verify_args(). …
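The exploit scenario mentions a non-compliant NaN literal in JSON. The sketch below shows how Python's standard json module emits and accepts such literals by default, and how a strict mode can refuse them. Whether vLLM's HTTP layer parses request bodies with these same defaults is not stated in the advisory, so treat this as an illustration of the serialization path only; the request body and the `reject_constant` helper are hypothetical.

```python
import json
import math

# By default, json.dumps writes non-finite floats as bare NaN / Infinity tokens.
body = json.dumps({"prompt": "hi", "temperature": float("nan")})
print(body)  # {"prompt": "hi", "temperature": NaN}

# By default, json.loads accepts those tokens and returns non-finite floats.
decoded = json.loads(body)
print(math.isnan(decoded["temperature"]))  # True


def reject_constant(token: str) -> float:
    """Hypothetical hook: refuse NaN, Infinity and -Infinity while decoding."""
    raise ValueError(f"non-finite JSON constant not allowed: {token}")


def strict_loads(text: str) -> dict:
    # parse_constant is called for the tokens NaN, Infinity and -Infinity.
    return json.loads(text, parse_constant=reject_constant)
```

A client-side counterpart is `json.dumps(..., allow_nan=False)`, which raises ValueError instead of emitting the non-standard tokens.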
