Q: Is there a patch available for CVE-2026-54235?

Yes. A fix for CVE-2026-54235 was merged upstream in https://github.com/vllm-project/vllm/pull/45116. Update vLLM to a release that includes this fix.

vLLM CVE-2026-54235

Q: What is the CVSS score of CVE-2026-54235?

CVE-2026-54235 has a CVSS 4.0 base score of 6.9 (Medium). CVSS vector: CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N/E:X/CR:X/IR:X/AR:X/MAV:X/MAC:X/MAT:X/MPR:X/MUI:X/MVC:X/MVI:X/MVA:X/MSC:X/MSI:X/MSA:X/S:X/AU:X/R:X/V:X/RE:X/U:X. EPSS exploitation probability: 0.3%.

Q: Which versions of vLLM are affected by CVE-2026-54235?

vLLM versions up to and including 0.23.0 (pip package vllm) are affected. In these versions, temperature parameter validation can be bypassed by supplying NaN or positive Infinity as the temperature value, because Python's IEEE 754 float comparison operators… A patch is available.

MEDIUM

Improper Validation of Specified Type of Input (CWE-1287)

2026-06-17 https://github.com/vllm-project/vllm

GHSA-7h4p-rffg-7823

Denial Of Service Python Red Hat

6.9

CVSS 4.0 · Vendor: https://github.com/vllm-project/vllm

Severity by source

Vendor (https://github.com/vllm-project/vllm), PRIMARY: 6.9 MEDIUM
CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N/E:X/CR:X/IR:X/AR:X/MAV:X/MAC:X/MAT:X/MPR:X/MUI:X/MVC:X/MVI:X/MVA:X/MSC:X/MSI:X/MSA:X/S:X/AU:X/R:X/V:X/RE:X/U:X

vuln.today AI: 7.5 HIGH
Network-reachable API requires no confirmed authentication; single crafted parameter crashes the worker (A:H); no confidentiality or integrity impact identified.
CVSS 3.1: AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
CVSS 4.0: AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:H/SC:N/SI:N/SA:N

Red Hat: 6.5 MEDIUM (qualitative)

Primary rating from Vendor (https://github.com/vllm-project/vllm).

CVSS Vector (Vendor: https://github.com/vllm-project/vllm)

CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N/E:X/CR:X/IR:X/AR:X/MAV:X/MAC:X/MAT:X/MPR:X/MUI:X/MVC:X/MVI:X/MVA:X/MSC:X/MSI:X/MSA:X/S:X/AU:X/R:X/V:X/RE:X/U:X
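Most of the vendor vector consists of metrics set to X (Not Defined). As a minimal sketch for reading it, the snippet below splits the string into metric/value pairs and drops the undefined ones. The function names `parse_cvss4_vector` and `defined_metrics` are illustrative and not part of any CVSS library.

```python
def parse_cvss4_vector(vector: str) -> dict[str, str]:
    """Split a CVSS 4.0 vector string into a {metric: value} mapping."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:4.0":
        raise ValueError(f"unexpected prefix: {prefix!r}")
    return dict(part.split(":", 1) for part in parts)


def defined_metrics(vector: str) -> dict[str, str]:
    """Keep only metrics whose value is not X (Not Defined)."""
    return {k: v for k, v in parse_cvss4_vector(vector).items() if v != "X"}


# Vendor vector copied from this page.
VENDOR_VECTOR = (
    "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N"
    "/E:X/CR:X/IR:X/AR:X/MAV:X/MAC:X/MAT:X/MPR:X/MUI:X/MVC:X/MVI:X/MVA:X"
    "/MSC:X/MSI:X/MSA:X/S:X/AU:X/R:X/V:X/RE:X/U:X"
)

metrics = defined_metrics(VENDOR_VECTOR)
print(metrics)
```

Only the eleven base metrics remain after filtering, and the sole non-None impact is VA:L (low availability impact on the vulnerable system).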

Attack Vector: Network

Attack Complexity: Low

Privileges Required: None

User Interaction: None

Lifecycle Timeline

Jun 22, 2026 - 23:22 (NVD): CVSS changed: 6.9 (MEDIUM)

Jun 18, 2026 - 01:43 (vuln.today): Source Code Evidence Fetched

Jun 18, 2026 - 01:43 (vuln.today): Analysis Generated

Blast Radius

ecosystem impact

4 PyPI packages depend on vllm (3 direct, 1 indirect)

Ecosystem-wide dependent count for version 0.23.0.

Description (CVE.org)

Summary

All temperature validation gates use comparison operators (<, >). Under Python's IEEE 754 float semantics, these guards silently evaluate to False for NaN and for positive Infinity, so neither value triggers a validation error. Both values pass every guard and propagate to GPU sampling kernels, where they produce undefined behavior or CUDA errors that can crash the inference worker. Note: -Infinity is correctly caught, because -Infinity < 0.0 evaluates to True.

Root Cause

sampling_params.py:384:

```python
if 0 < self.temperature < _MAX_TEMP:  # NaN → False; +Inf → False
```

sampling_params.py:462:

```python
if self.temperature < 0.0:  # NaN → False; +Inf → False
    raise VLLMValidationError(...)
```

No math.isnan() or math.isinf() check exists anywhere in sampling_params.py.

Python semantics (verified): float('nan') < 0.0 → False, float('inf') < 0.0 → False.
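The comparison behavior described above can be reproduced in plain Python without vLLM installed. This is a minimal sketch: `sign_guard_fires` and `range_guard_fires` are illustrative helpers that mirror the two quoted comparisons, and the value given to `_MAX_TEMP` is a placeholder, since the advisory does not state the real constant.

```python
nan = float("nan")
pos_inf = float("inf")
neg_inf = float("-inf")

# Placeholder bound for illustration only; the real _MAX_TEMP is not given here.
_MAX_TEMP = 1.0


def sign_guard_fires(t: float) -> bool:
    """Mirror of the quoted `self.temperature < 0.0` comparison."""
    return t < 0.0


def range_guard_fires(t: float) -> bool:
    """Mirror of the quoted `0 < self.temperature < _MAX_TEMP` comparison."""
    return 0 < t < _MAX_TEMP


samples = {"nan": nan, "+inf": pos_inf, "-inf": neg_inf, "-1.0": -1.0}
sign_results = {name: sign_guard_fires(v) for name, v in samples.items()}
range_results = {name: range_guard_fires(v) for name, v in samples.items()}

print(sign_results)   # only "-inf" and "-1.0" trip the sign guard
print(range_results)  # none of these samples satisfy the range comparison
```

NaN and +Inf leave both comparisons False, so a guard that raises only when a comparison is True never rejects them.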

Impact

The inference worker crashes when a GPU kernel executes with NaN/Inf softmax input, degrading service for all concurrent users.

Remediation

Add a math.isfinite(self.temperature) check in _verify_args() and reject non-finite float values with a 400 error.
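A finiteness check of the kind recommended here might look like the sketch below. It is not the code from the upstream fix: `verify_temperature` is a hypothetical standalone function, and `TemperatureValidationError` is a stand-in for the `VLLMValidationError` named in the advisory. How the error is mapped to an HTTP 400 response is left out.

```python
import math


class TemperatureValidationError(ValueError):
    """Hypothetical stand-in for the VLLMValidationError named in the advisory."""


def verify_temperature(temperature: float) -> float:
    """Return the temperature unchanged if it is finite and non-negative."""
    # Check finiteness first: NaN and +Inf slip past ordered comparisons.
    if not math.isfinite(temperature):
        raise TemperatureValidationError(
            f"temperature must be a finite number, got {temperature!r}"
        )
    if temperature < 0.0:
        raise TemperatureValidationError(
            f"temperature must be non-negative, got {temperature!r}"
        )
    return temperature


for candidate in (0.0, 0.7, float("nan"), float("inf"), float("-inf"), -1.0):
    try:
        verify_temperature(candidate)
        print(candidate, "accepted")
    except TemperatureValidationError as exc:
        print(candidate, "rejected:", exc)
```

Putting the `math.isfinite()` test before the sign check means every non-finite value gets the same explicit error rather than relying on comparison results.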

Fix

A fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/45116

Analysis (AI)

Temperature parameter validation in vLLM (pip/vllm ≤ 0.23.0) can be bypassed by supplying NaN or positive Infinity as the temperature value, because Python's IEEE 754 float comparison operators silently return False for these inputs, allowing the values to propagate unchecked into GPU CUDA sampling kernels. The invalid inputs trigger undefined behavior or fatal CUDA errors that crash the inference worker process, dropping all in-flight requests and degrading service for every concurrent user sharing that worker. …

Attack Chain (AI-derived)

Hypothetical attack flow derived from CVE metadata

Access: Reach vLLM inference API endpoint

Delivery: Submit request with temperature=NaN or temperature=+Inf

Exploit: Bypass all comparison guards in _verify_args()

Execution: Non-finite value forwarded to CUDA sampling kernel

Persist: Fatal CUDA error crashes inference worker

Impact: All concurrent user sessions terminated

Vulnerability Assessment (AI)

Exploitation: Exploitation requires the ability to submit an inference request to the vLLM API and supply an arbitrary floating-point value for the temperature parameter. …

Risk Assessment: At the time this analysis was generated, no official CVSS score or vector had been provided in any source, so all metric assessments in the analysis are independently inferred and should be treated accordingly. …

Exploit Scenario: An attacker with network access to an exposed vLLM inference API endpoint submits a standard generation request with the temperature field set to NaN or positive Infinity (expressible in JSON via a non-compliant NaN literal or through a Python client that serializes float('nan') or float('inf')). The value bypasses all comparison-based guards in SamplingParams._verify_args(), is forwarded directly to the CUDA softmax sampling kernel, and produces a fatal CUDA error that terminates the inference worker process, immediately dropping all in-flight requests from every concurrent user. …

Remediation: An upstream fix is available via GitHub PR #45116 (https://github.com/vllm-project/vllm/pull/45116) and the associated commit d598d239737cfa37bcfcb98886ec3f3557fc7198, which adds math.isfinite() guards for both temperature and repetition_penalty in _verify_args(). …
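The exploit scenario mentions non-compliant JSON literals. The sketch below shows how Python's standard `json` module emits and accepts them by default, and one way a request parser could refuse them. It uses only the standard library for illustration. Whether vLLM's API server parses requests this way is not stated in the source, and `parse_strict` and `reject_constant` are hypothetical helper names.

```python
import json
import math

# By default, json.dumps emits the non-standard literals NaN and Infinity.
nan_payload = json.dumps({"temperature": float("nan")})
inf_payload = json.dumps({"temperature": float("inf")})
print(nan_payload)
print(inf_payload)

# json.loads accepts those literals by default and returns non-finite floats.
assert math.isnan(json.loads(nan_payload)["temperature"])
assert math.isinf(json.loads(inf_payload)["temperature"])


def reject_constant(name: str) -> float:
    """Called by json.loads for the literals NaN, Infinity and -Infinity."""
    raise ValueError(f"non-finite JSON constant not allowed: {name}")


def parse_strict(body: str):
    """Parse JSON, refusing the NaN / Infinity / -Infinity literals."""
    return json.loads(body, parse_constant=reject_constant)


for body in (nan_payload, inf_payload, '{"temperature": 0.7}'):
    try:
        print(parse_strict(body))
    except ValueError as exc:
        print("rejected:", exc)
```

Rejecting the literals at parse time is only a partial control. An out-of-range number such as `1e999` is still parsed to `inf` by `json.loads`, so a `math.isfinite()` check on the parsed value remains necessary.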