vLLM CVE-2026-41523
Severity by source: HIGH
AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H
Attacker is a remote unauthenticated model publisher (AV:N/PR:N) but victim must load the model and run vLLM under -O (UI:R, AC:H); successful exploit yields full process compromise (C:H/I:H/A:H).
Primary rating from GitHub Advisory.
CVSS Vector (GitHub Advisory)
CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H
Blast Radius
Ecosystem impact: 1 PyPI package depends on vllm (1 direct, 0 indirect).
Ecosystem-wide dependent count for version 0.22.0.
Description (GitHub Advisory)
Summary
An assert-based security check in vLLM's activation function loading allows any unauthenticated attacker to achieve arbitrary code execution on the server by publishing a malicious HuggingFace model, when vLLM runs in Python optimized mode (python -O or PYTHONOPTIMIZE=1).
Details
vLLM uses an assert statement at vllm/model_executor/layers/pooler/activations.py:48 as its sole security control to restrict which activation functions can be loaded from a HuggingFace model's config.json:

# vllm/model_executor/layers/pooler/activations.py:35-53
function_name: str | None = None
if (
    hasattr(config, "sentence_transformers")
    and "activation_fn" in config.sentence_transformers
):
    function_name = config.sentence_transformers["activation_fn"]
elif (
    hasattr(config, "sbert_ce_default_activation_function")
    and config.sbert_ce_default_activation_function is not None
):
    function_name = config.sbert_ce_default_activation_function
if function_name is not None:
    assert function_name.startswith("torch.nn.modules."), (
        "Loading of activation functions is restricted to "
        "torch.nn.modules for security reasons"
    )
    fn = resolve_obj_by_qualname(function_name)()

Python's assert statements are stripped at compile time when running in optimized mode (python -O or PYTHONOPTIMIZE=1). When the assert is absent, the attacker-controlled function_name from the model's config.json is passed directly to resolve_obj_by_qualname() - an unrestricted import gadget:

def resolve_obj_by_qualname(qualname: str) -> Any:
    module_name, obj_name = qualname.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, obj_name)

This is the same vulnerability class as CVE-2017-1000433 (pysaml2 assert-based auth bypass), flagged by Bandit B101 and Ruff S101, and the reason Django proactively replaced all assert-based security checks (ticket #32508).
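
The stripping behaviour can be reproduced in-process with the built-in compile(), whose optimize argument mirrors the -O level. This is a minimal sketch: the check() function and the SOURCE string are illustrative stand-ins, not vLLM code.

```python
# Illustrative stand-in for an assert-based allowlist (not vLLM source).
SOURCE = '''
def check(name):
    assert name.startswith("torch.nn.modules."), "restricted"
    return name
'''


def load_check(optimize: int):
    """Compile SOURCE at the given optimization level and return check()."""
    namespace = {}
    code = compile(SOURCE, "<demo>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace["check"]


checked = load_check(0)    # equivalent to plain `python`: assert is kept
unchecked = load_check(1)  # equivalent to `python -O`: assert is removed

try:
    checked("os.system")
    normal_mode_blocked = False
except AssertionError:
    normal_mode_blocked = True

# With the assert compiled away, the disallowed name passes straight through.
optimized_result = unchecked("os.system")
```

At optimization level 0 the disallowed name raises AssertionError; at level 1 the same function returns it unchanged, which is the condition the advisory describes.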
Attacker-controlled input sources:
- config.sentence_transformers["activation_fn"] (line 40)
- config.sbert_ce_default_activation_function (line 45)
Affected call sites - get_act_fn() is called via resolve_classifier_act_fn() from:
- vllm/model_executor/layers/pooler/seqwise/poolers.py:122 - SequencePooler
- vllm/model_executor/layers/pooler/tokwise/poolers.py:130 - TokenPooler
Broader systemic risk: resolve_obj_by_qualname is called from ~20 locations across the codebase with no validation of its own. Any future caller feeding user-controlled input to it without validation creates the same vulnerability class.
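
One way to reduce this systemic risk is to put the validation inside the resolver, so a caller cannot forget it. The sketch below is hypothetical: resolve_allowed_qualname and its allowed_prefixes parameter are names invented for illustration and are not part of vLLM. The demo uses a standard-library prefix so it runs without torch installed.

```python
import importlib
from typing import Any


def resolve_allowed_qualname(qualname: str, allowed_prefixes: tuple[str, ...]) -> Any:
    """Resolve a dotted name only if it starts with an allowed prefix.

    Hypothetical helper: the check is an explicit raise, so it survives
    `python -O`, unlike an assert.
    """
    if not isinstance(qualname, str) or not qualname.startswith(allowed_prefixes):
        raise ValueError(f"{qualname!r} is outside the allowed prefixes")
    module_name, _, obj_name = qualname.rpartition(".")
    if not module_name or not obj_name:
        raise ValueError(f"{qualname!r} is not a fully qualified name")
    # The import happens only after the prefix check has passed.
    module = importlib.import_module(module_name)
    return getattr(module, obj_name)


# Demo with a stdlib prefix; a vLLM caller would pass ("torch.nn.modules.",).
ordered_dict_cls = resolve_allowed_qualname("collections.OrderedDict", ("collections.",))
```

Prefixes should end with a dot, as "torch.nn.modules." does in the advisory, so that a sibling package whose name merely begins with the same characters does not match.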
Suggested fix: Replace the assert with an explicit conditional raise:

if not function_name.startswith("torch.nn.modules."):
    raise ValueError(
        "Loading of activation functions is restricted to "
        "torch.nn.modules for security reasons"
    )

Impact
Arbitrary code execution. A malicious model author publishes a HuggingFace model with a crafted config.json. When a victim loads this model with vLLM running under python -O or PYTHONOPTIMIZE=1, arbitrary code executes during model initialization with the privileges of the vLLM process.
The attack requires:
- Victim loads a malicious model from HuggingFace (user interaction)
- vLLM runs under python -O or PYTHONOPTIMIZE=1 (documented in production use)
- Model uses a cross-encoder architecture (e.g. BERT or RoBERTa with sequence classification)
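
The two attacker-controlled fields can also be inspected before a model is loaded. The following is a sketch only: it assumes the raw config.json has already been parsed into a dict and that its keys mirror the attribute names in the excerpt above. The helper names are hypothetical.

```python
from typing import Optional

ALLOWED_PREFIX = "torch.nn.modules."


def activation_fn_from_config(config: dict) -> Optional[object]:
    """Return the configured activation name, mirroring the lookup order above."""
    st = config.get("sentence_transformers")
    if isinstance(st, dict) and "activation_fn" in st:
        return st["activation_fn"]
    return config.get("sbert_ce_default_activation_function")


def is_suspicious(config: dict) -> bool:
    """True if an activation name is set and falls outside torch.nn.modules."""
    name = activation_fn_from_config(config)
    if name is None:
        return False
    return not (isinstance(name, str) and name.startswith(ALLOWED_PREFIX))


# Example inputs (hypothetical config fragments).
malicious = {"sentence_transformers": {"activation_fn": "os.system"}}
benign = {"sbert_ce_default_activation_function": "torch.nn.modules.activation.Sigmoid"}
```

Here is_suspicious(malicious) is True and is_suspicious(benign) is False. A check like this is a triage aid for reviewing downloaded models, not a substitute for upgrading.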
Coordinated disclosure note: This vulnerability was also reported via huntr.com on April 2, 2026 (https://huntr.com/bounties/dcb05b04-e625-41e7-adbc-bbae0cc2d64c). A GitHub Security Advisory was also filed because it is vLLM's stated preferred disclosure channel per SECURITY.md.
Fix
A fix for this was introduced in this commit: https://github.com/vllm-project/vllm/commit/b3c7ffcab82c2439726f8cb213800f6f38c023d3
Analysis (AI)
Arbitrary code execution in vLLM versions prior to 0.22.0 allows remote unauthenticated attackers to run code on the inference server by publishing a malicious HuggingFace model, when vLLM is launched in Python optimized mode (python -O or PYTHONOPTIMIZE=1). The sole guardrail restricting which activation function classes can be loaded from a model's config.json is implemented with a Python assert, which is stripped at compile time under -O, leaving an unrestricted import gadget directly fed by attacker-controlled data. …
Vulnerability Assessment (AI)
| Exploitation | Exploitation requires three concrete preconditions, all explicitly stated in the advisory: (1) the vLLM Python process must be launched with optimization enabled - either 'python -O' or PYTHONOPTIMIZE=1 in the environment - so the assert-based allowlist is removed at compile time; (2) the victim must load an attacker-controlled HuggingFace model whose config.json populates either sentence_transformers.activation_fn or sbert_ce_default_activation_function (user-interaction step, matching CVSS UI:R); and (3) the model must use a cross-encoder architecture (e.g. … |
| Risk Assessment | The vendor-issued CVSS 3.1 score of 7.5 with vector AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H captures the trade-off well: the impact is full arbitrary code execution as the vLLM process (C:H/I:H/A:H), but AC:H, UI:R and the requirement that the target process be started with -O/PYTHONOPTIMIZE=1 plus loading a cross-encoder architecture mean exploitation is not a one-shot internet-wide bug. … |
| Exploit Scenario | An attacker publishes a HuggingFace cross-encoder model whose config.json sets sentence_transformers.activation_fn (or sbert_ce_default_activation_function) to a fully qualified Python name outside torch.nn.modules - for example, a module whose import side effect spawns a reverse shell. A victim operator running vLLM under python -O (or PYTHONOPTIMIZE=1) loads the model by name; during pooler initialization the stripped assert no longer enforces the allowlist, and resolve_obj_by_qualname() imports the attacker's module and calls getattr() on it, executing arbitrary code in the vLLM process. … |
| Remediation | Vendor-released patch: upgrade vLLM to 0.22.0 or later, which replaces the assert with an explicit 'if not function_name.startswith("torch.nn.modules."): raise ValueError(...)' in vllm/model_executor/layers/pooler/activations.py (commit b3c7ffcab82c2439726f8cb213800f6f38c023d3, advisory https://github.com/vllm-project/vllm/security/advisories/GHSA-q8gq-377p-jq3r). … |
Recommended Action (AI)
Within 24 hours: Identify all vLLM deployments and determine if Python optimized mode is enabled (check for -O flag or PYTHONOPTIMIZE=1 environment variable); review recent model downloads from external sources. …
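
The optimized-mode check can be scripted. This is a minimal sketch with hypothetical function names. It reports on the interpreter it runs in, so it must be run inside the same environment (container, service unit, launcher script) as the vLLM process to be meaningful.

```python
import os
import sys


def optimized_mode_report() -> dict:
    """Collect the signals that indicate assert statements are being stripped."""
    return {
        # 0 for plain `python`, 1 for -O, 2 for -OO.
        "sys_flags_optimize": sys.flags.optimize,
        # __debug__ is False whenever the interpreter runs with -O or -OO.
        "debug_enabled": __debug__,
        # A non-empty value here has the same effect as passing -O.
        "pythonoptimize_env": os.environ.get("PYTHONOPTIMIZE"),
    }


def asserts_are_stripped() -> bool:
    """True if this interpreter removes assert statements at compile time."""
    return sys.flags.optimize > 0


report = optimized_mode_report()
if asserts_are_stripped():
    print("WARNING: running in optimized mode; assert-based checks are inactive")
```

Command lines and environment files for the deployment are worth checking as well, since the flag may be set by a wrapper that this script is not launched through.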
GHSA-q8gq-377p-jq3r