What is the CVSS score of CVE-2025-33255?

CVE-2025-33255 has a CVSS 3.1 base score of 7.5 (High). CVSS vector: CVSS:3.1/AV:L/AC:H/PR:H/UI:N/S:C/C:H/I:H/A:H. EPSS exploitation probability: 0.1%.

Which versions of NVIDIA TensorRT-LLM are affected by CVE-2025-33255?

NVIDIA TensorRT-LLM, a machine learning library used in production AI systems, contains a code execution vulnerability (CVE-2025-33255) that high-privileged local users could exploit to compromise…

How to fix or mitigate CVE-2025-33255?

Within 24 hours: Identify all production systems running TensorRT-LLM and document which service accounts and users have local access to those systems.

Within 7 days: Restrict local system access to essential personnel only, disable unnecessary local accounts on TensorRT-LLM servers, and isolate MPI server ports from untrusted networks using firewall rules.

Within 30 days: Track NVIDIA's security bulletin for the fixed release, prepare a tested upgrade procedure, and schedule deployment of the patched version.

NVIDIA TensorRT-LLM CVE-2025-33255

EUVD-2025-209903 HIGH

Deserialization of Untrusted Data (CWE-502)

2026-05-20 nvidia

GHSA-gvr5-23jj-pf9p

Tags: Information Disclosure, Deserialization, RCE, Denial of Service, NVIDIA

7.5

CVSS 3.1 · NVD

Severity by source

NVD (primary): 7.5 HIGH, AV:L/AC:H/PR:H/UI:N/S:C/C:H/I:H/A:H

Primary rating from NVD · only source for this CVE.

CVSS Vector (NVD)

CVSS:3.1/AV:L/AC:H/PR:H/UI:N/S:C/C:H/I:H/A:H

Attack Vector: Local
Attack Complexity: High
Privileges Required: High
User Interaction: None
Scope: Changed
Confidentiality: High
Integrity: High
Availability: High

Lifecycle Timeline

Analysis generated: May 20, 2026 - 04:00 (vuln.today)

Description (CVE.org)

NVIDIA TRT-LLM for any platform contains a vulnerability in MPI server, where an attacker could cause an unsafe deserialization. A successful exploit of this vulnerability might lead to code execution, denial of service, data tampering, and information disclosure.

Analysis (AI)

Unsafe deserialization in NVIDIA TensorRT-LLM's MPI server component allows a high-privileged local attacker to achieve code execution, denial of service, data tampering, or information disclosure on systems running the affected library. The CVSS score of 7.5 reflects high impact but constrained exploitability (AV:L/AC:H/PR:H), and no public exploit had been identified at the time of analysis. The scope change (S:C) indicates that a compromise can extend beyond the vulnerable component and affect other resources on the host.
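To make the relationship between the vector and the 7.5 score concrete, the sketch below recomputes a CVSS 3.1 base score from a vector string. It uses the metric weights and formulas published in the CVSS v3.1 specification; the function names `roundup` and `base_score` are illustrative choices, and the sketch handles base metrics only.

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS 3.1 base score for a vector such as 'CVSS:3.1/AV:L/...'."""
    parts = vector.split("/")
    if parts[0].startswith("CVSS:"):
        parts = parts[1:]  # drop the version prefix
    m = dict(p.split(":") for p in parts)
    changed = m["S"] == "C"

    # Impact sub-score
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    # Exploitability sub-score; the PR weight depends on Scope
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        return roundup(min(1.08 * total, 10))
    return roundup(min(total, 10))


score = base_score("CVSS:3.1/AV:L/AC:H/PR:H/UI:N/S:C/C:H/I:H/A:H")
print(score)
```

For this vector the exploitability sub-score is small (about 0.85) because of the local vector, high complexity, and high privileges, while the impact sub-score is large (about 6.05); the changed scope applies the 1.08 multiplier before rounding up.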

Technical Context (AI)

TensorRT-LLM (TRT-LLM) is NVIDIA's open-source library for optimizing large language model inference on NVIDIA GPUs, commonly deployed in multi-GPU and multi-node configurations using MPI (Message Passing Interface) for distributed workload coordination. The vulnerability resides in the MPI server component, which, based on the advisory description, appears to accept serialized objects from peer processes and reconstruct them without sufficient validation. The root cause is CWE-502 (Deserialization of Untrusted Data), a class of flaw in which attacker-influenced serialized payloads are reconstructed into in-memory objects. Gadget chains or type confusion can then drive arbitrary code paths during deserialization itself, before any application-level authorization checks run.
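The following is a generic illustration of the CWE-502 pattern and one common mitigation, not TensorRT-LLM's code. Python's `pickle` is used only as a familiar example of a deserializer that can invoke callables while loading; the source does not state which serializer the MPI server uses. The `Payload` class, the allow-list `SAFE_BUILTINS`, and `restricted_loads` are hypothetical names, and `os.getcwd` stands in harmlessly for what an attacker would replace with a dangerous call.

```python
import builtins
import io
import os
import pickle

# Hypothetical allow-list: only these builtins may be resolved during loading.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global outside the allow-list."""

    def find_class(self, module, name):
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Drop-in replacement for pickle.loads that applies the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Payload:
    """Object whose pickled form asks the loader to call a function."""

    def __reduce__(self):
        # Harmless stand-in: an attacker would reference a dangerous callable here.
        return (os.getcwd, ())


blob = pickle.dumps(Payload())

# Plain pickle.loads calls the referenced function while deserializing.
print(pickle.loads(blob) == os.getcwd())

# The restricted loader rejects the same bytes before any call happens.
try:
    restricted_loads(blob)
except pickle.UnpicklingError as exc:
    print("rejected:", exc)

# Plain data containers need no global lookups and still load normally.
print(restricted_loads(pickle.dumps({"rank": 0, "shape": [1, 2, 3]})))
```

An allow-list narrows the attack surface but is not a complete defense; where the peer cannot be trusted, a data-only format that never resolves callables is generally the safer design.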

Remediation (AI)

A patch is available per the vendor advisory. Consult NVIDIA's official security bulletin at https://nvidia.custhelp.com/app/answers/detail/a_id/5805 for the exact fixed TensorRT-LLM version and upgrade to that release.

Until upgrading is feasible, compensating controls include: restricting MPI traffic to a dedicated, isolated network segment with strict firewall rules (block external access to MPI ports, which Open MPI and MPICH typically assign dynamically from the 1024-65535 range); enforcing host-level authentication and authorization for any user able to join the MPI communicator; and avoiding multi-tenant TRT-LLM deployments where untrusted workloads share the same MPI fabric. The trade-off is reduced flexibility for shared GPU cluster usage.

Additionally, monitor MPI server processes for unexpected child processes or outbound connections as a detective control, accepting the limitation that exploitation may occur in-memory without spawning new processes.
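The child-process check mentioned above can be sketched as follows. This is a minimal, Linux-only illustration that reads `/proc` using the standard library; the process names `mpi_worker` and `python3` in the usage lines are hypothetical placeholders, and the helper names `read_proc_table` and `unexpected_children` are illustrative. As noted, it cannot see exploitation that stays inside an existing process.

```python
import os
from typing import Iterable, List, NamedTuple, Set


class Proc(NamedTuple):
    pid: int
    ppid: int
    name: str


def read_proc_table(proc_root: str = "/proc") -> List[Proc]:
    """Snapshot (pid, ppid, name) for each process visible under /proc (Linux only)."""
    table = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                stat = fh.read()
        except OSError:
            continue  # process exited between listing and reading
        # Layout: "pid (comm) state ppid ..."; comm may itself contain ')' or spaces.
        lpar, rpar = stat.index("("), stat.rindex(")")
        fields = stat[rpar + 2:].split()
        table.append(Proc(int(entry), int(fields[1]), stat[lpar + 1:rpar]))
    return table


def unexpected_children(
    table: Iterable[Proc], parent_names: Set[str], allowed_children: Set[str]
) -> List[Proc]:
    """Return direct children of the named parents whose name is not allow-listed."""
    table = list(table)
    parents = {p.pid for p in table if p.name in parent_names}
    return [p for p in table if p.ppid in parents and p.name not in allowed_children]


# Usage with a hand-built snapshot (hypothetical process names):
snapshot = [
    Proc(100, 1, "mpi_worker"),
    Proc(101, 100, "python3"),
    Proc(102, 100, "sh"),
]
for proc in unexpected_children(snapshot, {"mpi_worker"}, {"python3"}):
    print(f"unexpected child: pid={proc.pid} name={proc.name}")
```

On a live host the snapshot would come from `read_proc_table()`, run periodically, with the parent and allow-list names set to whatever the deployment actually launches. Note that the kernel truncates the name in `/proc/<pid>/stat` to 15 characters.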