Checkpoint
Arbitrary code execution occurs in PyTorch Lightning 2.6.0 and earlier when loading malicious checkpoint files. The LightningModule.load_from_checkpoint() method deserializes untrusted pickle data without security restrictions, allowing attackers to execute arbitrary Python code when victims open crafted .ckpt files. An EPSS score of 0.06% (19th percentile) indicates low observed exploitation probability, and no public exploit code or CISA KEV listing exists at time of analysis. The attack requires local access and user interaction (opening a malicious checkpoint), limiting remote attack scenarios to social engineering or supply chain compromise.
Arbitrary code execution via the torch-checkpoint-shrink.py script in the ml-engineering project allows remote attackers to execute malicious Python code by providing crafted PyTorch checkpoint files. The vulnerability stems from insecure deserialization: torch.load() processes .pt files without the weights_only=True safeguard, enabling pickle-based arbitrary object instantiation. Despite a critical CVSS 9.8 score, EPSS probability is low (0.06%, 19th percentile) and no public exploit or active exploitation is confirmed, suggesting limited real-world targeting to date. The SSVC assessment indicates total technical impact with automatable exploitation potential, making this a priority for organizations using ml-engineering scripts in production environments.
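The mechanism behind these checkpoint advisories can be illustrated with the standard library alone. The following is a minimal sketch, not PyTorch's implementation: it shows how a pickle stream can make the loader call an arbitrary function, and how an allowlisting unpickler (the general idea behind restricted loading such as weights_only=True) rejects it. The names Payload, AllowlistUnpickler, and restricted_loads are hypothetical, and the allowlist contents are an assumption chosen for the demonstration.

```python
import io
import pickle


class Payload:
    """Hypothetical attacker object: its pickle form tells the loader
    to call eval() on a string. The expression here is harmless."""

    def __reduce__(self):
        return (eval, ("6 * 7",))


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not explicitly allowed.

    ALLOWED is an assumption for this sketch; a real loader would
    list exactly the types its checkpoints legitimately need.
    """

    ALLOWED = {("collections", "OrderedDict")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data):
    """Deserialize bytes while permitting only allowlisted globals."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


if __name__ == "__main__":
    malicious = pickle.dumps(Payload())

    # Unrestricted loading runs the embedded call and returns its result.
    print(pickle.loads(malicious))

    # Restricted loading refuses to resolve builtins.eval.
    try:
        restricted_loads(malicious)
    except pickle.UnpicklingError as exc:
        print("rejected:", exc)

    # Plain containers need no globals and still load.
    print(restricted_loads(pickle.dumps({"weights": [1.0, 2.0]})))
```

Plain dictionaries, lists, and numbers pass through the restricted loader because they require no global lookups; anything that would import a callable is refused before it can run.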
Arbitrary code execution in the Snorkel machine learning library (≤v0.10.0) occurs when users load malicious model checkpoint files through the Trainer.load() method. The vulnerability stems from unsafe PyTorch deserialization that processes untrusted pickle objects without the weights_only=True security parameter. Attackers can embed malicious Python code in model files distributed through repositories, shared datasets, or social engineering campaigns. Despite an 8.8 CVSS score (high severity), EPSS scoring at 0.06% (19th percentile) suggests very low real-world exploitation probability, and no active exploitation or public proof-of-concept has been identified at time of analysis.
The flash-attention training framework through commit e724e2588cbe754beb97cf7c011b5e7e34119e62 (2025-13-04) contains an insecure deserialization vulnerability (CWE-502) in its checkpoint loading mechanism. The load_checkpoint() function in checkpoint.py and the checkpoint loading code in eval.py use torch.load() without enabling the restrictive weights_only=True parameter. This allows the deserialization of arbitrary Python objects via the pickle module. An attacker can exploit this by providing a maliciously crafted checkpoint file. When a victim loads this checkpoint during model warm-starting or evaluation, arbitrary code is executed on the victim's system.
CosyVoice through commit 6e01309e01bc93bbeb83bdd996b1182a81aaf11e (2025-30-21) contains an insecure deserialization vulnerability (CWE-502) in its average_model.py model averaging tool. The script loads PyTorch checkpoint files (epoch_*.pt) for model averaging using torch.load() without enabling the weights_only=True security parameter. This allows the deserialization of arbitrary Python objects via the pickle module. An attacker can exploit this by placing malicious checkpoint files in a directory. When a victim uses the tool to average models from this directory, arbitrary code is executed on the victim's system.
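Because the entries above all hinge on what a pickle stream imports at load time, a file can be inspected before it is ever deserialized. The following is a heuristic sketch, assuming raw pickle bytes as input (extracting the pickle from a checkpoint container is out of scope here). It uses the standard pickletools module to list the globals a stream would import, without executing anything. The function name imported_globals is hypothetical, and the string tracking for STACK_GLOBAL is an approximation that a determined attacker could evade.

```python
import pickle
import pickletools


def imported_globals(data):
    """Return (module, name) pairs that the pickle stream would import.

    Walks the opcodes statically; nothing in the stream is executed.
    Heuristic: for STACK_GLOBAL the two most recent string constants
    are assumed to be the module and attribute names.
    """
    found = []
    strings = []  # string constants seen so far, in order
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name"
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL":
            if len(strings) >= 2:
                found.append((strings[-2], strings[-1]))
            else:
                found.append(("?", "?"))  # could not resolve statically
        elif isinstance(arg, str):
            strings.append(arg)
    return found


if __name__ == "__main__":
    benign = pickle.dumps({"epoch": 3, "loss": [0.5, 0.4]}, protocol=4)
    print(imported_globals(benign))  # no imports for plain containers
```

A legitimate PyTorch checkpoint also imports globals (for rebuilding tensors), so a practical check would compare the reported pairs against an allowlist instead of rejecting every import.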
{thread_id}/runs endpoints. Thread IDs leak through frontend URLs, server logs, and observability traces, eliminating the need for enumeration. A vendor-released patch (v0.9.7) is confirmed by GitHub advisory GHSA-m98r-6667-4wq7. No active exploitation or PoC has been identified at time of analysis, though a detailed reproducer exists in issue #336.
Kernel denial of service via crafted btrfs metadata allows local attackers to trigger an unguarded BUG_ON() condition during relocation recovery at mount time. The vulnerability arises when a root item on disk contains a non-zero drop_progress with a zero drop_level, an invalid state that should not exist but is not validated on read. CVSS 5.5 reflects the local attack vector and availability impact; EPSS 0.02% indicates minimal real-world exploitation likelihood.