SafePickle: Robust and Generic ML Detection of Malicious Pickle-based ML Models

πŸ“… 2026-02-23
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses the critical security risk of remote code execution (RCE) inherent in Python’s pickle format when distributing machine learning models. Existing detection approaches suffer from limited generalizability due to their reliance on specific libraries or complex instrumentation strategies. To overcome these limitations, the authors propose a lightweight, library-agnostic method for detecting malicious pickle files that requires neither prior knowledge of benign models nor runtime code instrumentation. By performing static analysis to extract structural and semantic features from pickle bytecode and combining supervised with unsupervised learning for classification, the approach achieves superior performance across multiple benchmark and custom datasets, attaining an F1-score of 90.01%. Notably, it successfully identifies all nine advanced evasion attack samples, marking the first demonstration of a general-purpose, highly robust solution for malicious payload detection in pickle objects.

πŸ“ Abstract
Model repositories such as Hugging Face increasingly distribute machine learning artifacts serialized with Python's pickle format, exposing users to remote code execution (RCE) risks during model loading. Recent defenses, such as PickleBall, rely on per-library policy synthesis that requires complex system setups and verified benign models, which limits scalability and generalization. In this work, we propose a lightweight, machine-learning-based scanner that detects malicious Pickle-based files without policy generation or code instrumentation. Our approach statically extracts structural and semantic features from Pickle bytecode and applies supervised and unsupervised models to classify files as benign or malicious. We construct and release a labeled dataset of 727 Pickle-based files from Hugging Face and evaluate our models on four datasets: our own, PickleBall (out-of-distribution), Hide-and-Seek (9 advanced evasive malicious models), and synthetic joblib files. Our method achieves 90.01% F1-score compared with 7.23%-62.75% achieved by the SOTA scanners (Modelscan, Fickling, ClamAV, VirusTotal) on our dataset. Furthermore, on the PickleBall data (OOD), it achieves 81.22% F1-score compared with 76.09% achieved by the PickleBall method, while remaining fully library-agnostic. Finally, we show that our method is the only one to correctly parse and classify 9/9 evasive Hide-and-Seek malicious models specially crafted to evade scanners. This demonstrates that data-driven detection can effectively and generically mitigate Pickle-based model file attacks.
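The abstract describes statically extracting structural and semantic features from pickle bytecode without executing it. A minimal sketch of that idea, using Python's standard-library `pickletools` disassembler to count opcodes (the function name `opcode_features` and the opcode-histogram feature choice are illustrative assumptions, not the paper's actual feature set):

```python
import pickle
import pickletools
from collections import Counter


def opcode_features(data: bytes) -> Counter:
    """Statically disassemble pickle bytecode (never unpickling it)
    and count opcode occurrences as simple structural features.
    GLOBAL/STACK_GLOBAL resolve importable names and REDUCE invokes
    them, so these opcodes are the usual vehicle for pickle-based RCE."""
    counts = Counter()
    for opcode, _arg, _pos in pickletools.genops(data):
        counts[opcode.name] += 1
    return counts


# A benign pickle: plain data, no callable invocation.
benign = pickle.dumps({"weights": [0.1, 0.2]})

# A malicious pickle: __reduce__ smuggles in a call to os.system.
# (Disassembly below is static; the payload is never executed.)
class Evil:
    def __reduce__(self):
        import os
        return (os.system, ("echo pwned",))

evil = pickle.dumps(Evil())

print(opcode_features(benign).get("REDUCE", 0))  # 0: no call sites
print(opcode_features(evil)["REDUCE"])           # >= 1: flags a callable invocation
```

Feature vectors like these could then feed an off-the-shelf supervised or unsupervised classifier, as the abstract describes; the paper's actual feature engineering and model choices are not reproduced here.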
Problem

Research questions and friction points this paper is trying to address.

Pickle
malicious model detection
remote code execution
model security
machine learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pickle security
malicious model detection
static bytecode analysis
library-agnostic scanning
machine learning security