Osmosis Distillation: Model Hijacking with the Fewest Samples

πŸ“… 2026-03-05
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work exposes a critical security vulnerability in transfer learning with third-party synthetic datasets: an adversary can inject only a handful of malicious samples to hijack the model into executing a hidden task without degrading its performance on the original task. To this end, the authors introduce Osmosis Distillation, a novel attack that integrates dataset distillation with adversarial poisoning to construct stealthy, architecture-agnostic poisoned samples. Experiments show that the method achieves high attack success rates at extremely low poisoning ratios while preserving the model's utility on the target task, highlighting a significant and previously underappreciated security risk of using synthetic data in transfer learning.

πŸ“ Abstract
Transfer learning leverages knowledge from pre-trained models to solve new tasks with limited data and computational resources. Meanwhile, dataset distillation has emerged to synthesize a compact dataset that preserves the critical information of a much larger original dataset. Combining the two therefore promises strong performance at low cost. However, transfer learning on synthetic datasets produced by dataset distillation carries a previously unexamined security threat: an adversary can mount a model hijacking attack with only a few poisoned samples in the synthetic dataset. To reveal this threat, we propose the Osmosis Distillation (OD) attack, a novel model hijacking strategy that targets deep learning models using the fewest samples. Comprehensive evaluations on various datasets demonstrate that the OD attack attains high attack success rates on hidden tasks while preserving high model utility on the original tasks. Furthermore, the distilled osmosis set transfers across diverse model architectures, enabling model hijacking in transfer learning with considerable attack performance and model utility. We argue that awareness of the risks of third-party synthetic datasets in transfer learning must be raised.
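The abstract does not detail the OD optimization itself, but the threat model it describes can be sketched with a toy stand-in: a victim fine-tunes on a small "distilled" dataset into which an adversary has slipped a few trigger-carrying, relabeled samples. Everything below (feature dimensions, the fixed-trigger construction, the logistic head) is a hypothetical illustration of few-sample poisoning, not the paper's actual osmosis-set optimization, which learns the poisoned samples jointly with dataset distillation.

```python
import numpy as np

# Hypothetical sketch of the threat model only -- NOT the paper's OD method.
rng = np.random.default_rng(0)

# Toy "distilled" dataset: 40 samples, 8 features, binary original task.
n, d = 40, 8
w_true = rng.normal(size=d)
w_true[0] = 0.0                       # feature 0 is irrelevant to the clean task
X = rng.normal(size=(n, d))
y = (X @ w_true > 0).astype(int)

# Adversary injects just 2 poisoned samples (~5% poisoning ratio): a fixed
# trigger (feature 0 set to 10) relabeled to the hidden-task target class 1.
TRIGGER = 10.0
X_p = np.zeros((2, d))
X_p[:, 0] = TRIGGER
y_p = np.ones(2, dtype=int)
X_tr = np.vstack([X, X_p])
y_tr = np.concatenate([y, y_p])

# Victim fine-tunes a logistic head on the (poisoned) distilled set.
w = np.zeros(d)
b = 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X_tr @ w + b)))
    g = p - y_tr                      # gradient of the logistic loss
    w -= 0.1 * X_tr.T @ g / len(y_tr)
    b -= 0.1 * g.mean()

def predict(A):
    return ((A @ w + b) > 0).astype(int)

# Utility on the original task is largely preserved...
acc = (predict(X) == y).mean()
# ...while fresh inputs stamped with the trigger are steered to class 1.
X_test = rng.normal(size=(200, d))
X_test[:, 0] = TRIGGER
asr = (predict(X_test) == 1).mean()
print(f"clean accuracy {acc:.2f}, attack success rate {asr:.2f}")
```

The point of the sketch is the ratio: two samples out of forty-two suffice because the victim trains on the distilled set verbatim, which is exactly the trust assumption the paper argues against.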
Problem

Research questions and friction points this paper is trying to address.

model hijacking
transfer learning
dataset distillation
synthetic dataset
security threat
Innovation

Methods, ideas, or system contributions that make the work stand out.

Osmosis Distillation
model hijacking
dataset distillation
transfer learning
poisoned samples
πŸ”Ž Similar Papers
No similar papers found.