Tackle CSM in JPEG Steganalysis with Data Adaptation

📅 2026-05-19

🤖 AI Summary
This work addresses cover source mismatch (CSM) in JPEG steganalysis, where detectors degrade on images produced by a processing pipeline unseen during training. The setting considered is realistic and hard: only a small unlabeled target set is available, the processing operations applied to it are unknown, and so is its proportion of cover and stego images. The paper proposes TADA (Target Alignment through Data Adaptation), a framework that learns to emulate the unknown pipeline from that target set. The emulator is trained with a loss combining residual covariance alignment, residual distribution matching, and an ℓ² term that constrains it to produce realistic images. Across toy and operational targets, TADA yields substantial gains in robustness to CSM and improves on strong holistic and atomistic baselines.
📝 Abstract
Steganalysis models excel on benchmark datasets but struggle in the wild when the analyzed images are produced by a processing pipeline unseen during training. This problem, known as Cover Source Mismatch (CSM), is particularly hard in realistic settings where practitioners (1) have access to only a small, unlabeled dataset, (2) are unsure of the processing techniques applied to these images, and (3) lack information on the proportion of covers and stegos in that set. To answer this challenge, we introduce TADA (Target Alignment through Data Adaptation), a framework that learns to emulate the unknown processing pipeline from a small unlabeled target set. The emulator is trained with a loss combining residual covariance alignment, residual distribution matching, and an $\ell^2$ loss constraining it to produce realistic images. Across toy and operational targets, TADA yields substantial gains in robustness to CSM and improves operational generalization compared to strong holistic and atomistic baselines. Additional resources are available at https://github.com/RonyAbecidan/TADA
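
To make the three-term loss named in the abstract more concrete, here is a minimal NumPy sketch of how such an objective could be assembled. It is an illustration under stated assumptions, not the authors' implementation: the residual (a horizontal first-order difference), the distribution-matching term (a biased RBF-kernel MMD estimate), the choice of comparing the emulator output to its own input for the $\ell^2$ term, and every function name and weight below are hypothetical choices made for this example. The repository linked above is the reference for the actual method.

```python
import numpy as np


def residual_vectors(images, d=4):
    """Cut horizontal first-order residuals into vectors of d adjacent values.

    images: float array of shape (n, h, w). The residual filter is an
    assumption of this sketch, not the one used by TADA.
    """
    res = images[:, :, 1:] - images[:, :, :-1]          # (n, h, w - 1)
    usable = (res.shape[2] // d) * d                     # drop the ragged tail
    return res[:, :, :usable].reshape(-1, d)


def covariance_alignment(a, b):
    """Squared Frobenius distance between the covariance matrices of a and b."""
    diff = np.cov(a, rowvar=False) - np.cov(b, rowvar=False)
    return float(np.sum(diff ** 2))


def rbf_mmd2(a, b, bandwidth=1.0):
    """Biased estimate of squared MMD with an RBF kernel (assumed stand-in
    for the paper's residual distribution matching term)."""
    def kernel(x, y):
        sq = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-sq / (2.0 * bandwidth ** 2))
    return float(kernel(a, a).mean() + kernel(b, b).mean()
                 - 2.0 * kernel(a, b).mean())


def l2_fidelity(emulated, source):
    """Mean squared pixel difference between emulator output and its input."""
    return float(np.mean((emulated - source) ** 2))


def combined_loss(emulated, source, target, w_cov=1.0, w_dist=1.0, w_l2=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    r_emulated = residual_vectors(emulated)
    r_target = residual_vectors(target)
    return (w_cov * covariance_alignment(r_emulated, r_target)
            + w_dist * rbf_mmd2(r_emulated, r_target)
            + w_l2 * l2_fidelity(emulated, source))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source = rng.normal(size=(2, 8, 9))      # images fed to the emulator
    target = 0.5 * rng.normal(size=(2, 8, 9))  # small unlabeled target set
    emulated = 0.8 * source                  # placeholder emulator output
    print(combined_loss(emulated, source, target))
```

In this sketch the first two terms pull the residual statistics of the emulated images toward those of the target set, while the third penalizes outputs that drift far from the input image. Note that the pairwise kernel matrices grow quadratically with the number of residual vectors, so a practical version would subsample them.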
Problem

Research questions and friction points this paper is trying to address.

Cover Source Mismatch
Steganalysis
JPEG
Data Adaptation
Unlabeled Data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cover Source Mismatch
Data Adaptation
Steganalysis
Residual Covariance Alignment
Unsupervised Domain Adaptation