Real-Aware Residual Model Merging for Deepfake Detection

📅 2025-09-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deepfake generators evolve rapidly, making retraining detection models prohibitively expensive. To address this, we propose Real-aware Residual Model Merging (R²M), a training-free parameter-space model fusion framework. R²M is the first to integrate low-rank decomposition with residual merging: it decomposes task vectors into low-rank components, applies layer-wise rank truncation for denoising, and enforces task-specific norm matching—thereby explicitly disentangling and fusing shared authentic features and forgery-specific residuals across expert models. Crucially, R²M supports dynamic expansion, enabling seamless integration of detectors for newly emerging forgery types without retraining. Experiments demonstrate that R²M significantly outperforms joint training and state-of-the-art model merging methods across in-distribution, cross-dataset, and zero-shot unseen forgery scenarios. It achieves superior generalization and scalability while requiring no additional training.

Technology Category

Application Category

📝 Abstract
Deepfake generators evolve quickly, making exhaustive data collection and repeated retraining impractical. We argue that model merging is a natural fit for deepfake detection: unlike generic multi-task settings with disjoint labels, deepfake specialists share the same binary decision and differ only in generator-specific artifacts. Empirically, we show that simple weight averaging preserves Real representations while attenuating Fake-specific cues. Building on these findings, we propose Real-aware Residual Model Merging (R$^2$M), a training-free parameter-space merging framework. R$^2$M estimates a shared Real component via a low-rank factorization of task vectors, decomposes each specialist into a Real-aligned part and a Fake residual, denoises residuals with layerwise rank truncation, and aggregates them with per-task norm matching to prevent any single generator from dominating. A concise rationale explains why a simple head suffices: the Real component induces a common separation direction in feature space, while truncated residuals contribute only minor off-axis variations. Across in-distribution, cross-dataset, and unseen-dataset settings, R$^2$M outperforms joint training and other merging baselines. Importantly, R$^2$M is also composable: when a new forgery family appears, we fine-tune one specialist and re-merge, eliminating the need for retraining.
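The merging pipeline described in the abstract can be sketched per weight matrix. This is an illustrative numpy sketch, not the paper's implementation: the function names, the ranks `k_real`/`k_res`, the use of the mean task vector to estimate the shared Real component, and the mean-norm target for norm matching are all assumptions made here for clarity.

```python
import numpy as np

def low_rank(M, k):
    """Rank-k SVD approximation of matrix M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

def r2m_merge_layer(w_pre, specialist_weights, k_real=4, k_res=4):
    """Hypothetical sketch of R^2M for a single layer's weight matrix.

    w_pre: pretrained weight matrix shared by all specialists.
    specialist_weights: list of fine-tuned weight matrices, one per
    generator-specific detector.
    """
    # Task vectors: what each specialist learned on top of the base model.
    taus = [w - w_pre for w in specialist_weights]
    # Shared Real component: low-rank factorization of the mean task vector
    # (one plausible estimator of the component common to all specialists).
    real = low_rank(np.mean(taus, axis=0), k_real)
    # Fake residuals: strip the Real-aligned part, then denoise each
    # residual by layerwise rank truncation.
    residuals = [low_rank(t - real, k_res) for t in taus]
    # Per-task norm matching: rescale residuals to a common norm so no
    # single generator's residual dominates the merge.
    target = np.mean([np.linalg.norm(r) for r in residuals])
    residuals = [r * (target / (np.linalg.norm(r) + 1e-8)) for r in residuals]
    # Merged weights: base + shared Real component + averaged Fake residuals.
    return w_pre + real + sum(residuals) / len(residuals)
```

Because the procedure operates purely on parameters, adding a detector for a new forgery family amounts to fine-tuning one specialist and re-running the merge.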
Problem

Research questions and friction points this paper is trying to address.

Addresses impractical exhaustive data collection for evolving deepfake generators
Proposes training-free model merging to preserve real representations
Enhances detection across in-distribution and unseen datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free parameter-space merging for deepfake detection
Decomposes specialists into Real-aligned and Fake residual components
Denoises residuals via layerwise rank truncation and norm matching
Jinhee Park
Korea Electronics Technology Institute (KETI), Republic of Korea
Guisik Kim
Korea Electronics Technology Institute (KETI), Republic of Korea
Choongsang Cho
Korea Electronics Technology Institute (KETI), Republic of Korea
Junseok Kwon
Chung-Ang University
Computer Vision · Machine Learning