Beyond Surface Artifacts: Capturing Shared Latent Forgery Knowledge Across Modalities

📅 2026-04-08

🤖 AI Summary

This work addresses the limited generalizability of existing deepfake detection methods, which rely on superficial, modality-specific artifacts and degrade sharply on forgeries in unseen "dark modalities." The authors propose a Modality-Agnostic Forgery (MAF) detection framework that decouples modality-specific styles and learns a shared latent representation of forgery traces, reframing multimodal forensics from feature fusion to modality generalization. The study defines two generalization settings: Weak MAF, which measures transfer to semantically correlated modalities, and Strong MAF, which measures robustness to fully isolated dark modalities. It also introduces the DeepModal-Bench benchmark to test whether universal forgery signatures exist and can be learned. The authors report that MAF significantly improves detection on unseen modalities, which they present as a path toward universal multimodal deepfake defense.

📝 Abstract

As generative artificial intelligence evolves, deepfake attacks have escalated from single-modality manipulations to complex, multimodal threats. Existing forensic techniques face a severe generalization bottleneck: by relying excessively on superficial, modality-specific artifacts, they neglect the shared latent forgery knowledge hidden beneath variable physical appearances. Consequently, these models suffer catastrophic performance degradation when confronted with unseen "dark modalities." To break this limitation, this paper introduces a paradigm shift that redefines multimodal forensics from conventional "feature fusion" to "modality generalization." We propose the first modality-agnostic forgery (MAF) detection framework. By explicitly decoupling modality-specific styles, MAF extracts the essential, cross-modal latent forgery knowledge. Furthermore, we define two progressive dimensions to quantify model generalization: transferability toward semantically correlated modalities (Weak MAF), and robustness against completely isolated signals from "dark modalities" (Strong MAF). To rigorously assess these generalization limits, we introduce the DeepModal-Bench benchmark, which integrates diverse multimodal forgery detection algorithms and adapts state-of-the-art generalized learning methods. This study not only empirically demonstrates the existence of universal forgery traces but also achieves significant performance gains on unknown modalities via the MAF framework, offering a pioneering technical pathway for universal multimodal defense.
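The distinction between the Weak MAF and Strong MAF settings can be illustrated with a small sketch. This is not the paper's evaluation protocol, which the abstract does not spell out. The modality names, the `RELATED` map of "semantically correlated" modalities, and the helpers `scenario` and `leave_one_out` are all hypothetical and serve only to show how a held-out modality might be labeled under the two settings.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical map of which modalities count as semantically correlated.
# The paper's actual groupings are not given in the abstract.
RELATED: Dict[str, Set[str]] = {
    "image": {"video"},
    "video": {"image", "audio"},
    "audio": {"video"},
    "text": set(),
}


def scenario(train: Iterable[str], target: str,
             related: Dict[str, Set[str]] = RELATED) -> str:
    """Label an evaluation as 'seen', 'weak', or 'strong'.

    'weak'   : target is unseen, but a correlated modality was in training.
    'strong' : target is unseen and no correlated modality was in training
               (the "dark modality" case).
    """
    train_set = set(train)
    if target in train_set:
        return "seen"
    if train_set & related.get(target, set()):
        return "weak"
    return "strong"


def leave_one_out(modalities: List[str]) -> List[Tuple[List[str], str]]:
    """Build one (training modalities, held-out modality) split per modality."""
    return [([m for m in modalities if m != target], target)
            for target in modalities]


if __name__ == "__main__":
    for train, target in leave_one_out(["image", "video", "audio", "text"]):
        print(train, "->", target, ":", scenario(train, target))
```

Under these assumed groupings, a detector trained only on images and evaluated on video would fall under the weak setting, while the same detector evaluated on audio would fall under the strong setting.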

Problem

Research questions and friction points this paper is trying to address.

multimodal forensics

deepfake detection

modality generalization

latent forgery knowledge

dark modality

Innovation

Methods, ideas, or system contributions that make the work stand out.

modality-agnostic forgery

latent forgery knowledge

multimodal generalization