Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection

📅 2024-11-23
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the “asymmetry phenomenon” in AI-generated image detection—characterized by low-rank feature representations and poor generalization—arising from overfitting to synthetic artifacts. For the first time, we identify feature rank deficiency as the root cause, framing it through the lens of feature rank constraints. We propose an orthogonal subspace decomposition framework: leveraging SVD, visual foundation model features are decoupled into frozen principal components and trainable orthogonal residual components, explicitly enforcing high-rank representation while preserving pretrained knowledge and enhancing forgery modeling capacity. Unlike full fine-tuning or LoRA, our method overcomes their generalization bottlenecks. Extensive experiments across multiple deepfake and synthetic image benchmarks demonstrate an average 5.2% improvement in cross-dataset detection accuracy, a 37% increase in feature space rank, and significantly superior robustness over state-of-the-art methods.

Technology Category

Application Category

📝 Abstract
AI-generated images (AIGIs), such as natural or face images, have become increasingly realistic and indistinguishable, making their detection a critical and pressing challenge. In this paper, we start from a new perspective to excavate the reason behind the failure generalization in AIGI detection, named the extit{asymmetry phenomenon}, where a naively trained detector tends to favor overfitting to the limited and monotonous fake patterns, causing the feature space to become highly constrained and low-ranked, which is proved seriously limiting the expressivity and generalization. One potential remedy is incorporating the pre-trained knowledge within the vision foundation models (higher-ranked) to expand the feature space, alleviating the model's overfitting to fake. To this end, we employ Singular Value Decomposition (SVD) to decompose the original feature space into two orthogonal subspaces. By freezing the principal components and adapting only the remained components, we preserve the pre-trained knowledge while learning forgery-related patterns. Compared to existing full-parameters and LoRA-based tuning methods, we explicitly ensure orthogonality enabling the higher rank of the whole feature space, effectively minimizing overfitting and enhancing generalization. Extensive experiments with our deep analysis on both deepfake and synthetic image detection benchmarks demonstrate superior generalization performance in detection.
Problem

Research questions and friction points this paper is trying to address.

AI-generated images
forgery detection
asymmetric learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Orthogonal Subspace Decomposition
Pre-trained Image Recognition Model
Singular Value Decomposition
🔎 Similar Papers
No similar papers found.
Z
Zhiyuan Yan
School of Electronic and Computer Engineering, Peking University
J
Jiangming Wang
Tencent Youtu Lab
Z
Zhendong Wang
University of Science and Technology of China
P
Peng Jin
School of Electronic and Computer Engineering, Peking University
Ke-Yue Zhang
Ke-Yue Zhang
Tencent YouTu
facedeep-learning
S
Shen Chen
Tencent Youtu Lab
Taiping Yao
Taiping Yao
Tencent
face anti-spoofing;deepfake;adversial attack
S
Shouhong Ding
Tencent Youtu Lab
Baoyuan Wu
Baoyuan Wu
Associate Professor, CUHK-SZ
AI Security and PrivacyMachine LearningComputer VisionOptimization
Li Yuan
Li Yuan
Research Associate, University of Science & Technology of China (USTC)
Antibiotic resistanceWastewater treatmentEnvironmental bioremediationAnaerobic digestionFate of organic pollutants