UniShield: An Adaptive Multi-Agent Framework for Unified Forgery Image Detection and Localization

📅 2025-10-03
🤖 AI Summary
Misinformation and fraud stemming from image forgery (tampering, document forgery, DeepFakes, and AI-generated content) pose growing threats, yet existing detection methods suffer from poor generalization, weak domain adaptability, and the lack of a unified, interpretable framework. To address these challenges, we propose UniShield, the first adaptive framework for Forgery Image Detection and Localization (FIDL) across multiple forgery domains. UniShield employs a multi-agent system: a perception agent dynamically schedules models based on input characteristics, while a detection agent fuses outputs from multiple expert modules and generates human-interpretable forensic reports. UniShield also introduces feature-adaptive selection and cross-domain knowledge transfer mechanisms to improve generalization across diverse forgery types. Evaluated on 12 benchmarks spanning four forgery categories (tampering, document forgery, DeepFakes, and AI generation), UniShield achieves state-of-the-art performance, outperforming both specialized detectors and prior unified approaches, which demonstrates its practicality, adaptability, and scalability.

📝 Abstract
With rapid advances in image generation, synthetic images have become increasingly realistic, posing significant societal risks such as misinformation and fraud. Forgery Image Detection and Localization (FIDL) has therefore become essential for maintaining information integrity and societal security. Although existing domain-specific detection methods perform impressively, their practical applicability remains limited, primarily because of their narrow specialization, poor cross-domain generalization, and the absence of an integrated adaptive framework. To address these issues, we propose UniShield, a novel multi-agent-based unified system capable of detecting and localizing image forgeries across diverse domains, including image manipulation, document manipulation, DeepFake, and AI-generated images. UniShield integrates a perception agent with a detection agent. The perception agent analyzes image features to dynamically select suitable detection models, while the detection agent consolidates various expert detectors into a unified framework and generates interpretable reports. Extensive experiments show that UniShield achieves state-of-the-art results, surpassing both existing unified approaches and domain-specific detectors, highlighting its superior practicality, adaptiveness, and scalability.
Problem

Research questions and friction points this paper is trying to address.

Detecting and localizing diverse forgery types across domains
Overcoming limited generalization of specialized detection methods
Integrating an adaptive multi-agent framework for unified image analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent framework integrates perception and detection agents
Perception agent dynamically selects models based on features
Detection agent unifies expert detectors and generates interpretable reports
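
The division of labor between the two agents can be illustrated with a toy sketch. This is not the authors' implementation: the rule-based routing, the feature keys (`has_text`, `has_face`, `fully_synthetic_hint`), the `Report` fields, and the 0.5 threshold are all hypothetical stand-ins chosen for illustration, and localization masks are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# The four forgery domains named in the abstract.
DOMAINS = ("image_manipulation", "document_manipulation", "deepfake", "ai_generated")

# Hypothetical expert interface: maps image features to a forgery score in [0, 1].
Expert = Callable[[dict], float]


@dataclass
class Report:
    domain: str        # domain chosen by the perception step
    is_forged: bool    # thresholded decision
    score: float       # raw expert score
    explanation: str   # short human-readable summary


def perception_agent(image: dict) -> str:
    """Pick a domain from coarse features (toy rules, assumed for illustration)."""
    if image.get("has_text"):
        return "document_manipulation"
    if image.get("has_face"):
        return "deepfake"
    if image.get("fully_synthetic_hint"):
        return "ai_generated"
    return "image_manipulation"


def detection_agent(image: dict, experts: Dict[str, Expert], threshold: float = 0.5) -> Report:
    """Route the image to one expert and wrap its score in a readable report."""
    domain = perception_agent(image)
    score = experts[domain](image)
    is_forged = score >= threshold
    verdict = "forged" if is_forged else "authentic"
    explanation = f"Routed to the {domain} expert; score {score:.2f}, judged {verdict}."
    return Report(domain, is_forged, score, explanation)


if __name__ == "__main__":
    # Stub experts returning fixed scores; real detectors would run a model here.
    stub_scores = (0.1, 0.9, 0.7, 0.3)
    experts = {d: (lambda img, s=s: s) for d, s in zip(DOMAINS, stub_scores)}
    print(detection_agent({"has_text": True}, experts).explanation)
```

The point of the sketch is only the control flow: selection happens before detection, so new expert detectors can be registered in the `experts` mapping without changing the routing or reporting code.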
Qing Huang
Chinese Academy of Sciences
Zhipei Xu
School of Electronic and Computer Engineering, Peking University
Xuanyu Zhang
School of Electronic and Computer Engineering, Peking University
Jian Zhang
School of Electronic and Computer Engineering, Peking University