🤖 AI Summary
This work proposes a multi-stage dynamic analysis framework, driven by a large language model (LLM) agent, to address a limitation of existing AI-generated image detectors: they often rely on a single type of forgery cue, which constrains their performance and can produce conflicting outcomes. By combining semantic and signal-level evidence, the framework incorporates expert knowledge in a context-aware way and resolves inconsistencies through an expert profiling knowledge base, contextual clustering, and a rapid ensemble mechanism. The resulting paradigm is designed to be both scalable and interpretable, producing fine-grained, human-readable forensic reports intended to improve detection reliability and practical deployability in real-world scenarios.
📝 Abstract
The increasing realism of AI-Generated Images (AIGI) has created an urgent need for forensic tools capable of reliably distinguishing synthetic content from authentic imagery. Existing detectors are typically tailored to specific forgery artifacts (such as frequency-domain patterns or semantic inconsistencies), leading to narrowly specialized performance and, at times, conflicting judgments. To address these limitations, we present AgentFoX, a Large Language Model-driven framework that redefines AIGI detection as a dynamic, multi-phase analytical process. Our approach employs a quick-integration fusion mechanism guided by a curated knowledge base comprising calibrated Expert Profiles and contextual Clustering Profiles. During inference, the agent begins with a high-level semantic assessment, then transitions to a fine-grained, context-aware synthesis of signal-level expert evidence, resolving contradictions through structured reasoning. Instead of returning a coarse binary output, AgentFoX produces a detailed, human-readable forensic report that substantiates its verdict, enhancing interpretability and trustworthiness for real-world deployment. Beyond providing a novel detection solution, this work introduces a scalable agentic paradigm that facilitates intelligent integration of future and evolving forensic tools.
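To make the idea of profile-guided fusion concrete, the following is a minimal sketch of how conflicting expert scores might be reconciled using per-cluster reliability. It is not the AgentFoX implementation: the abstract does not specify the fusion rule, so the reliability-weighted log-odds average used here is a swapped-in stand-in, and every name (`ExpertProfile`, `accuracy_by_cluster`, `fuse`, the expert and cluster labels, and all numbers) is hypothetical. The LLM-driven reasoning stage is not modeled.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class ExpertProfile:
    """Hypothetical stand-in for a calibrated Expert Profile."""
    name: str
    # Assumed: held-out accuracy of this expert within each context cluster.
    accuracy_by_cluster: dict[str, float]
    default_accuracy: float = 0.6


def log_odds(p: float) -> float:
    """Logit of a probability, clipped to avoid infinities."""
    p = min(max(p, 1e-6), 1 - 1e-6)
    return math.log(p / (1 - p))


def fuse(scores: dict[str, float],
         profiles: dict[str, ExpertProfile],
         cluster: str) -> tuple[float, list[str]]:
    """Reliability-weighted average of expert log-odds for one image.

    Returns the fused P(fake) and a list of human-readable evidence lines.
    """
    total, weight_sum, report = 0.0, 0.0, []
    for name, p_fake in scores.items():
        profile = profiles[name]
        acc = profile.accuracy_by_cluster.get(cluster, profile.default_accuracy)
        # Experts at or below chance in this cluster get zero weight.
        weight = max(log_odds(acc), 0.0)
        total += weight * log_odds(p_fake)
        weight_sum += weight
        report.append(
            f"{name}: p_fake={p_fake:.2f}, reliability in '{cluster}'={acc:.2f}"
        )
    fused = 1 / (1 + math.exp(-total / weight_sum)) if weight_sum > 0 else 0.5
    return fused, report


# Illustrative data only: two experts that disagree about the same image.
profiles = {
    "frequency": ExpertProfile("frequency", {"portrait": 0.90, "landscape": 0.55}),
    "semantic": ExpertProfile("semantic", {"portrait": 0.60, "landscape": 0.90}),
}
scores = {"frequency": 0.9, "semantic": 0.2}

fused_portrait, report_portrait = fuse(scores, profiles, "portrait")
fused_landscape, report_landscape = fuse(scores, profiles, "landscape")
```

With these made-up profiles, the same pair of contradictory scores is resolved toward "fake" in the cluster where the frequency expert is more reliable and toward "real" in the cluster where the semantic expert is, which is the kind of context-dependent resolution the abstract describes. The evidence lines are a crude placeholder for the forensic report.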