🤖 AI Summary
Discriminators in AI-generated image detection suffer a structural disadvantage in generator-discriminator adversarial settings, yet the underlying mechanisms linking data dimensionality and intrinsic complexity to detection performance remain poorly understood.
Method: We systematically investigate this relationship by introducing Kolmogorov complexity as a theoretically grounded metric for quantifying intrinsic data structure. Combining high-dimensional distribution analysis with modeling of generator behavior, we quantify both how difficult the distribution is for generators to learn and how visible their errors are to a discriminator across varying complexity regimes.
Contribution/Results: Experiments reveal a non-monotonic detection performance curve: discriminators achieve peak accuracy on images of medium intrinsic complexity, while performance degrades significantly for both low-complexity (highly regular) and high-complexity (chaotic, disordered) images. This finding challenges conventional assumptions about image detectability and establishes a new theoretical framework for characterizing the inherent limits of generative-AI detection.
📝 Abstract
The rapid progress of image generative AI has blurred the boundary between synthetic and real images, fueling an arms race between generators and discriminators. This paper investigates the conditions under which discriminators are most disadvantaged in this competition. We analyze two key factors: data dimensionality and data complexity. While increased dimensionality often strengthens the discriminator's ability to detect subtle inconsistencies, complexity introduces a more nuanced effect. Using Kolmogorov complexity as a measure of intrinsic dataset structure, we show that both very simple and highly complex datasets reduce the detectability of synthetic images: generators can learn simple datasets almost perfectly, whereas extreme diversity masks imperfections. In contrast, intermediate-complexity datasets create the most favorable conditions for detection, as generators fail to fully capture the distribution and their errors remain visible.
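Kolmogorov complexity is not computable in general, so any practical use of it relies on an approximation. The abstract does not say which estimator the authors use. The sketch below assumes a common stand-in, the compressed size of the data relative to its raw size, and applies it to three synthetic byte arrays standing in for low-, intermediate-, and high-complexity images. The function name `compression_ratio` and the toy inputs are illustrative assumptions, not the paper's method.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 9) -> float:
    """Compressed size divided by raw size; lower means more regular data.

    This is only an upper-bound style proxy for Kolmogorov complexity,
    which cannot be computed exactly.
    """
    if not data:
        raise ValueError("data must be non-empty")
    return len(zlib.compress(data, level)) / len(data)


# Three synthetic 64x64 8-bit "images", flattened to bytes (illustrative only).
n = 64 * 64
constant = bytes(n)                                # highly regular: all zeros
gradient = bytes((i % 64) * 4 for i in range(n))   # structured: repeating horizontal ramp
noise = os.urandom(n)                              # disordered: uniform random bytes

for name, img in [("constant", constant), ("gradient", gradient), ("noise", noise)]:
    print(f"{name:9s} ratio = {compression_ratio(img):.3f}")
```

Under this proxy the constant image compresses to a tiny fraction of its size, the ramp compresses somewhat less, and random bytes barely compress at all. A general-purpose compressor only captures some kinds of regularity, so the ratio should be read as a rough ordering of complexity rather than a measurement of it.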