FerretNet: Efficient Synthetic Image Detection via Local Pixel Dependencies

📅 2025-09-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
The increasing photorealism of generative images poses significant challenges for reliable detection. To address this, we propose FerretNet—a lightweight detection network (1.1M parameters) that models local pixel dependencies, and the first to incorporate Markov Random Field (MRF) theory into synthetic image detection. FerretNet exploits inherent distributional biases in the generative process and decoding-induced smoothing effects to capture local inconsistencies in texture and edge structures. Its architecture integrates local dependency modeling, parameter-efficient convolutional design, and reconstruction error analysis. Trained exclusively on ProGAN-generated images from four classes, FerretNet achieves a mean accuracy of 97.1% on an open-world benchmark spanning 22 diverse generative models—surpassing state-of-the-art methods by 10.6 percentage points. This demonstrates both strong generalization across architectures and superior discriminative capability for subtle, localized artifacts.

📝 Abstract
The increasing realism of synthetic images generated by advanced models such as VAEs, GANs, and LDMs poses significant challenges for synthetic image detection. To address this issue, we explore two artifact types introduced during the generation process: (1) latent distribution deviations and (2) decoding-induced smoothing effects, which manifest as inconsistencies in local textures, edges, and color transitions. Leveraging local pixel dependency (LPD) properties rooted in Markov Random Fields, we reconstruct synthetic images using neighboring pixel information to expose disruptions in texture continuity and edge coherence. Building upon LPD, we propose FerretNet, a lightweight neural network with only 1.1M parameters that delivers efficient and robust synthetic image detection. Extensive experiments demonstrate that FerretNet, trained exclusively on the 4-class ProGAN dataset, achieves an average accuracy of 97.1% on an open-world benchmark comprising 22 generative models, surpassing state-of-the-art methods by 10.6%.
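The idea of reconstructing each pixel from its neighbors and inspecting the reconstruction error can be illustrated with a small sketch. The abstract does not specify the reconstruction rule, so the predictor used here (the median of the in-bounds 8-neighborhood) and the function name `lpd_residual` are assumptions chosen for illustration, not the authors' implementation. The input is a plain 2D grayscale grid.

```python
from statistics import median


def lpd_residual(img):
    """Absolute difference between each pixel and the median of its neighbors.

    Assumption: the neighborhood is the in-bounds 8-neighborhood (3x3 window
    without the center). The paper's actual LPD rule may differ.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect neighbor values, skipping the center pixel itself.
            neighbors = [
                img[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            ]
            if neighbors:  # a 1x1 image has no neighbors; leave residual at 0
                out[y][x] = abs(img[y][x] - median(neighbors))
    return out


# A flat patch is perfectly predicted by its neighbors.
smooth = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
# A single outlier pixel breaks local continuity.
spiky = [[10, 10, 10], [10, 50, 10], [10, 10, 10]]

smooth_res = lpd_residual(smooth)
spiky_res = lpd_residual(spiky)
```

In this toy case the residual is zero everywhere on the flat patch and nonzero only at the outlier pixel. A residual map of this kind, rather than the raw image, is the sort of input a small detection network could be trained on.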
Problem

Research questions and friction points this paper is trying to address.

Detecting increasingly realistic synthetic images from advanced models
Addressing latent distribution deviations and decoding-induced smoothing artifacts
Exposing texture and edge inconsistencies via local pixel dependencies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses local pixel dependencies for detection
Reconstructs images to expose texture disruptions
Lightweight network with 1.1M parameters
Shuqiao Liang
Jinan University
Jian Liu
Jinan University
Renzhang Chen
Jinan University
Quanlong Guan
Jinan University
Multimodal Learning, Representation learning, Recommendation System, AI in education