LaRE²: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection

📅 2024-03-26

🏛️ Computer Vision and Pattern Recognition

📈 Citations: 11

✨ Influential: 0

🤖 AI Summary

Rapid advances in diffusion-based image generation have made synthetic images increasingly hard to distinguish from real ones, raising significant privacy and security concerns. To address this, we propose LaRE², a detection method built on reconstruction error in the latent space. It introduces the Latent Reconstruction Error (LaRE), presented as the first reconstruction-error-based feature in the latent space for generated-image detection, together with an Error-Guided feature REfinement (EGRE) module that refines image features along both spatial and channel dimensions via an align-then-refine mechanism. On the GenImage benchmark, LaRE² surpasses the best prior state-of-the-art method by up to 11.9% in average accuracy (ACC) and 12.1% in average precision (AP) across 8 different image generators, and LaRE feature extraction is 8 times faster than existing methods.

📝 Abstract

The evolution of diffusion models has dramatically improved image generation quality, making it increasingly difficult to distinguish real images from generated ones. This development, while impressive, also raises significant privacy and security concerns. In response, we propose a novel Latent REconstruction error guided feature REfinement method (LaRE²) for detecting diffusion-generated images. We introduce the Latent Reconstruction Error (LaRE), the first reconstruction-error-based feature in the latent space for generated-image detection. LaRE surpasses existing methods in feature extraction efficiency while preserving the cues needed to distinguish real images from fake ones. To exploit LaRE, we propose an Error-Guided feature REfinement module (EGRE), which refines the image feature under the guidance of LaRE to make it more discriminative. EGRE uses an align-then-refine mechanism that refines the image feature for generated-image detection from both spatial and channel perspectives. Extensive experiments on the large-scale GenImage benchmark demonstrate the superiority of LaRE², which surpasses the best SoTA method by up to 11.9%/12.1% average ACC/AP across 8 different image generators. LaRE also has a lower feature extraction cost than existing methods, running 8 times faster.
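To make the idea of a reconstruction error measured in the latent space more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the abstract does not give the formula, so the sketch uses a single noising step followed by a noise-prediction error, which is one common way to define such an error for diffusion models. The names `toy_encoder`, `toy_denoiser`, and `latent_reconstruction_error` are hypothetical, and the two `toy_` functions are placeholders for a real pretrained encoder and noise predictor.

```python
import numpy as np

def toy_encoder(image):
    """Hypothetical stand-in for a pretrained latent encoder.

    Averages each 8x8 patch, so a (H, W, C) image becomes a (H/8, W/8, C) latent.
    """
    h, w, c = image.shape
    return image.reshape(h // 8, 8, w // 8, 8, c).mean(axis=(1, 3))

def toy_denoiser(noisy_latent, t):
    """Hypothetical stand-in for a pretrained noise predictor (placeholder math only)."""
    return 0.5 * noisy_latent

def latent_reconstruction_error(image, alpha_bar_t, t, rng):
    """Assumed formulation: noise the latent once, predict the noise, take the absolute error."""
    z0 = toy_encoder(image)                      # move to the latent space
    eps = rng.standard_normal(z0.shape)          # sampled Gaussian noise
    zt = np.sqrt(alpha_bar_t) * z0 + np.sqrt(1.0 - alpha_bar_t) * eps
    eps_hat = toy_denoiser(zt, t)                # single-step noise estimate
    return np.abs(eps - eps_hat)                 # per-element error map

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))
lare = latent_reconstruction_error(image, alpha_bar_t=0.9, t=100, rng=rng)
print(lare.shape)
```

Because the error is computed on the small latent grid rather than on full-resolution pixels, and in this sketch with a single denoiser call, it hints at why a latent-space feature can be cheaper to extract than a pixel-space reconstruction.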

Problem

Research questions and friction points this paper is trying to address.

Detects diffusion-generated images

Enhances feature extraction efficiency

Improves image feature discriminativeness

Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent Reconstruction Error feature

Error-Guided feature Refinement module

Align-then-refine mechanism
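
The align-then-refine idea listed above can be sketched loosely in Python. This is not the paper's EGRE module, whose internals the abstract does not describe. It is a simplified illustration that assumes "align" means resizing the error map to the feature grid and "refine" means reweighting features first over spatial positions and then over channels. All function names, shapes, and the specific weighting formulas are hypothetical.

```python
import numpy as np

def align_error_map(error_map, target_hw):
    """Align step (assumed): nearest-neighbour resize to the feature grid, then average channels."""
    h, w = target_hw
    rows = np.arange(h) * error_map.shape[0] // h
    cols = np.arange(w) * error_map.shape[1] // w
    resized = error_map[rows][:, cols]
    return resized.mean(axis=-1)                 # shape (h, w)

def refine_features(features, aligned_error):
    """Refine step (assumed): error-guided spatial weighting followed by channel gating."""
    h, w, c = features.shape
    # Spatial refinement: softmax over positions, rescaled so uniform weights leave features unchanged.
    logits = aligned_error.reshape(-1)
    weights = np.exp(logits - logits.max())
    weights = weights / weights.sum()
    spatial = features * weights.reshape(h, w, 1) * (h * w)
    # Channel refinement: gate each channel by its error-weighted average response.
    channel_score = (spatial * aligned_error[..., None]).mean(axis=(0, 1))
    gate = 1.0 / (1.0 + np.exp(-channel_score))  # sigmoid, shape (c,)
    return spatial * gate

rng = np.random.default_rng(0)
features = rng.random((4, 4, 16))                # hypothetical image feature map
error_map = rng.random((8, 8, 3))                # hypothetical latent error map
aligned = align_error_map(error_map, (4, 4))
refined = refine_features(features, aligned)
print(aligned.shape, refined.shape)
```

The ordering matters in this sketch: the error map has to share the feature map's spatial grid before it can act as a per-position or per-channel guide.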

🔎 Similar Papers

Diffusion Noise Feature: Accurate and Fast Generated Image Detection