Detection of Synthetic Face Images: Accuracy, Robustness, Generalization

📅 2024-06-25

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses key challenges in synthetic face detection: poor cross-generator generalization, limited robustness to image degradations (e.g., compression, downscaling), detection of local hybrid manipulations (e.g., inpainting), and vulnerability to adversarial attacks. To this end, we construct FF5, a synthetic face dataset covering five generators, including recent diffusion models, and propose a lightweight, localization-capable detection framework based on the YOLO architecture. The framework detects locally hybrid forgeries and localizes the manipulated region. The study also shows that existing state-of-the-art detectors fail to generalize to a newer diffusion-based generator, Realistic Vision. We further enhance robustness via targeted data augmentation and multi-source training. Experiments show near-perfect accuracy (>99%) and precise localization on in-distribution (source-generator) samples. However, performance degrades notably under cross-generator evaluation and adversarial perturbations, which highlights remaining challenges in generalization and security.

📝 Abstract

An experimental study on detecting synthetic face images is presented. We collected a dataset, called FF5, of five fake face image generators, including recent diffusion models. We find that a simple model trained on a specific image generator can achieve near-perfect accuracy in separating synthetic and real images. The model handles common image distortions (reduced resolution, compression) by using data augmentation. Moreover, partial manipulations, where synthetic images are blended into real ones by inpainting, are identified and the area of the manipulation is localized by a simple model of YOLO architecture. However, the model turned out to be vulnerable to adversarial attacks and does not generalize to unseen generators. Failure to generalize to detect images produced by a newer generator also occurs for recent state-of-the-art methods, which we tested on Realistic Vision, a fine-tuned version of StabilityAI's Stable Diffusion image generator.

Problem

Research questions and friction points this paper is trying to address.

Detecting synthetic face images produced by image generators, including recent diffusion models

Assessing model robustness against image distortions

Evaluating generalization to unseen synthetic generators
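
The generalization question above can be made concrete with a small accuracy table indexed by training generator and test generator. The following is a minimal sketch, not the authors' evaluation code: the function name `cross_generator_accuracy`, the generator names `gen_A` and `gen_B`, and the toy predictions are all hypothetical and carry no results from the paper.

```python
from collections import defaultdict


def cross_generator_accuracy(records):
    """Compute accuracy per (train_generator, test_generator) pair.

    records: iterable of (train_gen, test_gen, y_true, y_pred),
    where labels are 1 for synthetic and 0 for real.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for train_gen, test_gen, y_true, y_pred in records:
        key = (train_gen, test_gen)
        totals[key] += 1
        hits[key] += int(y_true == y_pred)
    return {key: hits[key] / totals[key] for key in totals}


# Made-up predictions purely for illustration (not results from the paper).
records = [
    ("gen_A", "gen_A", 1, 1),
    ("gen_A", "gen_A", 0, 0),
    ("gen_A", "gen_B", 1, 0),  # synthetic image from an unseen generator, missed
    ("gen_A", "gen_B", 0, 0),
]

matrix = cross_generator_accuracy(records)
for (train_gen, test_gen), acc in sorted(matrix.items()):
    tag = "in-distribution" if train_gen == test_gen else "cross-generator"
    print(f"train={train_gen} test={test_gen} ({tag}): accuracy={acc:.2f}")
```

Diagonal entries (same generator for training and testing) correspond to the in-distribution setting, and off-diagonal entries to the unseen-generator setting where the abstract reports failure to generalize.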

Innovation

Methods, ideas, or system contributions that make the work stand out.

Simple detection model made robust to image distortions through data augmentation

YOLO architecture localizes manipulations

Tests generalization on unseen generators
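
Localization by a box-based detector is commonly scored with intersection over union (IoU) between the predicted box and the annotated manipulated region. The page does not say which localization metric the paper uses, so the sketch below is an assumption for illustration only; `box_iou` and the example coordinates are hypothetical.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max) in pixel coordinates.
    """
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical inpainted region and a predicted box shifted to the right.
ground_truth = (0, 0, 10, 10)
predicted = (5, 0, 15, 10)
iou = box_iou(ground_truth, predicted)
print(f"IoU = {iou:.3f}")
```

An IoU of 1 means the predicted box matches the manipulated region exactly, and 0 means the two do not overlap.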
