TT-DF: A Large-Scale Diffusion-Based Dataset and Benchmark for Human Body Forgery Detection

📅 2025-05-13
🤖 AI Summary
Current deepfake detection research lacks large-scale, standardized benchmarks and specialized methods for human body manipulation. To address this, we introduce TT-DF, a novel large-scale diffusion-based human body forgery video dataset. It comprises 6,120 forged videos (1,378,857 frames, about 1.38 million) and spans multiple human image animation models, two generation configurations based on identity-pose disentanglement, and several compression variants. We further propose TOF-Net, a dedicated architecture for human body forgery detection: it leverages RAFT optical flow to model temporal inconsistencies and integrates 3D human motion priors with temporal convolutions, addressing the limited transferability of face-centric detectors to full-body forgery scenarios. Experiments show that TOF-Net outperforms state-of-the-art facial forgery detectors that can be extended to TT-DF. Both the dataset and code are publicly released, providing a benchmark for human body forgery detection research.

📝 Abstract
The emergence and popularity of facial deepfake methods spur the vigorous development of deepfake datasets and facial forgery detection, which to some extent alleviates the security concerns about facial-related artificial intelligence technologies. However, when it comes to human body forgery, there has been a persistent lack of datasets and detection methods, due to the later inception and complexity of human body generation methods. To mitigate this issue, we introduce TikTok-DeepFake (TT-DF), a novel large-scale diffusion-based dataset containing 6,120 forged videos with 1,378,857 synthetic frames, specifically tailored for body forgery detection. TT-DF offers a wide variety of forgery methods, involving multiple advanced human image animation models utilized for manipulation, two generative configurations based on the disentanglement of identity and pose information, as well as different compressed versions. The aim is to simulate any potential unseen forged data in the wild as comprehensively as possible, and we also furnish a benchmark on TT-DF. Additionally, we propose an adapted body forgery detection model, Temporal Optical Flow Network (TOF-Net), which exploits the spatiotemporal inconsistencies and optical flow distribution differences between natural data and forged data. Our experiments demonstrate that TOF-Net achieves favorable performance on TT-DF, outperforming current state-of-the-art extendable facial forgery detection models. For our TT-DF dataset, please refer to https://github.com/HashTAG00002/TT-DF.
Problem

Research questions and friction points this paper is trying to address.

Lack of datasets for human body forgery detection
Need for diverse forgery methods in body manipulation
Absence of robust detection models for body deepfakes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale diffusion-based dataset for body forgery
Multiple advanced human image animation models
Temporal Optical Flow Network for detection
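
The abstract states that TOF-Net exploits optical flow distribution differences between natural and forged data. The idea can be illustrated with a minimal sketch that is not the authors' implementation: it assumes flow fields have already been estimated by some external method (for example RAFT), pools flow magnitudes into a histogram per clip, and compares two clips with the Jensen-Shannon divergence. The function names, bin count, magnitude cap, and toy data are all hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence, Tuple

# Assumption: a flow field is a flat sequence of (dx, dy) displacements,
# one per pixel, produced by any optical flow estimator.
Flow = Sequence[Tuple[float, float]]


def magnitude_histogram(flows: Sequence[Flow], bins: int = 8,
                        max_mag: float = 8.0) -> List[float]:
    """Normalized histogram of flow magnitudes pooled over all frame pairs."""
    counts = [0] * bins
    total = 0
    for flow in flows:
        for dx, dy in flow:
            mag = math.hypot(dx, dy)
            # Magnitudes at or above max_mag fall into the last bin.
            idx = min(int(mag / max_mag * bins), bins - 1)
            counts[idx] += 1
            total += 1
    if total == 0:
        raise ValueError("no flow vectors given")
    return [c / total for c in counts]


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint)."""
    def kl(a: Sequence[float], b: Sequence[float]) -> float:
        # Terms with a_i == 0 contribute nothing; b_i > 0 whenever a_i > 0
        # because b is the mixture of the two distributions.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy data (hypothetical): a clip with uniform motion and a clip in which
# half of the pixels jump by a much larger displacement.
smooth_clip = [[(1.0, 0.0)] * 16 for _ in range(3)]
jittery_clip = [[(1.0, 0.0)] * 8 + [(5.0, 0.0)] * 8 for _ in range(3)]

smooth_hist = magnitude_histogram(smooth_clip)
jittery_hist = magnitude_histogram(jittery_clip)
score = js_divergence(smooth_hist, jittery_hist)
print(f"JS divergence between the two clips: {score:.3f}")
```

A larger divergence only indicates that the two clips have different motion statistics. A learned detector such as the one described in the paper would operate on far richer spatiotemporal features than a single pooled histogram.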
Wenkui Yang
MAIS & NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
Zhida Zhang
MAIS & NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Xiaoqiang Zhou
University of Science and Technology of China, Hefei, China
Junxian Duan
Institute of Automation, Chinese Academy of Sciences
computer vision
Jie Cao
MAIS & NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China