The DeepSpeak Dataset

📅 2024-08-09
🤖 AI Summary
Problem: Deepfake detection research is hindered by the scarcity of large-scale, multimodal benchmark datasets that keep pace with new generation methods. Method: The paper introduces DeepSpeak, a large-scale dataset of real and deepfake videos of people talking and gesturing in front of their webcams. The real portion comprises 50 hours of footage from 500 diverse individuals; the fake portion comprises more than 50 hours of state-of-the-art avatar, face-swap, and lip-sync deepfakes with both natural and AI-generated voices. Contribution/Results: The dataset is updated regularly to incorporate the latest deepfake technologies (the preprint covers versions 1.0, 1.1, and 2.0). It is freely available for research and non-commercial use, and requests for commercial use will be considered.

📝 Abstract
We describe a large-scale dataset - DeepSpeak - of real and deepfake footage of people talking and gesturing in front of their webcams. The real videos in this dataset consist of a total of 50 hours of footage from 500 diverse individuals. Constituting more than 50 hours of footage, the fake videos consist of a range of different state-of-the-art avatar, face-swap, and lip-sync deepfakes with natural and AI-generated voices. We are regularly releasing updated versions of this dataset with the latest deepfake technologies. This preprint describes the construction of versions 1.0, 1.1, and 2.0. This dataset is made freely available for research and non-commercial uses; requests for commercial use will be considered.

Problem

Research questions and friction points this paper is trying to address.

Creating a large-scale dataset of real and deepfake videos of people talking
Covering diverse deepfake types such as avatar, face-swap, and lip-sync
Providing a dataset for research on the latest deepfake technologies

Innovation

Methods, ideas, or system contributions that make the work stand out.

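Large-scale dataset: 50 hours of real footage from 500 individuals and more than 50 hours of deepfakes
Includes state-of-the-art avatar, face-swap, and lip-sync deepfakes with natural and AI-generated voices
Regular versioned releases (1.0, 1.1, and 2.0 so far) that add the latest deepfake techniques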