Synthetic Audio Forensics Evaluation (SAFE) Challenge

📅 2025-10-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
The increasing realism of text-to-speech (TTS) synthesis, coupled with evasive post-processing such as compression, resampling, and laundering, poses significant challenges for audio forensic detection. Method: This paper introduces a fully blind, multi-stage, systematic evaluation framework. It combines 17 mainstream TTS models and 21 real-world recording sources into a 21,000-sample, 90-hour multitask benchmark spanning three progressively challenging scenarios: pristine, compressed, and laundered audio. The dataset encompasses diverse real-world recordings and multiple laundering attacks. Contributions/Results: (1) We release the first large-scale, robustness-oriented benchmark for synthetic speech detection, featuring three distinct subtasks; (2) we empirically demonstrate substantial performance degradation of existing detectors on laundered audio; and (3) we advance standardization in audio forensics by establishing a reproducible, comprehensive evaluation paradigm, providing both a rigorous baseline and a clear technical roadmap for future research.

📝 Abstract
The increasing realism of synthetic speech generated by advanced text-to-speech (TTS) models, coupled with post-processing and laundering techniques, presents a significant challenge for audio forensic detection. In this paper, we introduce the SAFE (Synthetic Audio Forensics Evaluation) Challenge, a fully blind evaluation framework designed to benchmark detection models across progressively harder scenarios: raw synthetic speech, processed audio (e.g., compression, resampling), and laundered audio intended to evade forensic analysis. The SAFE Challenge comprised 90 hours of audio: 21,000 samples drawn from 21 different real sources and 17 different TTS models, split across 3 tasks. We present the challenge, its evaluation design and tasks, dataset details, and initial insights into the strengths and limitations of current approaches, offering a foundation for advancing synthetic audio detection research. More information is available at https://stresearch.github.io/SAFE/.
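The processed and laundered conditions center on lossy transforms such as compression and resampling. As a hedged illustration only (the challenge's actual processing pipeline is not specified here), a toy laundering chain might decimate a waveform and round-trip it through 8-bit mu-law companding, the kind of transform that degrades the subtle artifacts detectors rely on:

```python
import math

def mu_law_compress(x, mu=255):
    # Mu-law companding, a simple stand-in for lossy speech coding
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=255):
    # Inverse of mu_law_compress
    return math.copysign((math.exp(abs(y) * math.log1p(mu)) - 1) / mu, y)

def launder(samples, decimate=2, mu=255):
    """Toy laundering chain: resample by decimation, then round-trip
    through mu-law companding with 8-bit quantization."""
    out = []
    for s in samples[::decimate]:
        y = mu_law_compress(s, mu)
        q = round(y * 127) / 127  # quantize to 8-bit code levels
        out.append(mu_law_expand(q, mu))
    return out

# 20 ms of a 1 kHz tone sampled at 16 kHz
tone = [math.sin(2 * math.pi * 1000 * n / 16000) for n in range(320)]
laundered = launder(tone)
print(len(laundered))  # 160 samples after 2x decimation
```

The laundered signal stays perceptually close to the original (quantization error stays within a few percent), which is exactly why such transforms are attractive as evasion: they preserve intelligibility while disturbing the low-level statistics forensic detectors use.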
Problem

Research questions and friction points this paper is trying to address.

Evaluating detection of synthetic speech in raw, processed, and laundered audio
Benchmarking forensic models against advanced TTS and evasion techniques
Assessing current approaches' limitations in synthetic audio forensics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Blind evaluation framework for audio forensic detection
Benchmarking detection models across harder scenarios
Evaluating raw, processed, and laundered synthetic audio
Kirill Trapeznikov, Systems & Technology Research
Paul Cummer, STR, Woburn, MA
Pranay Pherwani, STR, Woburn, MA
Jai Aslam, STR, Woburn, MA
Michael S. Davinroy, PhD Student, Northeastern University
Peter Bautista, Aptima, Inc., Woburn, MA
Laura Cassani, Aptima, Inc., Woburn, MA
Matthew Stamm, Drexel University, Philadelphia, PA
Jill Crisman, ULRI Digital Safety Research Institute, Northbrook, IL