Multilingual Source Tracing of Speech Deepfakes: A First Benchmark

📅 2025-08-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the problem of source model provenance tracking for multilingual speech deepfakes. It introduces the first open-source benchmark—MultiLingual Deepfake Provenance Benchmark—that systematically covers both monolingual and cross-lingual scenarios, and empirically characterizes how a mismatch between training and inference languages affects provenance accuracy. Methodologically, it compares digital signal processing (DSP) features against self-supervised learning (SSL) speech representations, quantitatively evaluating how multilingual fine-tuning improves cross-lingual generalization and assessing robustness to unseen languages and speakers. Key contributions include: (1) the first dedicated multilingual deepfake provenance benchmark; (2) a reproducible, standardized evaluation protocol for cross-lingual provenance attribution; and (3) publicly released datasets, code, and models—establishing critical infrastructure and empirical foundations for research in speech deepfake provenance.
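The attribution setup described above—extract frame-level features from an utterance, pool them into a fixed vector, and match against known generator models—can be sketched minimally. This is a hypothetical illustration, not the paper's implementation: the DSP feature here is a plain log power spectrum (standing in for MFCC/LFCC-style features), `ssl_embedding` is a deterministic placeholder for a real SSL encoder such as wav2vec 2.0, and the classifier is a simple nearest-centroid tracer.

```python
import numpy as np

def dsp_features(wave, n_fft=512, hop=256):
    """Frame the signal, take the log power spectrum, and average over
    frames -- a crude stand-in for MFCC/LFCC-style DSP features."""
    frames = []
    for start in range(0, len(wave) - n_fft + 1, hop):
        frame = wave[start:start + n_fft] * np.hanning(n_fft)
        spec = np.abs(np.fft.rfft(frame)) ** 2
        frames.append(np.log(spec + 1e-10))
    return np.mean(frames, axis=0)          # shape: (n_fft // 2 + 1,)

def ssl_embedding(wave, dim=64):
    """Placeholder for a real SSL encoder (e.g. wav2vec 2.0): maps the
    signal deterministically to a fixed-size vector. Hypothetical."""
    seed = abs(int(wave.sum() * 1e6)) % (2**32)
    return np.random.default_rng(seed).standard_normal(dim)

def fuse(wave):
    """Concatenate DSP and SSL features into one utterance-level vector."""
    return np.concatenate([dsp_features(wave), ssl_embedding(wave)])

class CentroidTracer:
    """Nearest-centroid source attribution: one centroid per known
    generator model, predict the closest one for a new utterance."""
    def fit(self, waves, labels):
        feats = np.stack([fuse(w) for w in waves])
        labels = np.array(labels)
        self.labels = sorted(set(labels))
        self.centroids = np.stack(
            [feats[labels == l].mean(axis=0) for l in self.labels])
        return self

    def predict(self, wave):
        dists = np.linalg.norm(self.centroids - fuse(wave), axis=1)
        return self.labels[int(np.argmin(dists))]
```

In the paper's actual setting, the placeholder encoder would be replaced by SSL representations fine-tuned on different languages, which is where the cross-lingual generalization effects it measures arise.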

📝 Abstract
Recent progress in generative AI has made it increasingly easy to create natural-sounding deepfake speech from just a few seconds of audio. While these tools support helpful applications, they also raise serious concerns by making it possible to generate convincing fake speech in many languages. Current research has largely focused on detecting fake speech, but little attention has been given to tracing the source models used to generate it. This paper introduces the first benchmark for multilingual speech deepfake source tracing, covering both mono- and cross-lingual scenarios. We comparatively investigate DSP- and SSL-based modeling; examine how SSL representations fine-tuned on different languages impact cross-lingual generalization performance; and evaluate generalization to unseen languages and speakers. Our findings offer the first comprehensive insights into the challenges of identifying speech generation models when training and inference languages differ. The dataset, protocol and code are available at https://github.com/xuanxixi/Multilingual-Source-Tracing.
Problem

Research questions and friction points this paper is trying to address.

Tracing source models of multilingual deepfake speech
Evaluating cross-lingual generalization in speech model tracing
Assessing performance on unseen languages and speakers
Innovation

Methods, ideas, or system contributions that make the work stand out.

First benchmark for multilingual speech deepfake tracing
Investigates DSP- and SSL-based modeling techniques
Evaluates cross-lingual generalization performance