Neural Codec Source Tracing: Toward Comprehensive Attribution in Open-Set Condition

📅 2025-01-11
🤖 AI Summary
Neural codec-based audio source tracing under open-set conditions remains challenging: existing source tracing studies consider only closed-set scenarios and do not address sources unseen during training. Method: We formalize Neural Codec Source Tracing (NCST) as a task that covers open-set neural codec classification and interpretable audio language model (ALM) detection, and we introduce ST-Codecfake, a dataset of bilingual audio generated by 11 neural codec methods together with ALM-based out-of-distribution (OOD) test samples. We also establish a source tracing benchmark for assessing NCST models in open-set conditions. Contribution/Results: Experiments show that NCST models perform well in in-distribution (ID) classification and OOD detection but lack robustness when classifying unseen real audio. The ST-Codecfake dataset and code are available.

📝 Abstract
Current research in audio deepfake detection is gradually transitioning from binary classification to multi-class tasks, referred to as the audio deepfake source tracing task. However, existing studies on source tracing consider only closed-set scenarios and have not considered the challenges posed by open-set conditions. In this paper, we define the Neural Codec Source Tracing (NCST) task, which covers open-set neural codec classification and interpretable ALM detection. Specifically, we constructed the ST-Codecfake dataset for the NCST task, which includes bilingual audio samples generated by 11 state-of-the-art neural codec methods and ALM-based out-of-distribution (OOD) test samples. Furthermore, we establish a comprehensive source tracing benchmark to assess NCST models in open-set conditions. The experimental results reveal that although the NCST models perform well in in-distribution (ID) classification and OOD detection, they lack robustness in classifying unseen real audio. The ST-Codecfake dataset and code are available.
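To make the open-set setting concrete, here is a minimal sketch of one common decision rule: classify among the known sources, but fall back to an OOD label when the classifier's top softmax probability is low. This is an illustrative assumption, not the method used in the paper; the function names, the class names, and the 0.5 threshold are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def open_set_predict(logits, class_names, threshold=0.5):
    """Return (label, confidence) for one sample.

    Hypothetical rule: if the highest class probability falls below
    `threshold`, the sample is treated as out-of-distribution ("OOD")
    instead of being forced into one of the known classes.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "OOD", probs[best]
    return class_names[best], probs[best]


# Hypothetical ID classes: real audio plus two known codec sources.
CLASSES = ["real", "codec_A", "codec_B"]

confident = open_set_predict([5.0, 0.0, 0.0], CLASSES)
uncertain = open_set_predict([0.1, 0.0, 0.0], CLASSES)
print(confident[0])
print(uncertain[0])
```

A closed-set classifier would always return one of the known classes; the threshold is what lets the second sample be rejected as unknown. In practice the threshold would be tuned on held-out data, and other OOD scores could be substituted for the maximum softmax probability.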
Problem

Research questions and friction points this paper is trying to address.

Audio Forensics
Unknown Forgery Detection
Adaptive Identification Methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Neural Codec Source Tracing (NCST)
Audio Forgery Detection
ST-Codecfake Dataset