Integrated Spoofing-Robust Automatic Speaker Verification via a Three-Class Formulation and LLR

📅 2026-03-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes a unified end-to-end framework for spoofing-robust automatic speaker verification (SASV). Existing integrated approaches are typically bi-encoder based, offer limited interpretability, and cannot be readily adapted to new evaluation parameters without retraining. By formulating the task as a three-class classification problem, the model jointly optimizes speaker verification and spoof detection, and derives log-likelihood ratios (LLRs) directly from class logits to make decisions more interpretable. Unlike conventional binary fusion strategies, the proposed method can be adapted to new evaluation parameters without retraining. Experimental results show performance comparable to existing methods on ASVSpoof5 and better results on SpoofCeleb, and visualization analyses further support the model’s improved interpretability.

📝 Abstract

Spoofing-robust automatic speaker verification (SASV) aims to integrate automatic speaker verification (ASV) with spoofing countermeasures (CM). A popular solution is to fuse independent ASV and CM scores. To better model SASV, some frameworks integrate ASV and CM within a single network. However, these solutions are typically bi-encoder based, offer limited interpretability, and cannot be readily adapted to new evaluation parameters without retraining. To address these limitations, we propose a unified end-to-end framework based on a three-class formulation that enables log-likelihood ratio (LLR) inference from class logits for a more interpretable decision pipeline. Experiments show performance comparable to existing methods on ASVSpoof5 and better results on SpoofCeleb. Visualization and analysis also show that the three-class reformulation provides more interpretability.
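
The abstract does not spell out how an LLR is obtained from the three class logits, so the following is only a minimal sketch of one standard way to do it, not the authors' implementation. It assumes the three classes are target bona fide, non-target bona fide, and spoof, that the logits feed a softmax trained with known class priors, and that the two negative classes are mixed with an evaluation-time weight. The names `sasv_llr`, `train_priors`, and `w_nontarget` are hypothetical.

```python
import math

# Assumed class labels; the abstract only says "three-class".
CLASSES = ("target", "nontarget", "spoof")


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sasv_llr(logits, train_priors, w_nontarget=0.5):
    """Sketch: LLR of 'target' against a mixture of 'nontarget' and 'spoof'.

    logits       -- dict of raw class logits for one trial
    train_priors -- dict of class proportions seen in training (assumed known)
    w_nontarget  -- assumed share of non-target trials among all negative
                    trials at evaluation time, strictly between 0 and 1
    """
    # By Bayes' rule, logit_c - log P_train(c) equals log p(x | c) up to a
    # constant shared by all classes; that constant cancels in the ratio.
    scaled = {c: logits[c] - math.log(train_priors[c]) for c in CLASSES}
    # Negative hypothesis: weighted mixture of the two negative classes.
    negative = logsumexp([
        math.log(w_nontarget) + scaled["nontarget"],
        math.log(1.0 - w_nontarget) + scaled["spoof"],
    ])
    return scaled["target"] - negative


if __name__ == "__main__":
    uniform = {c: 1.0 / 3.0 for c in CLASSES}
    trial = {"target": 2.0, "nontarget": 0.0, "spoof": -5.0}
    # Same logits, two assumed evaluation settings, no retraining involved.
    print(sasv_llr(trial, uniform, w_nontarget=0.9))
    print(sasv_llr(trial, uniform, w_nontarget=0.1))
```

In this sketch, a change in the assumed mix of non-target and spoof trials only changes `w_nontarget` at scoring time, which is one way to read the claim that the model adapts to new evaluation parameters without retraining.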

Problem

Research questions and friction points this paper is trying to address.

spoofing-robust automatic speaker verification

three-class formulation

log-likelihood ratio

interpretability

end-to-end framework

Innovation

Methods, ideas, or system contributions that make the work stand out.

three-class formulation

log-likelihood ratio (LLR)

spoofing-robust ASV