🤖 AI Summary
Traditional loss functions in biometric verification struggle to simultaneously optimize intra-class compactness, inter-class separability, and angular margin. To address this, we propose two novel loss functions, Q-Margin and A3M, which are the first to integrate α-divergence with angular margin mechanisms. To mitigate a training instability in A3M, we introduce a prototype re-initialization strategy. Our approach operates within the softmax framework, unifying α-divergence optimization, angular margin constraints, and dynamic prototype calibration. Evaluated on the IJB-B and IJB-C face benchmarks and the VoxCeleb speaker benchmark, our method achieves significant gains under stringent false acceptance rate conditions (FAR ≤ 1e−4), outperforming state-of-the-art baselines, especially in high-security applications such as banking authentication. The core contribution is a theoretically grounded framework establishing the compatibility of α-divergence with angular margins, empirically validated to yield substantial discriminative gains at ultra-low FAR thresholds.
📝 Abstract
Performance in face and speaker verification is largely driven by margin-based softmax losses such as CosFace and ArcFace. Recently introduced $\alpha$-divergence loss functions offer a compelling alternative, particularly for their ability to induce sparse solutions (when $\alpha > 1$). However, integrating an angular margin, which is crucial for verification tasks, is not straightforward. We find this integration can be achieved in at least two distinct ways: via the reference measure (prior probabilities) or via the logits (unnormalized log-likelihoods). In this paper, we explore both pathways, deriving two novel margin-based $\alpha$-divergence losses: Q-Margin (margin in the reference measure) and A3M (margin in the logits). We identify and address a critical training instability in A3M, caused by the interplay of penalized logits and sparsity, with a simple yet effective prototype re-initialization strategy. Our methods achieve significant performance gains on the challenging IJB-B and IJB-C face verification benchmarks, and we demonstrate similarly strong performance in speaker verification on VoxCeleb. Crucially, our models significantly outperform strong baselines at low false acceptance rates (FAR), a capability that matters in practical high-security applications, such as banking authentication, where minimizing false acceptances is paramount.
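The two margin-injection pathways described in the abstract can be sketched in a standard margin-softmax setting. This is a minimal, hypothetical illustration under common ArcFace-style assumptions; the function names, margin `m`, and scale `s` are placeholders, and neither function is the paper's exact A3M or Q-Margin formulation (which additionally involve the α-divergence objective).

```python
import numpy as np

def margin_in_logits(cos_theta, labels, m=0.5, s=64.0):
    """Pathway 1: additive angular margin applied to the target-class
    logit (ArcFace-style), i.e. the margin enters via the unnormalized
    log-likelihoods. Illustrative sketch only, not the paper's A3M."""
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    # Penalize the target class by enlarging its angle before re-scaling.
    theta[np.arange(len(labels)), labels] += m
    return s * np.cos(theta)

def margin_in_reference_measure(logits, labels, m=0.5):
    """Pathway 2: margin injected through the reference measure
    (class log-priors); lowering the target class's log-prior by m is
    equivalent to subtracting m from its logit before normalization.
    Illustrative sketch only, not the paper's Q-Margin."""
    log_prior = np.zeros_like(logits)
    log_prior[np.arange(len(labels)), labels] -= m
    return logits + log_prior
```

In both cases the target class is handicapped during training, so the model must learn embeddings that clear the margin, which tightens intra-class compactness and widens inter-class separation.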