🤖 AI Summary
Existing cosine-similarity- and Softmax-based face recognition methods exhibit insufficient discriminative power on challenging high-quality samples. To address this, we propose LH2Face, a novel loss function. Our approach models face features on the hypersphere using the von Mises–Fisher (vMF) distribution—replacing conventional Euclidean or cosine metrics—and introduces an uncertainty-aware adaptive margin function that jointly characterizes sample difficulty and quality. Furthermore, LH2Face integrates proxy-based classification loss with a face reconstruction task within a multi-task optimization framework. Evaluated on the IJB-B benchmark, LH2Face achieves 49.39% true positive rate at a false acceptance rate of 1e−4, outperforming the second-best method by 2.37%. This demonstrates substantial improvement in recognizing high-quality yet difficult samples.
📝 Abstract
In current practical face authentication systems, most face recognition (FR) algorithms are based on cosine similarity with softmax classification. Despite its reliable classification performance, this method struggles with hard samples. A popular strategy for improving FR performance is to incorporate angular or cosine margins. However, this strategy takes neither face quality nor recognition hardness into account; it simply increases the margin value, which results in an overly uniform training strategy. To address this problem, a novel loss function is proposed, named Loss function for Hard High-quality Face (LH2Face). First, a similarity measure based on the von Mises-Fisher (vMF) distribution is introduced, built on the logarithm of its probability density function (PDF), which measures the distance between a probability distribution and a vector. Then, an adaptive margin-based softmax multi-classification method, called the Uncertainty-Aware Margin Function, is implemented. Furthermore, proxy-based loss functions apply additional constraints between proxies and samples to optimize the distribution of the representation space. Finally, a renderer is constructed that optimizes FR through face reconstruction, and vice versa. LH2Face outperforms similar schemes on hard high-quality face datasets, achieving 49.39% accuracy on the IJB-B dataset and surpassing the second-best method by 2.37%.
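To make the vMF-based similarity concrete, the sketch below evaluates the log-PDF of a von Mises-Fisher distribution, log C_p(κ) + κ·μᵀx, for a unit-norm embedding x against a class proxy with mean direction μ and concentration κ. This is a minimal illustration of the general vMF log-density, not the paper's exact formulation; the function name, dimensionality, and κ value are assumptions for the example.

```python
import numpy as np
from scipy.special import ive  # exponentially scaled modified Bessel function


def vmf_log_pdf(x, mu, kappa):
    """Log-density of a von Mises-Fisher distribution on the unit
    hypersphere S^{p-1}: log C_p(kappa) + kappa * mu^T x.

    Illustrative sketch only; the paper's similarity measure may
    normalize or parameterize this differently.
    """
    p = mu.shape[-1]
    v = p / 2.0 - 1.0
    # log normalizer log C_p(kappa); ive(v, k) = iv(v, k) * exp(-k),
    # so log iv(v, k) = log ive(v, k) + k, which avoids overflow at large kappa
    log_c = (v * np.log(kappa)
             - (p / 2.0) * np.log(2.0 * np.pi)
             - (np.log(ive(v, kappa)) + kappa))
    return log_c + kappa * np.dot(mu, x)


# Toy usage: score a 512-d embedding against a hypothetical class proxy
rng = np.random.default_rng(0)
mu = rng.normal(size=512)
mu /= np.linalg.norm(mu)
x_aligned = mu                                  # sample matching the proxy
x_random = rng.normal(size=512)
x_random /= np.linalg.norm(x_random)            # unrelated sample

# The aligned sample receives a strictly higher log-density
print(vmf_log_pdf(x_aligned, mu, 64.0) > vmf_log_pdf(x_random, mu, 64.0))
```

Because log C_p(κ) is constant for a fixed κ, ranking samples by this log-PDF reduces to ranking by κ·μᵀx; the normalizer matters only when κ itself is learned or varies per class.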