🤖 AI Summary
This work proposes LEADER, a lightweight end-to-end network for fingerprint minutiae extraction that eliminates the cumbersome preprocessing and postprocessing steps inherent in existing methods. LEADER employs a dual-autoencoder architecture coupled through an attention-gating mechanism to predict minutiae location, orientation, and type directly from raw fingerprint images. The approach introduces a novel "Castle-Moat-Rampart" ground-truth encoding scheme and integrates non-maximum suppression together with an angle-decoding module into the network itself. With only 0.9 million parameters, LEADER achieves strong cross-domain robustness, improving the F1-score by 34% on NIST SD27, ranking first on 47% of test samples, and delivering inference in just 15 ms on a GPU, outperforming leading commercial software.
📝 Abstract
Minutiae extraction, a fundamental stage in fingerprint recognition, is increasingly shifting toward deep learning. However, truly end-to-end methods that eliminate separate preprocessing and postprocessing steps remain scarce. This paper introduces LEADER (Lightweight End-to-end Attention-gated Dual autoencodER), a neural network that maps raw fingerprint images to minutiae descriptors, including location, direction, and type. The proposed architecture integrates non-maximum suppression and angular decoding to enable complete end-to-end inference using only 0.9M parameters. It employs a novel "Castle-Moat-Rampart" ground-truth encoding and a dual-autoencoder structure, interconnected through an attention-gating mechanism. Experimental evaluations demonstrate state-of-the-art accuracy on plain fingerprints and robust cross-domain generalization to latent impressions. Specifically, LEADER attains a 34% higher F1-score on the NIST SD27 dataset compared to specialized latent minutiae extractors. Sample-level analysis on this challenging benchmark reveals an average rank of 2.07 among all compared methods, with LEADER securing the first-place position in 47% of the samples, more than doubling the frequency of the second-best extractor. The internal representations learned by the model align with established fingerprint domain features, such as segmentation masks, orientation fields, frequency maps, and skeletons. Inference requires 15 ms on GPU and 322 ms on CPU, outperforming leading commercial software in computational efficiency. The source code and pre-trained weights are publicly released to facilitate reproducibility.
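To make the integrated postprocessing concrete, here is a minimal, hedged sketch of what the two decoding stages the abstract names could look like: local-maximum non-maximum suppression over a minutiae score map, followed by angle decoding from (cos, sin) channels. The function name, threshold, and window size are illustrative assumptions, not the paper's actual implementation (which performs these steps inside the network).

```python
import numpy as np

def decode_minutiae(score_map, cos_map, sin_map, thresh=0.5, window=3):
    """Illustrative sketch (not the paper's code): keep each pixel that
    exceeds `thresh` and is the maximum of its local window, then decode
    its orientation from the (cos, sin) channels via arctan2."""
    h, w = score_map.shape
    r = window // 2
    minutiae = []
    for y in range(h):
        for x in range(w):
            s = score_map[y, x]
            if s < thresh:
                continue
            # local window, clipped at the image borders
            patch = score_map[max(0, y - r):y + r + 1,
                              max(0, x - r):x + r + 1]
            if s < patch.max():
                continue  # suppressed: a stronger response lies nearby
            theta = np.arctan2(sin_map[y, x], cos_map[y, x])
            minutiae.append((x, y, theta, s))
    return minutiae
```

Encoding the direction as a (cos, sin) pair and recovering it with `arctan2` is a common way to sidestep the 2π wrap-around that makes direct angle regression unstable; the paper's angle-decoding module may differ in detail.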