🤖 AI Summary
This work addresses the significant performance degradation of ear biometrics in unconstrained environments caused by occlusion from earrings. We propose, for the first time, an accessory-aware diffusion inpainting model that leverages automatically generated earring masks to reconstruct occluded regions, producing anatomically plausible and geometrically consistent full ear images as a preprocessing step prior to recognition. The method effectively restores critical structures such as the helix and antihelix, substantially improving the identification accuracy of subsequent Vision Transformer (ViT) models. Extensive experiments across multiple benchmark datasets demonstrate that the proposed preprocessing strategy consistently enhances recognition robustness across various ViT architectures and patch sizes.
📝 Abstract
Ear occlusions (arising from the presence of ear accessories such as earrings and earphones) can negatively impact performance in ear-based biometric recognition systems, especially in unconstrained imaging circumstances. In this study, we assess the effectiveness of a diffusion-based ear inpainting technique as a pre-processing aid to mitigate the issues of ear accessory occlusions in transformer-based ear recognition systems. Given an input ear image and an automatically derived accessory mask, the inpainting model reconstructs clean and anatomically plausible ear regions by synthesizing missing pixels while preserving local geometric coherence along key ear structures, including the helix, antihelix, concha, and lobule. We evaluate the effectiveness of this pre-processing aid in transformer-based recognition systems for several vision transformer models and different patch sizes for a range of benchmark datasets. Experiments show that diffusion-based inpainting can be a useful pre-processing aid to alleviate ear accessory occlusions to improve overall recognition performance.