AI Summary
Glass surface detection is challenging because glass is transparent and lacks distinctive visual features; existing methods largely rely on boundary cues (e.g., window and door frames) or static reflection cues, which limits localization accuracy. To address this, we propose NFGlassNet, a novel method that exploits an intrinsic optical property of glass by explicitly capturing the appearance and disappearance of reflections across flash/no-flash image pairs. Our architecture introduces two modules: (1) a Reflection Contrast Mining Module (RCMM) that extracts reflections from the differences between the paired images, and (2) a Reflection Guided Attention Module (RGAM) that fuses reflection features with glass surface features to localize glass regions. Evaluated on our newly constructed dataset of 3.3K flash/no-flash image pairs with ground truth annotations, NFGlassNet outperforms state-of-the-art methods.
Abstract
Glass surfaces are ubiquitous in daily life and typically appear colorless, transparent, and featureless. These characteristics make glass surface detection a challenging computer vision task. Existing glass surface detection methods typically rely on boundary cues (e.g., window and door frames) or reflection cues to locate glass surfaces, but they fail to fully exploit the intrinsic properties of the glass itself for accurate localization. We observe that in most real-world scenes, the illumination intensity in front of a glass surface differs from that behind it, which causes the reflections visible on the glass surface to vary. Specifically, when the camera is on the brighter side of the glass and a flash is fired towards the darker side, existing reflections on the glass surface tend to disappear. Conversely, when the camera is on the darker side and a flash is fired towards the brighter side, distinct reflections appear on the glass surface. Based on this phenomenon, we propose NFGlassNet, a novel method for glass surface detection that leverages the reflection dynamics present in flash/no-flash imagery. The network comprises a Reflection Contrast Mining Module (RCMM) for extracting reflections and a Reflection Guided Attention Module (RGAM) for fusing reflection features with glass surface features for accurate glass surface detection. To train our network, we also construct a dataset of 3.3K no-flash and flash image pairs captured in various scenes, with corresponding ground truth annotations. Extensive experiments demonstrate that our method outperforms state-of-the-art methods. Our code, model, and dataset will be available upon acceptance of the manuscript.
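To make the flash/no-flash observation concrete, the following is a minimal sketch of a hand-crafted contrast map between an aligned image pair. It is not the RCMM or any part of NFGlassNet, whose internals the abstract does not specify. The function name `flash_contrast_map`, the median-based brightness compensation, and the assumption that both images are pixel-aligned floats in [0, 1] are illustrative choices only.

```python
import numpy as np


def flash_contrast_map(no_flash, flash, eps=1e-6):
    """Toy per-pixel contrast between an aligned no-flash/flash pair.

    Hypothetical helper for illustration; not the paper's RCMM.
    Inputs are float arrays in [0, 1] with shape (H, W) or (H, W, 3).
    Returns an (H, W) map scaled to [0, 1).
    """
    a = np.asarray(no_flash, dtype=np.float64)
    b = np.asarray(flash, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must be aligned and have the same shape")

    # Collapse colour channels to a single intensity channel.
    if a.ndim == 3:
        a = a.mean(axis=2)
        b = b.mean(axis=2)

    # Subtract each image's median intensity so that a uniform brightness
    # shift caused by the flash does not dominate the difference.
    a = a - np.median(a)
    b = b - np.median(b)

    # Pixels whose relative brightness changes between the two shots
    # (e.g. a reflection appearing or vanishing) get a large value.
    diff = np.abs(b - a)
    return diff / (diff.max() + eps)


if __name__ == "__main__":
    # Synthetic pair: the flash brightens the whole scene uniformly,
    # and a bright patch appears only in the flash image.
    no_flash = np.full((4, 4), 0.2)
    flash = np.full((4, 4), 0.5)
    flash[1:3, 1:3] = 0.9
    print(np.round(flash_contrast_map(no_flash, flash), 2))
```

In this synthetic example the uniform brightening cancels out and only the patch that changes between the two shots stands out. Real image pairs would also differ through shadows, specular highlights on non-glass objects, and misalignment, which is presumably why a learned module is used instead of a fixed difference.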