🤖 AI Summary
Existing instance shadow detection methods detect shadows and objects separately and pair them afterwards, which makes the pairing inaccurate. To address this, the paper proposes the first query-driven end-to-end framework for the task. It uses a dual-path associative Transformer decoder that jointly models geometric and semantic relationships between shadows and objects in a single forward pass, so that detection and shadow-object association are learned together. Because the association mechanism is learnable and trained end-to-end, error-prone heuristic matching in post-processing is no longer needed. On the SOBA benchmark, the method achieves a 4.2% mAP improvement over the prior state of the art (SSISv2). It also supports real-time inference on medium-resolution images (1024×512) at 32 FPS, balancing accuracy and efficiency.
📝 Abstract
Instance shadow detection is the task of detecting pairs of shadows and objects. Existing methods first detect shadows and objects independently and then associate them. This paper introduces FastInstShadow, a method that enhances detection accuracy through a query-based architecture featuring an association transformer decoder with two dual-path transformer decoders to assess relationships between shadows and objects during detection. Experimental results on the SOBA dataset show that the proposed method outperforms all existing methods across all criteria. The method also makes real-time processing feasible for moderate-resolution images, with better accuracy than SSISv2, the most accurate existing method. Our code is available at https://github.com/wlotkr/FastInstShadow.