🤖 AI Summary
Existing methods for detecting 3D symmetry from a single image are trained and evaluated almost exclusively on synthetic or object-centric datasets, which limits their generalization to real-world scenes. Because monocular input is scale-ambiguous, many of them predict only the orientation of the symmetry plane and not its 3D location. This work proposes the first framework for detecting 3D-grounded reflectional symmetry in architectural landmarks from single in-the-wild RGB images. It introduces ArchSym, a large-scale architectural symmetry dataset constructed automatically from structure-from-motion (SfM) reconstructions via cross-view image matching, and a single-view detector that parameterizes symmetry planes as signed distance maps defined relative to predicted scene geometry, enabling 3D localization. Experiments validate the automatic annotation pipeline against geometry-based alternatives and show that the detector significantly outperforms state-of-the-art baselines on the ArchSym benchmark.
📝 Abstract
Symmetry detection is a fundamental problem in computer vision, and symmetries serve as powerful priors for downstream tasks. However, existing learning-based methods for detecting 3D symmetries from single images have been almost exclusively trained and evaluated on object-centric or synthetic datasets, and thus fail to generalize to real-world scenes. Furthermore, due to the inherent scale ambiguity of monocular inputs, which makes localizing the 3D plane an ill-posed problem, many existing works only predict the plane's orientation. In this paper, we address these limitations by presenting the first framework for detecting 3D-grounded reflectional symmetries from single, in-the-wild RGB images, focusing on architectural landmarks. We introduce two key innovations: (1) a scalable data annotation pipeline that automatically curates a large-scale dataset of architectural symmetries, ArchSym, from SfM reconstructions by leveraging cross-view image matching; and (2) building on this dataset, a single-view symmetry detector that accurately localizes symmetries in 3D by parameterizing them as signed distance maps defined relative to predicted scene geometry. We validate our symmetry annotation pipeline against geometry-based alternatives and demonstrate that our symmetry detector significantly outperforms state-of-the-art baselines on our new benchmark.
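To make the signed-distance-map idea concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names, the nested-list point map, and the plane convention (unit normal `n` and offset `d`, with the plane defined by `n · x + d = 0`) are all assumptions chosen for illustration. It shows three things: how a per-pixel signed distance map follows from a plane and a map of 3D points, how a point is reflected across that plane, and how the plane offset can be read back from a distance map when the normal is known.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def signed_distance_map(point_map, normal, offset):
    """Per-pixel signed distance to the plane n . x + offset = 0.

    point_map is an H x W nested list of (x, y, z) points, standing in
    for predicted scene geometry. offset is expressed for the unit normal.
    """
    n = normalize(normal)
    return [[dot(p, n) + offset for p in row] for row in point_map]


def reflect(point, normal, offset):
    """Mirror a 3D point across the plane n . x + offset = 0."""
    n = normalize(normal)
    s = dot(point, n) + offset
    return tuple(c - 2.0 * s * k for c, k in zip(point, n))


def offset_from_map(point_map, sdf_map, normal):
    """Recover the plane offset from a distance map, given the normal.

    Each pixel gives one estimate s - n . p; averaging them is a simple
    (hypothetical) way to aggregate noisy per-pixel predictions.
    """
    n = normalize(normal)
    estimates = [
        s - dot(p, n)
        for p_row, s_row in zip(point_map, sdf_map)
        for p, s in zip(p_row, s_row)
    ]
    return sum(estimates) / len(estimates)


# Toy 2 x 2 point map and the symmetry plane x = 0.5.
points = [[(-1.0, 0.0, 5.0), (1.0, 0.0, 5.0)],
          [(-2.0, 1.0, 6.0), (2.0, 1.0, 6.0)]]
plane_normal, plane_offset = (1.0, 0.0, 0.0), -0.5

sdf = signed_distance_map(points, plane_normal, plane_offset)
recovered_offset = offset_from_map(points, sdf, plane_normal)
```

One property worth noting in this sketch: if every point and the offset are multiplied by the same factor `k`, every signed distance is multiplied by `k` as well. The distance map therefore stays consistent with whatever scale the scene geometry is expressed in, which is one way to see why defining the plane relative to predicted geometry sidesteps the monocular scale ambiguity.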