🤖 AI Summary
Camera pose estimation in object-centric scenes is highly susceptible to interference from background textures, particularly when the background looks similar across viewpoints. To exploit this, we propose an adversarial attack, the Kaleidoscopic Background Attack (KBA), which synthesizes structured adversarial backgrounds with multi-fold radial symmetry. Each background is a disc built from identical segments, so its appearance stays consistent across views. To improve attack efficacy, we introduce a projected orientation consistency loss that optimizes the segment textures. Experiments demonstrate that our method significantly degrades the accuracy of mainstream pose estimators, including PoseNet and EPnP+RANSAC, increasing mean rotational error by up to 3.2× on the ScanNet and 7Scenes benchmarks. This highlights the strong disruptive effect of background symmetry on pose estimation and exposes how vulnerable current methods are to adversarial background textures.
📝 Abstract
Camera pose estimation is a fundamental computer vision task that is essential for applications like visual localization and multi-view stereo reconstruction. In object-centric scenarios with sparse inputs, the accuracy of pose estimation can be significantly influenced by background textures that occupy major portions of the images across different viewpoints. In light of this, we introduce the Kaleidoscopic Background Attack (KBA), which uses identical segments to form discs with multi-fold radial symmetry. These discs maintain high similarity across different viewpoints, enabling effective attacks on pose estimation models even with natural texture segments. Additionally, we propose a projected orientation consistency loss to optimize the kaleidoscopic segments, which significantly improves attack effectiveness. Experimental results show that optimized adversarial kaleidoscopic backgrounds can effectively attack various camera pose estimation models.
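The disc construction described in the abstract, identical segments repeated with multi-fold radial symmetry, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `kaleidoscopic_disc`, the parameters `n_fold` and `size`, and the toy texture `example_segment` are all hypothetical, and the projected orientation consistency loss is not modeled here. The sketch only shows why such a disc looks the same after rotating by one segment angle.

```python
import numpy as np


def kaleidoscopic_disc(segment_fn, n_fold=8, size=256):
    """Repeat one angular segment n_fold times around a disc (illustrative only).

    segment_fn(r, u) returns an intensity for normalised radius r in [0, 1]
    and normalised in-segment angle u in [0, 1). Both names are assumptions
    made for this sketch.
    """
    # Pixel-centre coordinates with the origin at the disc centre.
    ys, xs = np.mgrid[0:size, 0:size]
    c = (size - 1) / 2.0
    dx, dy = xs - c, ys - c

    r = np.hypot(dx, dy) / c                   # 1.0 at the disc edge
    theta = np.arctan2(dy, dx) % (2 * np.pi)   # angle in [0, 2*pi)

    # Fold every angle into the first segment, so all segments are identical.
    wedge = 2 * np.pi / n_fold
    u = (theta % wedge) / wedge

    img = segment_fn(r, u)
    return np.where(r <= 1.0, img, 0.0)        # zero outside the disc


def example_segment(r, u):
    # Hypothetical smooth texture; it is periodic in u so segment seams match.
    return 0.5 + 0.5 * np.cos(2 * np.pi * u) * r


disc = kaleidoscopic_disc(example_segment, n_fold=4, size=256)
```

With `n_fold=4`, rotating `disc` by 90 degrees (for example with `np.rot90`) reproduces the same image up to floating-point error, which is the cross-view similarity the attack relies on. In the paper the segment content is a natural or optimized texture rather than an analytic function.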