Relative Pose Regression with Pose Auto-Encoders: Enhancing Accuracy and Data Efficiency for Retail Applications

📅 2025-08-12
🤖 AI Summary
To address the dual requirements of data efficiency and localization accuracy for camera pose estimation in retail environments, this paper proposes a Relative Pose Regression (RPR) method based on Camera Pose Auto-Encoders (PAEs). PAEs were previously introduced to embed visual and spatial scene priors into Absolute Pose Regression (APR); the authors extend them to relative pose estimation and use the resulting PAE-based RPR to refine APR predictions. The re-localization scheme requires no additional storage of images or poses. The key contributions are: (1) PAE-based RPR, validated against image-based RPR models of equivalent architecture; and (2) a refinement strategy that improves APR accuracy and remains competitive when trained on only 30% of the data. Evaluated on indoor benchmarks, the approach improves localization accuracy while reducing data collection costs for retail deployment.

📝 Abstract
Accurate camera localization is crucial for modern retail environments, enabling enhanced customer experiences, streamlined inventory management, and autonomous operations. While Absolute Pose Regression (APR) from a single image offers a promising solution, approaches that incorporate visual and spatial scene priors tend to achieve higher accuracy. Camera Pose Auto-Encoders (PAEs) have recently been introduced to embed such priors into APR. In this work, we extend PAEs to the task of Relative Pose Regression (RPR) and propose a novel re-localization scheme that refines APR predictions using PAE-based RPR, without requiring additional storage of images or pose data. We first introduce PAE-based RPR and establish its effectiveness by comparing it with image-based RPR models of equivalent architectures. We then demonstrate that our refinement strategy, driven by a PAE-based RPR, enhances APR localization accuracy on indoor benchmarks. Notably, our method is shown to achieve competitive performance even when trained with only 30% of the data, substantially reducing the data collection burden for retail deployment. Our code and pre-trained models are available at: https://github.com/yolish/camera-pose-auto-encoders
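The abstract does not spell out pose conventions, so the following is only a minimal geometric sketch of what an RPR target is and how a relative pose can be composed with a reference pose to recover an absolute one. It assumes camera-to-world poses given as a translation and a unit quaternion in (w, x, y, z) order. The function names are hypothetical and are not taken from the authors' repository, and in the paper the relative pose would be predicted by a network rather than computed from known poses as it is here.

```python
def quat_mul(a, b):
    """Hamilton product of two quaternions in (w, x, y, z) order."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def quat_conj(q):
    """Conjugate, which is the inverse for a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def rotate(q, v):
    """Rotate 3-vector v by unit quaternion q."""
    _, x, y, z = quat_mul(quat_mul(q, (0.0, *v)), quat_conj(q))
    return (x, y, z)


def relative_pose(t_ref, q_ref, t_qry, q_qry):
    """Pose of the query camera expressed in the reference camera frame.

    Assumed convention (not from the paper): poses are camera-to-world.
    """
    q_rel = quat_mul(quat_conj(q_ref), q_qry)
    diff = tuple(a - b for a, b in zip(t_qry, t_ref))
    t_rel = rotate(quat_conj(q_ref), diff)
    return t_rel, q_rel


def compose(t_ref, q_ref, t_rel, q_rel):
    """Recover an absolute pose from a reference pose and a relative pose."""
    q_abs = quat_mul(q_ref, q_rel)
    t_abs = tuple(a + b for a, b in zip(t_ref, rotate(q_ref, t_rel)))
    return t_abs, q_abs
```

Under these assumptions, `compose` inverts `relative_pose`: passing a reference pose together with the relative pose to a query returns the query's absolute pose, which is the geometric step any RPR-based refinement relies on.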
Problem

Research questions and friction points this paper is trying to address.

Enhancing camera localization accuracy for retail applications
Reducing data collection burden with efficient pose regression
Improving re-localization using auto-encoder priors without extra storage
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends Pose Auto-Encoders to Relative Pose Regression
Refines APR predictions using PAE-based RPR
Achieves competitive performance when trained with only 30% of the data