🤖 AI Summary
This paper addresses multi-view relative pose estimation when the vertical direction of each view is known (e.g., provided by an IMU). The vertical prior reduces the unknowns to two rotation angles and two translation vectors, which lowers the number of point correspondences required. The method introduces the first minimal (three-point) and linear closed-form (four-point) solutions for trifocal pose estimation. It combines trifocal tensor geometry, Gröbner basis algebraic solving, and linear least-squares refinement, achieving high accuracy while improving RANSAC robustness and computational efficiency. Experiments on synthetic data and the KITTI dataset show that the proposed approach outperforms competing methods in pose estimation accuracy.
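The efficiency argument can be made concrete with the standard RANSAC iteration bound, which grows quickly with the sample size. The sketch below is illustrative only: the function name, the 50% inlier ratio, the 99% confidence, and the five-point comparison solver are assumptions chosen for the example, not values taken from the paper.

```python
import math


def ransac_iterations(sample_size, inlier_ratio, confidence=0.99):
    """Samples needed so that, with probability `confidence`,
    at least one drawn sample contains only inliers."""
    # Probability that one random sample is outlier-free.
    p_good = inlier_ratio ** sample_size
    if p_good >= 1.0:
        return 1  # every correspondence is an inlier; one sample suffices
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_good))


if __name__ == "__main__":
    # Hypothetical comparison: 3-point and 4-point solvers versus a
    # generic solver that needs 5 correspondences per sample.
    for s in (3, 4, 5):
        print(f"sample size {s}: {ransac_iterations(s, inlier_ratio=0.5)} iterations")
```

Under these assumed settings, each additional correspondence in the minimal sample roughly doubles the number of iterations, which is why a three-point solver is attractive inside a RANSAC loop.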
📝 Abstract
This work presents two novel solvers for estimating the relative poses among views with known vertical directions. The vertical directions of camera views can be easily obtained using inertial measurement units (IMUs), which are widely used in autonomous vehicles, mobile phones, and unmanned aerial vehicles (UAVs). Given the known vertical directions, our algorithms only need to solve for two rotation angles and two translation vectors. We describe a linear closed-form solution that requires only four point correspondences across three views. We also propose a minimal solution with three point correspondences using the latest Gröbner basis solver. Since the proposed methods require fewer point correspondences, they can be efficiently applied within the RANSAC framework for outlier removal and pose estimation in visual odometry. The proposed methods have been tested on both synthetic data and real-world scenes from KITTI. The experimental results show that the accuracy of the estimated poses is superior to that of alternative methods.
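To see why a known vertical direction leaves a single rotation angle per relative pose, one can rotate each view so that its measured gravity direction lies on a common axis; the remaining rotation between two aligned views is then a rotation about that axis. The following is a minimal sketch under assumptions that are not from the paper: the vertical is taken to be the y-axis, points map as `x2 = R @ x1`, and all function names are hypothetical.

```python
import numpy as np

UP = np.array([0.0, 1.0, 0.0])  # assumed vertical axis of the aligned frame


def align_to_vertical(g):
    """Rotation matrix that maps the measured vertical direction g onto UP."""
    g = np.asarray(g, dtype=float)
    g = g / np.linalg.norm(g)
    c = float(np.dot(g, UP))
    if np.isclose(c, -1.0):
        # g points exactly opposite to UP: rotate 180 degrees about the x-axis.
        return np.diag([1.0, -1.0, -1.0])
    v = np.cross(g, UP)
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    # Rodrigues-style formula for the rotation taking g to UP.
    return np.eye(3) + vx + (vx @ vx) / (1.0 + c)


def rot_y(theta):
    """Rotation by angle theta (radians) about the vertical y-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])


def relative_rotation(g1, g2, theta):
    """Full rotation from view 1 to view 2; theta is the only unknown."""
    return align_to_vertical(g2).T @ rot_y(theta) @ align_to_vertical(g1)
```

With three views, the same construction applied to the pairs (1, 2) and (1, 3) gives the two unknown rotation angles mentioned in the abstract; the two translation vectors remain to be estimated from point correspondences.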