🤖 AI Summary
Existing C++ bundle adjustment (BA) libraries—such as GTSAM, g2o, and Ceres—lack native interoperability with deep learning frameworks like PyTorch, hindering end-to-end differentiability and GPU acceleration. To address this, the authors propose a sparse BA framework natively supporting PyTorch's eager-mode execution, seamlessly integrated with PyPose. Their method combines SE(3) Lie group parameterization, CUDA-accelerated sparse linear algebra, automatic differentiation, and second-order differentiable Levenberg–Marquardt optimization, with built-in conjugate gradient and Cholesky solvers. The framework enables fully differentiable, GPU-accelerated joint optimization of camera poses and 3D points for applications such as SLAM, augmented reality, and photogrammetry. On GPU hardware it achieves average speedups of 18.5×, 22×, and 23× over GTSAM, g2o, and Ceres, respectively. This advancement significantly improves development, training, and debugging efficiency for hybrid deep learning–geometric systems.
📝 Abstract
Bundle adjustment (BA) is a critical technique in various robotic applications, such as simultaneous localization and mapping (SLAM), augmented reality (AR), and photogrammetry. BA optimizes parameters such as camera poses and 3D landmarks to align them with observations. With the growing importance of deep learning in perception systems, there is an increasing need to integrate BA with deep learning frameworks for enhanced reliability and performance. However, widely used C++-based BA frameworks, such as GTSAM, g$^2$o, and Ceres, lack native integration with modern deep learning libraries like PyTorch. This limitation affects their flexibility, adaptability, ease of debugging, and overall implementation efficiency. To address this gap, we introduce an eager-mode BA framework seamlessly integrated with PyPose, providing PyTorch-compatible interfaces with high efficiency. Our approach includes GPU-accelerated, differentiable, and sparse operations designed for second-order optimization, Lie group and Lie algebra operations, and linear solvers. Our eager-mode BA on GPU demonstrates substantial runtime efficiency, achieving an average speedup of 18.5$\times$, 22$\times$, and 23$\times$ compared to GTSAM, g$^2$o, and Ceres, respectively.
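The second-order optimization the abstract describes centers on Levenberg–Marquardt: repeatedly linearize the residuals, solve a damped normal-equation system (via Cholesky or conjugate gradient), and adapt the damping based on whether the step reduces the cost. A minimal NumPy sketch on a hypothetical one-dimensional curve-fitting problem (not the paper's sparse BA system over SE(3) poses and landmarks) illustrates the LM loop with a Cholesky solve:

```python
import numpy as np

# Toy Levenberg-Marquardt: fit y = exp(a*x) + b to data.
# This is an illustrative stand-in for the damped second-order
# solve used in BA; the real framework factors large sparse
# systems over camera poses and 3D points on the GPU.

def residual(params, x, y):
    a, b = params
    return np.exp(a * x) + b - y

def jacobian(params, x):
    a, _ = params
    # Columns: d r / d a, d r / d b
    return np.stack([x * np.exp(a * x), np.ones_like(x)], axis=1)

def lm_fit(x, y, params0, iters=50, lam=1e-3):
    params = np.asarray(params0, dtype=float)
    for _ in range(iters):
        r = residual(params, x, y)
        J = jacobian(params, x)
        g = J.T @ r
        H = J.T @ J + lam * np.eye(len(params))   # damped Gauss-Newton Hessian
        L = np.linalg.cholesky(H)                 # H = L L^T
        step = -np.linalg.solve(L.T, np.linalg.solve(L, g))
        trial = params + step
        if np.sum(residual(trial, x, y) ** 2) < np.sum(r ** 2):
            params, lam = trial, lam / 10         # accept step, relax damping
        else:
            lam *= 10                             # reject step, increase damping
    return params

x = np.linspace(0.0, 1.0, 50)
y = np.exp(0.5 * x) + 2.0                         # noiseless data: a=0.5, b=2.0
a, b = lm_fit(x, y, [0.0, 0.0])
```

In the paper's setting the same loop runs on PyTorch tensors, so every step is differentiable and gradients can flow from the optimized poses and points back into upstream network weights; the dense Cholesky solve here stands in for the framework's sparse Cholesky and conjugate-gradient backends.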