🤖 AI Summary
Multi-robot visual relative localization suffers from dependence on hardware identifiers, sensitivity to occlusion, and catastrophic failure when visual matches are wrong. This paper proposes Mr. Virgil, a marker-free, occlusion-robust end-to-end learning framework that couples a graph neural network (GNN) with differentiable pose graph optimization (PGO). The GNN front-end associates UWB ranging measurements with visual detections, and the PGO back-end estimates the relative poses. The framework supports decentralized deployment and removes the reliance on identity-encoded hardware and delicate manual tuning. Its key innovation is unifying GNN-based matching and differentiable PGO in a single trainable pipeline, in which the front-end outputs three quantities: matching results, initial position predictions, and uncertainty estimates. Evaluations in simulation and real-world experiments, with varying robot counts and with and without occlusion, show improved accuracy and robustness over conventional methods. The implementation is publicly available.
📝 Abstract
Ultra-wideband (UWB)-vision fusion localization is widely used for multi-agent relative localization. Because matching robots to visual detections is difficult, existing methods depend heavily on identity-encoded hardware or delicately tuned algorithms. Overconfident yet erroneous matches can cause irreversible damage to the localization system. To address this issue, we introduce Mr. Virgil, an end-to-end learning framework for multi-robot visual-range relative localization, consisting of a graph neural network for data association between UWB ranging measurements and visual detections, and a differentiable pose graph optimization (PGO) back-end. The graph-based front-end supplies robust matching results, accurate initial position predictions, and credible uncertainty estimates, which are then passed to the PGO back-end to improve the accuracy of the final pose estimation. Additionally, a decentralized system is implemented for real-world applications. Experiments across varying numbers of robots, simulated and real-world settings, and occluded and non-occluded conditions demonstrate the framework's stability and accuracy compared with conventional methods. Our code is available at: https://github.com/HiOnes/Mr-Virgil.
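To make the matching problem concrete, the following is a minimal sketch of a hand-written, non-learned association baseline; it is not the Mr. Virgil method and is not taken from the released code. It assumes each UWB range carries a robot ID, each visual detection is anonymous and provides a unit bearing vector plus a rough depth, and the number of detections does not exceed the number of ranged robots. The function names `associate` and `fuse` and the data layout are hypothetical.

```python
import itertools
import math


def associate(uwb_ranges, detections):
    """Match anonymous visual detections to UWB robot IDs by range consistency.

    uwb_ranges: dict mapping robot ID -> UWB range in metres (assumed format).
    detections: list of (unit bearing vector, rough visual depth in metres).
    Returns a dict mapping robot ID -> detection index.
    """
    ids = list(uwb_ranges)
    if len(detections) > len(ids):
        raise ValueError("more detections than ranged robots")
    best_cost, best_perm = math.inf, ()
    # Brute-force search over assignments; only practical for small teams.
    for perm in itertools.permutations(ids, len(detections)):
        cost = sum(abs(uwb_ranges[rid] - depth)
                   for rid, (_, depth) in zip(perm, detections))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return {rid: idx for idx, rid in enumerate(best_perm)}


def fuse(uwb_ranges, detections, matches):
    """Combine the visual bearing with the UWB range for each matched robot."""
    positions = {}
    for rid, idx in matches.items():
        bearing, _ = detections[idx]
        positions[rid] = tuple(uwb_ranges[rid] * c for c in bearing)
    return positions


# Three ranged robots, but only two are visible (robot "C" is occluded).
uwb_ranges = {"A": 2.0, "B": 5.0, "C": 3.5}
detections = [((0.0, 1.0, 0.0), 4.6), ((1.0, 0.0, 0.0), 2.3)]
matches = associate(uwb_ranges, detections)
positions = fuse(uwb_ranges, detections, matches)
print(matches)    # {'B': 0, 'A': 1}
print(positions)  # {'B': (0.0, 5.0, 0.0), 'A': (2.0, 0.0, 0.0)}
```

A baseline like this commits to a single hard assignment with no notion of confidence, so one wrong match feeds directly into the position estimate. That is the failure mode the abstract attributes to overconfident matches, and it motivates a front-end that also reports uncertainty to the PGO back-end.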