AI Summary
This work addresses the challenge of short-term 6D object pose tracking from monocular RGB images in dynamic scenes. We propose an Uncertainty-aware Keypoint Refinement Network (UKRN) and introduce OmniPose6D, the first large-scale synthetic dataset and benchmark specifically designed for dynamic-scene pose estimation. Our method integrates Bayesian neural networks to model keypoint localization uncertainty, enabling iterative, geometry-consistent refinement that significantly improves robustness under motion blur and occlusion. Key contributions include: (1) a novel synthetic data generation paradigm supporting high-speed motion and complex object interactions; and (2) a probabilistic keypoint optimization mechanism that overcomes accuracy bottlenecks in monocular pose estimation. Evaluated on real-world benchmarks, our approach reduces pose error by 18.7% over state-of-the-art methods, demonstrating the critical impact of high-fidelity synthetic data and explicit uncertainty modeling on dynamic pose tracking performance.
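To illustrate the kind of uncertainty-weighted, geometry-consistent refinement the summary describes, the sketch below runs Gauss-Newton pose refinement in which each keypoint's reprojection residual is weighted by the inverse of its predicted localization variance. This is a minimal illustration under assumed simplifications (pinhole camera, isotropic per-keypoint variance, numerical Jacobians); the function names and structure are hypothetical, not the paper's actual UKRN implementation.

```python
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of Nx3 model points X under pose (R, t)."""
    Xc = X @ R.T + t          # camera-frame points
    uv = Xc @ K.T             # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]

def rodrigues(w):
    """Axis-angle vector -> rotation matrix."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    Kx = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * Kx + (1 - np.cos(theta)) * (Kx @ Kx)

def refine_pose(K, X, uv_obs, sigma2, R, t, iters=20):
    """Gauss-Newton 6-DoF refinement; residuals weighted by 1/variance.

    sigma2: per-keypoint variance of the 2D detections (hypothetical
    stand-in for a network's predicted uncertainty).
    """
    for _ in range(iters):
        r = (project(K, R, t, X) - uv_obs).ravel()
        w = np.repeat(1.0 / sigma2, 2)          # weight both u and v residuals
        # Numerical Jacobian w.r.t. [rotation update omega, translation t]
        J = np.zeros((r.size, 6))
        eps = 1e-6
        for j in range(6):
            d = np.zeros(6); d[j] = eps
            R2 = rodrigues(d[:3]) @ R
            t2 = t + d[3:]
            r2 = (project(K, R2, t2, X) - uv_obs).ravel()
            J[:, j] = (r2 - r) / eps
        # Weighted normal equations (small damping for numerical safety)
        A = J.T @ (w[:, None] * J) + 1e-9 * np.eye(6)
        delta = np.linalg.solve(A, -J.T @ (w * r))
        R = rodrigues(delta[:3]) @ R
        t = t + delta[3:]
    return R, t
```

In a full pipeline, keypoints with high predicted variance (e.g., blurred or occluded ones) contribute little to the update, which is what makes the refinement robust to motion blur and occlusion.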
Abstract
To address the challenge of short-term object pose tracking in dynamic environments from monocular RGB input, we introduce OmniPose6D, a large-scale synthetic dataset crafted to mirror the diversity of real-world conditions. We additionally present a benchmarking framework for comprehensive comparison of pose tracking algorithms. We propose a pipeline featuring an uncertainty-aware keypoint refinement network that employs probabilistic modeling to refine pose estimates. Comparative evaluations demonstrate that our approach outperforms existing baselines on real datasets, underscoring the effectiveness of our synthetic dataset and refinement technique in enhancing tracking precision in dynamic contexts. Our contributions set a new precedent for the development and assessment of object pose tracking methodologies in complex scenes.