🤖 AI Summary
This work proposes a modular, uncertainty-aware framework for pose estimation that substantially improves robustness in complex environments, where conventional methods suffer from multiple error sources, high computational cost, and poor reliability. By introducing an error provenance mechanism into the pose estimation pipeline for the first time, the approach decomposes the task into three sequential stages: failure detection, error attribution, and targeted recovery, activating lightweight corrective strategies only when necessary. Built on an efficient Iterative Closest Point (ICP) estimator, the method combines uncertainty modeling with interpretable error attribution to achieve substantial robustness gains in real-world robotic grasping tasks, matching the performance of large foundation models while requiring less computation and offering faster inference.
📝 Abstract
Robust estimation of object poses in robotic manipulation is often addressed with general-purpose foundation estimators that attempt to handle diverse error sources within a single model. Yet these estimators struggle under environmental uncertainty while requiring long inference times and heavy computation. In contrast, we propose a modular, uncertainty-aware framework that attributes pose estimation errors to specific error sources and applies targeted mitigation strategies only when necessary. Instantiating the framework with Iterative Closest Point (ICP), a simple and lightweight pose estimator, we apply it to real-world robotic grasping tasks. By decomposing pose estimation into failure detection, error attribution, and targeted recovery, we significantly improve the robustness of ICP and achieve performance competitive with foundation models while relying on a substantially simpler and faster pose estimator.
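The abstract does not specify the implementation, but the three-stage idea (run a cheap estimator, detect failure, and trigger recovery only when needed) can be sketched in plain NumPy. The sketch below uses a minimal point-to-point ICP as the lightweight estimator; the residual threshold, the attribution rule (blaming poor initialization), and the centroid-alignment recovery step are all illustrative assumptions, not the paper's actual mechanisms.

```python
import numpy as np

def best_fit_transform(A, B):
    # Least-squares rigid transform (Kabsch) mapping point set A onto B.
    cA, cB = A.mean(0), B.mean(0)
    H = (A - cA).T @ (B - cB)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cB - R @ cA

def icp(src, dst, iters=30):
    # Plain point-to-point ICP with brute-force nearest neighbours.
    R_acc, t_acc = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        d = np.linalg.norm(cur[:, None] - dst[None], axis=2)
        R, t = best_fit_transform(cur, dst[d.argmin(1)])
        cur = cur @ R.T + t
        R_acc, t_acc = R @ R_acc, R @ t_acc + t
    d = np.linalg.norm(cur[:, None] - dst[None], axis=2)
    return R_acc, t_acc, d.min(1).mean()  # mean residual = failure signal

def estimate_pose(src, dst, fail_thresh=1e-3):
    # Stage 1: lightweight estimator + failure detection via residual.
    R, t, res = icp(src, dst)
    if res <= fail_thresh:
        return R, t, "ok"
    # Stage 2 (hypothetical attribution): blame a bad initialisation.
    # Stage 3 (targeted recovery): coarse centroid alignment, then re-run.
    shift = dst.mean(0) - src.mean(0)
    R2, t2, res2 = icp(src + shift, dst)
    t2 = t2 + R2 @ shift  # fold the centroid shift into the translation
    return (R2, t2, "recovered") if res2 < res else (R, t, "failed")
```

The corrective stage runs only when the residual exceeds the threshold, so the common case pays just the cost of one ICP run, mirroring the "targeted mitigation only when necessary" design.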