AI Summary
Quantifying uncertainty in monocular 6D object pose estimation is critical for robust robotic control and planning, yet existing methods often rely on strong distributional assumptions. This paper proposes SLUE (S-Lemma Uncertainty Estimation), a distribution-free method that needs no initial guess of the bound's shape or size and produces an ellipsoidal uncertainty bound guaranteed to contain the true pose with high probability. SLUE takes high-probability bounds on the pixel noise of 2D semantic keypoint detections, which induce a nonconvex set of pose constraints, and solves a convex relaxation of the minimum-volume bounding ellipsoid problem inspired by the S-lemma. A sum-of-squares (SOS) relaxation hierarchy then tightens the bound and is guaranteed to converge to the minimum-volume ellipsoid for the given keypoint constraints. Evaluated on two pose estimation datasets and a real-world drone tracking scenario, SLUE yields substantially smaller translation bounds than prior work and competitive orientation bounds. The resulting bound can be projected to independent translation and axis-angle orientation bounds.
Abstract
Quantifying the uncertainty of an object's pose estimate is essential for robust control and planning. Although pose estimation is a well-studied robotics problem, attaching statistically rigorous uncertainty is not well understood without strict distributional assumptions. We develop distribution-free pose uncertainty bounds about a given pose estimate in the monocular setting. Our pose uncertainty only requires high-probability noise bounds on pixel detections of 2D semantic keypoints on a known object. This noise model induces an implicit, non-convex set of pose uncertainty constraints. Our key contribution is SLUE (S-Lemma Uncertainty Estimation), a convex program to reduce this set to a single ellipsoidal uncertainty bound that is guaranteed to contain the true object pose with high probability. SLUE solves a relaxation of the minimum-volume bounding ellipsoid problem inspired by the celebrated S-lemma, and it requires no initial guess of the bound's shape or size. For tighter uncertainty bounds at the same confidence, we extend SLUE to a sum-of-squares relaxation hierarchy which is guaranteed to converge to the minimum-volume ellipsoidal uncertainty bound for a given set of keypoint constraints. We show this pose uncertainty bound can easily be projected to independent translation and axis-angle orientation bounds. We evaluate SLUE on two pose estimation datasets and a real-world drone tracking scenario. Compared to prior work, SLUE generates substantially smaller translation bounds and competitive orientation bounds. We release code at https://github.com/MIT-SPARK/PoseUncertaintySets.
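To make the keypoint noise model concrete, the sketch below tests whether a candidate pose lies in the implicit constraint set that bounded pixel noise induces. It is an illustration under stated assumptions, not the released implementation: it assumes a pinhole camera, a Euclidean (L2) per-keypoint pixel bound, and a front-of-camera check, and the names `project`, `pose_in_constraint_set`, `intrinsics`, and `eps` are hypothetical. The paper's exact noise norm may differ. SLUE itself goes further and encloses this set in an ellipsoid via a convex program, which is not shown here.

```python
import math

def project(intrinsics, R, t, point):
    """Pinhole projection of one object-frame 3D point; None if behind the camera."""
    fx, fy, cx, cy = intrinsics
    # Object frame -> camera frame: c = R @ point + t
    x, y, z = (sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3))
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)

def pose_in_constraint_set(intrinsics, R, t, keypoints, detections, eps):
    """True if every keypoint reprojects within eps[i] pixels of its detection.

    Assumption: pixel noise is bounded in the L2 norm, one bound per keypoint.
    """
    for point, (u_det, v_det), bound in zip(keypoints, detections, eps):
        pixel = project(intrinsics, R, t, point)
        if pixel is None:
            return False
        if math.hypot(pixel[0] - u_det, pixel[1] - v_det) > bound:
            return False
    return True

# Hypothetical setup: four coplanar keypoints on an object 2 m in front of the camera.
intrinsics = (500.0, 500.0, 320.0, 240.0)  # fx, fy, cx, cy in pixels
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
keypoints = [(0.1, 0.1, 0.0), (-0.1, 0.1, 0.0), (-0.1, -0.1, 0.0), (0.1, -0.1, 0.0)]
true_t = (0.0, 0.0, 2.0)

# Simulated detections: true projections shifted by 1 pixel, inside the 2-pixel bound.
detections = [(u + 1.0, v) for u, v in
              (project(intrinsics, identity, true_t, p) for p in keypoints)]
eps = [2.0] * len(keypoints)

inside = pose_in_constraint_set(intrinsics, identity, true_t, keypoints, detections, eps)
# A 10 cm sideways shift moves each reprojection by about 25 pixels, violating the bound.
outside = pose_in_constraint_set(intrinsics, identity, (0.1, 0.0, 2.0),
                                 keypoints, detections, eps)
```

Because the projection divides by depth, the set of poses passing this test is non-convex in the pose variables, which is why a convex relaxation is needed to bound it with a single ellipsoid.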