AI Summary
This work addresses inaccurate axis estimation in articulated object manipulation. We propose a closed-loop perception–action framework that excites object motion through small interactive movements and captures the resulting dynamic 3D point clouds. For the first time, SAM2 is integrated for online segmentation of the moving part; combined with mask-based feature extraction and PCA, this enables robust, real-time axis estimation that directly guides robotic actuation in a feedback loop. The key contributions are: (1) overcoming the limitations of conventional open-loop approaches by explicitly modeling dynamic interaction; (2) pioneering the use of SAM2 for online axis estimation of articulated structures; and (3) establishing an end-to-end closed-loop system that integrates perception, segmentation, geometric fitting, and control. Simulation results show improved manipulation accuracy and success rate over state-of-the-art baselines on axis-constrained tasks, including door handle rotation and drawer pulling.
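The PCA step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the input format (one array of masked moving-part points per frame), and the assumption that the joint type is already known are all hypothetical. The idea shown is that the centroid of a sliding part moves along a line, so the dominant principal direction approximates a prismatic axis, while the centroid of a rotating part stays in a plane, so the least-variance direction approximates the direction of a revolute axis.

```python
import numpy as np


def principal_directions(samples):
    """Return PCA directions (rows), ordered from largest to smallest variance."""
    centered = samples - samples.mean(axis=0)
    # Rows of vt are the right singular vectors, i.e. the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt


def estimate_axis_direction(masked_frames, joint_type):
    """Estimate a unit axis direction from per-frame points of the moving part.

    masked_frames: list of (N_i, 3) arrays, one per frame (hypothetical input
        format: the 3D points that fall inside the moving-part mask).
    joint_type: "prismatic" or "revolute" (assumed known in this sketch).
    The sign of the returned direction is arbitrary.
    """
    if len(masked_frames) < 3:
        raise ValueError("need at least three frames to fit a direction")
    # Track the moving part by its centroid in every frame.
    centroids = np.stack([frame.mean(axis=0) for frame in masked_frames])
    directions = principal_directions(centroids)
    if joint_type == "prismatic":
        # Centroids slide along a line: take the dominant direction.
        return directions[0]
    if joint_type == "revolute":
        # Centroids sweep an arc in a plane: take the plane normal.
        return directions[-1]
    raise ValueError(f"unknown joint type: {joint_type!r}")
```

This sketch recovers only the axis direction. Locating a revolute axis in space would additionally require fitting the arc's center, and the revolute case assumes the tracked centroid does not lie on the axis itself.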
Abstract
Articulated object manipulation requires precise interaction in which the object's axis of motion must be carefully considered. Previous research has employed interactive perception for manipulating articulated objects, but these approaches are typically open-loop and tend to overlook the interaction dynamics. To address this limitation, we present a closed-loop pipeline that integrates interactive perception with online axis estimation from segmented 3D point clouds. Our method can build on any interactive perception technique, using it to induce slight object movement and generate point cloud frames of the evolving dynamic scene. These point clouds are then segmented using Segment Anything Model 2 (SAM2), after which the moving part of the object is masked for accurate online estimation of its motion axis, which guides subsequent robotic actions. Our approach significantly enhances the precision and efficiency of manipulation tasks involving articulated objects. Experiments in simulated environments demonstrate that our method outperforms baseline approaches, especially in tasks that demand precise axis-based control. Project Page: https://hytidel.github.io/video-tracking-for-axis-estimation/.
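To make the closed-loop idea concrete, the following toy sketch alternates between acting along the current axis estimate and re-estimating the axis from the observed motion of the moving part. Everything here is an illustrative assumption rather than the paper's pipeline: `toy_drawer_response` is a hypothetical stand-in for a simulator, the points are assumed to be already segmented and masked, and the update rule simply uses the net centroid displacement.

```python
import numpy as np


def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)


def toy_drawer_response(points, pull_direction, true_axis, step=0.02):
    """Hypothetical simulator stand-in: the drawer can slide only along its
    true axis, so only the projection of the pull onto that axis moves it."""
    travel = step * float(np.dot(pull_direction, true_axis))
    return points + travel * true_axis


def closed_loop_pull(points, initial_guess, true_axis, num_steps=5):
    """Alternate between acting and re-estimating the axis from observed motion.

    true_axis is passed through to the toy simulator only; the controller
    itself never reads it.
    """
    estimate = unit(initial_guess)
    start_centroid = points.mean(axis=0)
    for _ in range(num_steps):
        # Act: pull along the current axis estimate.
        points = toy_drawer_response(points, estimate, true_axis)
        # Perceive: net displacement of the (already masked) moving part.
        displacement = points.mean(axis=0) - start_centroid
        # Update: feed the observed motion back into the next action.
        if np.linalg.norm(displacement) > 1e-9:
            estimate = unit(displacement)
    return estimate, points
```

In this toy setting, an initial guess that is 45 degrees off the true axis is corrected after the first observed motion, so later pulls are no longer wasted on a misaligned direction. An open-loop controller would keep pulling along the initial guess for every step.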