FlowBotHD: History-Aware Diffuser Handling Ambiguities in Articulated Objects Manipulation

📅 2024-10-09

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

career value

196K/year

🤖 AI Summary

Existing methods struggle to reliably infer manipulation modes—such as push/pull and left/right side—for hinge-like articulated objects under occlusion, symmetry, or visual ambiguity. To address this, we propose the History-Aware Diffusion Network (HADN), the first approach to integrate diffusion models into multimodal articulated motion pattern modeling. HADN encodes temporal observations to incorporate historical context and formulates a conditional generative framework wherein the multimodal (i.e., multi-modal-peak) distribution of manipulation modes is explicitly modeled during iterative denoising. This design significantly enhances prediction stability and robustness under occlusion. Evaluated on standard articulated object manipulation benchmarks, HADN achieves state-of-the-art performance, improving average manipulation success rate by 12.7% over prior methods—with particularly pronounced gains under severe occlusion conditions.

Technology Category

Application Category

📝 Abstract

We introduce a novel approach for manipulating articulated objects which are visually ambiguous, such doors which are symmetric or which are heavily occluded. These ambiguities can cause uncertainty over different possible articulation modes: for instance, when the articulation direction (e.g. push, pull, slide) or location (e.g. left side, right side) of a fully closed door are uncertain, or when distinguishing features like the plane of the door are occluded due to the viewing angle. To tackle these challenges, we propose a history-aware diffusion network that can model multi-modal distributions over articulation modes for articulated objects; our method further uses observation history to distinguish between modes and make stable predictions under occlusions. Experiments and analysis demonstrate that our method achieves state-of-art performance on articulated object manipulation and dramatically improves performance for articulated objects containing visual ambiguities. Our project website is available at https://flowbothd.github.io/.

Problem

Research questions and friction points this paper is trying to address.

Operational Uncertainty

Complex Objects Manipulation

Occlusion Handling

Innovation

Methods, ideas, or system contributions that make the work stand out.

FlowBotHD

Multi-Modal Object Manipulation

Occlusion Robustness

🔎 Similar Papers

What Foundation Models can Bring for Robot Learning in Manipulation : A Survey