🤖 AI Summary
Robots struggle to predict manipulation outcomes in visually ambiguous scenes; while generative models (e.g., diffusion models) are theoretically well suited to this setting, their practical performance is hampered by insufficient exploitation of historical interaction data. To address this, we propose a decoupled "generation–verification" framework: an unconditional diffusion model first samples multiple candidate actions, and a history-aware verifier that explicitly models past interaction sequences then evaluates, filters, and re-ranks these candidates. We theoretically prove that this verification mechanism improves the expected quality of selected actions. Our method integrates multimodal perception and online learning. Extensive experiments on simulated and real-world tasks, including articulated object manipulation, multimodal door opening, and uneven surface grasping, demonstrate significant improvements over state-of-the-art baselines. Results validate that history-guided verification is critical for robust manipulation under visual ambiguity.
📝 Abstract
We introduce a novel History-Aware VErifier (HAVE) to disambiguate uncertain scenarios online by leveraging past interactions. Robots frequently encounter visually ambiguous objects whose manipulation outcomes remain uncertain until the robot physically interacts with them. While generative models alone could theoretically adapt to such ambiguity, in practice they achieve suboptimal performance in ambiguous cases, even when conditioned on action history. To address this, we propose explicitly decoupling action generation from verification: we use an unconditional diffusion-based generator to propose multiple candidate actions and employ our history-aware verifier to select the most promising action by reasoning about past interactions. Through theoretical analysis, we demonstrate that employing a verifier significantly improves expected action quality. Empirical evaluations and analysis across multiple simulated and real-world environments, including articulated objects, multi-modal doors, and uneven object pick-up, confirm the effectiveness of our method and its improvements over baselines. Our project website is available at: https://liy1shu.github.io/HAVE_CoRL25/
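The generate-then-verify loop described above can be sketched as follows. This is a minimal toy illustration, not the paper's method: the "generator" here is random Gaussian sampling rather than a diffusion model, and the "verifier" is a hand-written distance-to-past-failures heuristic rather than a learned history-aware network. Only the overall structure (propose N candidates, score against interaction history, pick the best, update history online) mirrors the framework.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_candidate_actions(n_candidates, action_dim=2):
    """Stand-in for the unconditional generator: propose N candidate actions."""
    return rng.normal(size=(n_candidates, action_dim))

def verifier_score(action, history):
    """Stand-in for the history-aware verifier: score a candidate given past
    (action, success) interactions. Toy heuristic: prefer actions far from
    previously failed attempts."""
    failed = np.array([a for a, success in history if not success])
    if failed.size == 0:
        return 0.0  # no failures recorded yet; all candidates tie
    return float(np.min(np.linalg.norm(failed - action, axis=1)))

def select_action(history, n_candidates=16):
    """Generate-then-verify: sample candidates, re-rank by verifier score,
    and return the top-ranked action."""
    candidates = sample_candidate_actions(n_candidates)
    scores = [verifier_score(a, history) for a in candidates]
    return candidates[int(np.argmax(scores))]

# Online loop: each failed attempt enters the history, so subsequent
# selections are re-ranked away from known failures.
history = []
action = select_action(history)
history.append((action, False))        # suppose the first attempt failed
next_action = select_action(history)   # verifier now avoids that region
```

The key design choice shown is the decoupling: the generator never sees the history, so it keeps proposing diverse candidates, while all adaptation to past outcomes lives in the verifier's ranking.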