A Novel Task-Driven Diffusion-Based Policy with Affordance Learning for Generalizable Manipulation of Articulated Objects

📅 2025-09-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Weak cross-category generalization and the separation of task understanding from interaction localization hinder dexterous manipulation of articulated objects. To address this, we propose DART, a framework that combines linear temporal logic (LTL) for parsing task semantics, affordance learning for locating interaction points, and a diffusion-based policy whose actions are refined by an optimization method based on interaction data. By bringing task logic, affordance, and policy generation together, DART aims to improve transfer reasoning, success rates, and robustness on unseen categories of articulated objects. The reported experiments show DART outperforming most existing methods in manipulation ability, generalization, transfer reasoning, and robustness.

📝 Abstract

Despite recent advances in dexterous manipulation, the manipulation of articulated objects and generalization across different categories remain significant challenges. To address these issues, we introduce DART, a novel framework that enhances a diffusion-based policy with affordance learning and linear temporal logic (LTL) representations to improve the learning efficiency and generalizability of dexterous manipulation of articulated objects. Specifically, DART leverages LTL to understand task semantics and affordance learning to identify optimal interaction points. The diffusion-based policy then generalizes these interactions across various categories. Additionally, we exploit an optimization method based on interaction data to refine actions, overcoming the limitations of traditional diffusion policies that typically rely on offline reinforcement learning or learning from demonstrations. Experimental results demonstrate that DART outperforms most existing methods in manipulation ability, generalization performance, transfer reasoning, and robustness. For more information, visit our project website at: https://sites.google.com/view/dart0257/.
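
To make the role of LTL in the abstract more concrete, here is a minimal sketch of how a sequenced task could be written as an LTL formula and checked against a finite trace of observations. The abstract does not give DART's actual formulas or propositions, so the proposition names (near_handle, grasped, door_open) and the helper functions below are hypothetical and only illustrate the general idea of a nested "eventually" specification, F(p1 ∧ F(p2 ∧ ...)).

```python
from typing import Callable, List, Set

# One time step of the trace: the set of atomic propositions that hold.
State = Set[str]


def eventually(pred: Callable[[State], bool], trace: List[State], start: int = 0) -> int:
    """Return the first index >= start at which pred holds, or -1 if none."""
    for i in range(start, len(trace)):
        if pred(trace[i]):
            return i
    return -1


def satisfies_sequence(trace: List[State], props: List[str]) -> bool:
    """Check the nested formula F(p1 & F(p2 & ...)) on a finite trace.

    Each proposition must hold at some step at or after the step where
    the previous one held. The names in `props` are illustrative only.
    """
    idx = 0
    for p in props:
        idx = eventually(lambda s, p=p: p in s, trace, idx)
        if idx < 0:
            return False
    return True


# Hypothetical trace for a "grasp the handle, then open the door" task.
trace = [{"near_handle"}, {"grasped"}, {"door_open"}]
in_order = satisfies_sequence(trace, ["grasped", "door_open"])
out_of_order = satisfies_sequence(trace, ["door_open", "grasped"])
```

In this toy trace the ordered specification is satisfied, while the reversed one is not, because no grasp occurs at or after the step where the door is open. A learned policy could use such a check as a task-progress signal, though how DART consumes its LTL representation is not described in this summary.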

Problem

Research questions and friction points this paper is trying to address.

Generalizing manipulation across articulated object categories

Improving learning efficiency with affordance and LTL

Overcoming the reliance of traditional diffusion policies on offline reinforcement learning or learning from demonstrations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion-based policy with affordance learning

LTL representations for task semantics

Optimization method refining actions from interaction data
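
As a rough illustration of what "identifying optimal interaction points" can mean in affordance learning, the sketch below picks the point with the highest affordance score from a small point cloud. It assumes that some learned model has already produced one score per point; that model, the function name select_interaction_point, and the example values are all hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def select_interaction_point(
    points: Sequence[Point], scores: Sequence[float]
) -> Tuple[Point, int]:
    """Return the point with the highest affordance score and its index.

    `scores` is assumed to come from a learned affordance predictor
    (not shown here); higher means a more promising contact location.
    """
    if not points or len(points) != len(scores):
        raise ValueError("points and scores must be non-empty and equally long")
    best = max(range(len(points)), key=lambda i: scores[i])
    return points[best], best


# Hypothetical three-point cloud around a drawer handle.
cloud = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.2), (0.3, 0.1, 0.2)]
affordance = [0.05, 0.90, 0.40]
contact, contact_index = select_interaction_point(cloud, affordance)
```

Here the second point is selected because it carries the highest score. In a full pipeline, a point chosen this way could condition the diffusion-based policy, but the exact interface between the two components is not specified in this summary.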

🔎 Similar Papers

What Foundation Models can Bring for Robot Learning in Manipulation : A Survey