DiSPo: Diffusion-SSM based Policy Learning for Coarse-to-Fine Action Discretization

📅 2024-09-23

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of learning fine-grained manipulation skills from coarse-grained demonstrations. We propose a granularity-adaptable, memory-efficient action generation framework. Methodologically, we integrate diffusion models with a state-space model (Mamba) and design a step-scaling mechanism that adjusts action precision dynamically in end-to-end imitation learning, without requiring fine-grained annotations or external interpolation models. Our contributions are threefold: (1) adjustable action generation granularity; (2) improved memory efficiency and inference efficiency, the latter by generating coarse motions in less critical regions; and (3) success rates up to 81% higher than baselines across three “coarse-to-fine” benchmark tasks. We further validate cross-scale action generalization on both simulated and real-world robotic manipulation tasks.

📝 Abstract

We aim to solve the problem of generating coarse-to-fine skills in learning from demonstrations (LfD). To scale precision, traditional LfD approaches often rely on extensive fine-grained demonstrations combined with external interpolation or dynamics models that have limited generalization capabilities. For memory-efficient learning and convenient granularity change, we propose a novel diffusion-SSM based policy (DiSPo) that learns from diverse coarse skills and produces actions at varying control scales by leveraging a state-space model, Mamba. Our evaluations show that the adoption of Mamba and the proposed step-scaling method enable DiSPo to outperform baselines on three coarse-to-fine benchmark tests, with up to 81% higher success rate. In addition, DiSPo improves inference efficiency by generating coarse motions in less critical regions. Finally, we demonstrate the scalability of actions on simulated and real-world manipulation tasks.

Problem

Research questions and friction points this paper is trying to address.

Generating coarse-to-fine skills from demonstrations efficiently

Overcoming limited generalization in traditional action discretization methods

Improving inference efficiency with scalable action granularity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Diffusion-SSM for policy learning

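Leverages Mamba, a state-space model, together with a step-scaling method to produce actions at varying control scales

Improves inference efficiency by generating coarse motions in less critical regions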
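
To give some intuition for why a state-space model is a convenient backbone for changing action granularity, the sketch below discretizes a scalar continuous-time SSM with a step size and shows that shrinking the step yields a finer rollout that still agrees with the coarse one at shared time points. This is an illustrative toy only, not DiSPo's implementation: the scalar dynamics, the fixed (rather than learned, input-dependent) step size, and all names such as `zoh_discretize`, `rollout`, and `refine` are assumptions made for the example.

```python
import math

def zoh_discretize(a, b, delta):
    """Zero-order-hold discretization of the scalar SSM h'(t) = a*h(t) + b*u(t).

    Assumes a != 0. Returns (a_bar, b_bar) for the recurrence
    h_k = a_bar * h_{k-1} + b_bar * u_k with step size `delta`.
    """
    a_bar = math.exp(delta * a)
    b_bar = (a_bar - 1.0) / a * b
    return a_bar, b_bar

def rollout(a, b, delta, inputs, h0=0.0):
    """Run the discretized recurrence and return the state after every step."""
    a_bar, b_bar = zoh_discretize(a, b, delta)
    h, states = h0, []
    for u in inputs:
        h = a_bar * h + b_bar * u
        states.append(h)
    return states

def refine(inputs, factor):
    """Hold each coarse input for `factor` fine steps (piecewise-constant)."""
    return [u for u in inputs for _ in range(factor)]

# Hypothetical values chosen only for illustration.
a, b = -0.5, 1.0
delta = 0.2
coarse_inputs = [1.0, 0.0, -1.0, 0.5]
factor = 4  # step-scale: 4x finer temporal resolution

coarse = rollout(a, b, delta, coarse_inputs)
fine = rollout(a, b, delta / factor, refine(coarse_inputs, factor))

# Every `factor`-th fine state lands on the same time point as a coarse state.
max_gap = max(abs(c - f) for c, f in zip(coarse, fine[factor - 1::factor]))
print(len(coarse), len(fine), max_gap)
```

Because zero-order hold is exact for piecewise-constant inputs, the fine rollout passes through the coarse states up to floating-point error while adding intermediate states in between. A practical policy would use a learned, multi-dimensional model, so this consistency should be read as motivation rather than a guarantee.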

🔎 Similar Papers

Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation