Fast Visuomotor Policies via Partial Denoising

📅 2025-03-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Diffusion-based policies suffer from real-time inference bottlenecks in visuomotor control due to multi-step denoising. To address this, we propose Falcon, a training-free acceleration algorithm. Falcon introduces a partial-denoising initialization mechanism that relaxes the standard Gaussian-prior constraint by exploiting temporal dependencies in action sequences: it conditionally fuses historical denoising states with current observations, enabling inference acceleration without retraining. The method is plug-and-play and fully compatible with existing acceleration techniques. Evaluated across 46 simulated environments, Falcon achieves a 2×–7× inference speedup with negligible performance degradation, substantially improving applicability to long-horizon planning and real-time decision-making scenarios.

📝 Abstract
Diffusion policies are widely adopted in complex visuomotor tasks for their ability to capture multimodal action distributions. However, the multiple sampling steps required for action generation significantly harm real-time inference efficiency, which limits their applicability in long-horizon tasks and real-time decision-making scenarios. Existing acceleration techniques reduce sampling steps by approximating the original denoising process but inevitably introduce unacceptable performance loss. Here we propose Falcon, which mitigates this trade-off and achieves further acceleration. The core insight is that visuomotor tasks exhibit sequential dependencies between actions at consecutive time steps. Falcon leverages this property to avoid denoising from a standard normal distribution at each decision step. Instead, it starts denoising from partially denoised actions derived from historical information, significantly reducing the number of denoising steps, while incorporating current observations to achieve performance-preserving acceleration of action generation. Importantly, Falcon is a training-free algorithm that can be applied as a plug-in to further improve decision efficiency on top of existing acceleration techniques. We validated Falcon in 46 simulated environments, demonstrating a 2–7× speedup with negligible performance degradation, offering a promising direction for efficient visuomotor policy design.
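The core idea above can be sketched in a few lines: instead of sampling the initial latent from N(0, I) and running all T reverse steps, re-noise the previous action sequence only up to an intermediate timestep and denoise from there, conditioned on the current observation. This is a minimal toy illustration under assumed names (`denoise_step`, `falcon_generate`, `t_start`, the linear noise schedule); it is not the paper's actual implementation or API.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 50        # full number of denoising steps a vanilla policy would run
t_start = 10  # intermediate step reached by reusing the previous action

# Toy cumulative noise schedule (stand-in for a trained model's alpha-bars).
alpha_bar = np.linspace(1.0, 0.01, T + 1)

def denoise_step(x, t, obs):
    # Stand-in for one reverse-diffusion step of a trained policy network;
    # here it simply nudges the action toward a target implied by obs.
    return x + (obs - x) / (t + 1)

def falcon_generate(prev_action, obs):
    # Partial-denoising initialization: forward-noise the previous action
    # sequence only to timestep t_start, rather than sampling x_T ~ N(0, I).
    noise = rng.standard_normal(prev_action.shape)
    x = (np.sqrt(alpha_bar[t_start]) * prev_action
         + np.sqrt(1.0 - alpha_bar[t_start]) * noise)
    # Run only t_start reverse steps instead of T (a T / t_start speedup).
    for t in range(t_start, 0, -1):
        x = denoise_step(x, t, obs)
    return x

obs = np.full(4, 0.5)          # current observation (toy target)
prev_action = obs + 0.3        # action sequence from the previous step
out = falcon_generate(prev_action, obs)
```

Here only 10 of the 50 reverse steps run, which is the source of the claimed speedup; the current observation still conditions every step, which is how the method preserves reactivity to new inputs.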
Problem

Research questions and friction points this paper is trying to address.

Improving real-time inference efficiency in visuomotor tasks.
Reducing denoising steps without sacrificing performance.
Enabling faster decision-making in long-horizon scenarios.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Partial denoising from historical actions
Training-free plug-in acceleration algorithm
Reduces denoising steps with minimal performance loss
Haojun Chen
Peking University
Reinforcement Learning · Embodied AI
Minghao Liu
School of Computer Science, Peking University
Xiaojian Ma
Xiaojian Ma
Computer Vision · Machine Learning · Generative Modeling · Reinforcement Learning
Zailin Ma
School of Mathematical Sciences, Peking University
Huimin Wu
National Key Laboratory of General Artificial Intelligence, BIGAI
Chengdong Ma
Peking University
Reinforcement Learning · Multi-Agent Systems
Yuanpei Chen
South China University of Technology
Robotics
Yifan Zhong
Peking University
VLA Models · Dexterous Manipulation · Reinforcement Learning
Mingzhi Wang
Institute for Artificial Intelligence, Peking University
Qing Li
National Key Laboratory of General Artificial Intelligence, BIGAI
Yaodong Yang
Institute for Artificial Intelligence, Peking University