🤖 AI Summary
Diffusion-based policies face a real-time inference bottleneck in visuomotor control because generating each action requires multi-step denoising. To address this, we propose Falcon, a training-free acceleration algorithm. Falcon introduces a partial-denoising initialization mechanism: rather than starting every decision step from a standard Gaussian prior, it exploits temporal dependencies in action sequences and conditionally fuses historical denoising states with the current observation. Because no retraining is required, the method is plug-and-play and compatible with existing acceleration techniques. Evaluated across 46 simulated environments, Falcon achieves a 2× to 7× inference speedup with negligible performance degradation, which improves the applicability of diffusion policies to long-horizon planning and real-time decision-making.
📝 Abstract
Diffusion policies are widely adopted in complex visuomotor tasks for their ability to capture multimodal action distributions. However, the multiple sampling steps required for action generation significantly harm real-time inference efficiency, which limits their applicability in long-horizon tasks and real-time decision-making scenarios. Existing acceleration techniques reduce the number of sampling steps by approximating the original denoising process, but inevitably introduce unacceptable performance loss. Here we propose Falcon, which mitigates this trade-off and achieves further acceleration. The core insight is that visuomotor tasks exhibit sequential dependencies between actions at consecutive time steps. Falcon leverages this property to avoid denoising from a standard normal distribution at each decision step. Instead, it starts denoising from partially denoised actions derived from historical information, which significantly reduces the number of denoising steps, while incorporating current observations to keep action generation performance-preserving. Importantly, Falcon is a training-free algorithm that can be applied as a plug-in to further improve decision efficiency on top of existing acceleration techniques. We validated Falcon in 46 simulated environments, demonstrating a 2-7x speedup with negligible performance degradation, offering a promising direction for efficient visuomotor policy design.
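The warm-start idea described above can be illustrated with a minimal NumPy sketch. This is not Falcon's actual algorithm: the abstract does not specify how the partially denoised actions are selected or fused with observations, so everything here is an assumption made for illustration. In particular, `toy_denoiser` is a hypothetical stand-in for a trained policy network, the linear noise schedule, the shift-and-re-noise rule in `warm_start`, and the step counts (50 and 10) are all arbitrary choices, and a deterministic DDIM-style update is used as the sampler.

```python
import numpy as np


def make_alpha_bars(num_steps):
    # Hypothetical linear beta schedule (an assumption, not from the paper).
    # alpha_bars[t] is the cumulative signal fraction at noise level t;
    # index 0 corresponds to a clean (noise-free) action sequence.
    betas = np.linspace(1e-4, 0.2, num_steps)
    return np.concatenate([[1.0], np.cumprod(1.0 - betas)])


def toy_denoiser(x_t, t, obs):
    # Stand-in for a learned network that predicts the clean action sequence.
    # Here the "ideal" action at every horizon step is simply the observation.
    return np.broadcast_to(obs, x_t.shape)


def ddim_sample(x_start, start_step, obs, alpha_bars, denoiser):
    # Deterministic DDIM-style denoising from noise level start_step down to 0.
    # Returns the final action sequence and the number of denoiser calls.
    x = x_start
    calls = 0
    for t in range(start_step, 0, -1):
        x0_pred = denoiser(x, t, obs)
        calls += 1
        eps = (x - np.sqrt(alpha_bars[t]) * x0_pred) / np.sqrt(1.0 - alpha_bars[t])
        x = np.sqrt(alpha_bars[t - 1]) * x0_pred + np.sqrt(1.0 - alpha_bars[t - 1]) * eps
    return x, calls


def warm_start(prev_actions, start_step, alpha_bars, rng):
    # Assumed warm-start rule: shift the previous plan by one time step
    # (repeating its last action), then re-noise it to an intermediate level.
    shifted = np.concatenate([prev_actions[1:], prev_actions[-1:]], axis=0)
    noise = rng.standard_normal(shifted.shape)
    signal = np.sqrt(alpha_bars[start_step])
    sigma = np.sqrt(1.0 - alpha_bars[start_step])
    return signal * shifted + sigma * noise


rng = np.random.default_rng(0)
total_steps, horizon, action_dim = 50, 8, 2
alpha_bars = make_alpha_bars(total_steps)
obs = np.array([0.5, -0.3])

# Cold start: denoise from a standard normal sample over the full schedule.
cold_init = rng.standard_normal((horizon, action_dim))
cold_actions, cold_calls = ddim_sample(cold_init, total_steps, obs, alpha_bars, toy_denoiser)

# Warm start: begin from a partially noised version of the previous plan.
warm_step = 10
warm_init = warm_start(cold_actions, warm_step, alpha_bars, rng)
warm_actions, warm_calls = ddim_sample(warm_init, warm_step, obs, alpha_bars, toy_denoiser)

print(f"cold start: {cold_calls} denoiser calls, warm start: {warm_calls} denoiser calls")
```

In this toy setup the saving comes purely from entering the denoising chain at a lower noise level, so the number of network evaluations drops from `total_steps` to `warm_step`. With a real policy, how far down the chain one can safely start would depend on how well the historical actions match the current observation, which is the part this sketch does not model.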