Two-Steps Diffusion Policy for Robotic Manipulation via Genetic Denoising

📅 2025-10-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Diffusion models achieve state-of-the-art performance in robot manipulation, yet their denoising mechanism, inherited from high-dimensional visual generation, fails to exploit the low-dimensional, structured nature of action distributions, so inference uses more neural function evaluations (NFE) than necessary. To address this, we propose genetic denoising, a population-based sampling strategy tailored for embodied intelligence that selects denoising trajectories with low out-of-distribution risk. Our method solves challenging tasks with only two NFE, substantially reducing computational overhead while matching or improving performance. Evaluated across 14 robotic manipulation tasks from D4RL and Robomimic in over 2 million evaluations, it achieves up to 20% performance gains over standard diffusion-based policies with significantly fewer inference steps. The core contribution is the coupling of population-based selection with diffusion denoising, which yields a denoising procedure designed explicitly for action distributions.

📝 Abstract

Diffusion models, such as diffusion policy, have achieved state-of-the-art results in robotic manipulation by imitating expert demonstrations. While diffusion models were originally developed for vision tasks like image and video generation, many of their inference strategies have been directly transferred to control domains without adaptation. In this work, we show that by tailoring the denoising process to the specific characteristics of embodied AI tasks, particularly the structured, low-dimensional nature of action distributions, diffusion policies can operate effectively with as few as 5 neural function evaluations (NFE). Building on this insight, we propose a population-based sampling strategy, genetic denoising, which enhances both performance and stability by selecting denoising trajectories with low out-of-distribution risk. Our method solves challenging tasks with only 2 NFE while improving or matching performance. We evaluate our approach across 14 robotic manipulation tasks from D4RL and Robomimic, spanning multiple action horizons and inference budgets. In over 2 million evaluations, our method consistently outperforms standard diffusion-based policies, achieving up to 20% performance gains with significantly fewer inference steps.
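
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what a population-based, selection-driven denoising loop could look like, not the authors' implementation. Everything named here is hypothetical: `toy_denoiser` stands in for a trained noise-prediction network, `ood_score` uses the norm of the predicted noise as an assumed proxy for out-of-distribution risk, and the update rule, `keep_frac`, `step_size`, and `mutation_scale` are illustrative choices. The sketch also assumes that one batched network call over the whole population counts as a single NFE.

```python
import numpy as np


def toy_denoiser(x, t):
    """Hypothetical stand-in for a trained noise-prediction network.

    It treats the offset from a fixed target action as the 'noise'.
    The time argument t is accepted but unused in this toy model.
    """
    target = np.array([0.5, -0.25])
    return x - target


def ood_score(eps_pred):
    """Assumed out-of-distribution proxy: norm of the predicted noise."""
    return np.linalg.norm(eps_pred, axis=-1)


def genetic_denoise(denoiser, pop_size=16, action_dim=2, n_steps=2,
                    keep_frac=0.5, step_size=0.9, mutation_scale=0.05,
                    seed=0):
    """Denoise a population of candidate actions with selection each step."""
    rng = np.random.default_rng(seed)
    pop = rng.standard_normal((pop_size, action_dim))  # start from noise
    nfe = 0
    for step in range(n_steps):
        t = 1.0 - step / n_steps
        eps = denoiser(pop, t)  # one batched evaluation for all candidates
        nfe += 1
        # Selection: keep the candidates with the lowest assumed OOD score.
        n_keep = max(1, int(keep_frac * pop_size))
        elite = np.argsort(ood_score(eps))[:n_keep]
        # Reproduction: resample the population from the elite set.
        parents = rng.choice(elite, size=pop_size)
        # Denoising update applied to the selected parents.
        pop = pop[parents] - step_size * eps[parents]
        # Mutation: small perturbation on all but the final step.
        if step < n_steps - 1:
            pop = pop + mutation_scale * rng.standard_normal(pop.shape)
    return pop, nfe


actions, nfe = genetic_denoise(toy_denoiser)
print("NFE used:", nfe)
print("first candidate action:", actions[0])
```

With `n_steps=2` the loop makes exactly two network calls, mirroring the 2 NFE budget the abstract reports; how the paper actually scores, selects, and perturbs trajectories may differ from this sketch.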

Problem

Research questions and friction points this paper is trying to address.

Optimizing diffusion models for robotic manipulation efficiency

Reducing neural evaluations in action distribution denoising

Enhancing policy performance with genetic sampling techniques

Innovation

Methods, ideas, or system contributions that make the work stand out.

Genetic denoising enhances diffusion policy performance

Method reduces inference to two neural function evaluations

Tailored denoising for robotic action distributions

🔎 Similar Papers

ManiCM: Real-time 3D Diffusion Policy via Consistency Model for Robotic Manipulation