🤖 AI Summary
Diffusion planners produce incoherent plans and scale poorly on long-horizon tasks because high-level sub-goal selection is decoupled from low-level trajectory generation. This paper proposes Coupled Hierarchical Diffusion (CHD), which samples high-level sub-goals and low-level trajectories jointly in a single diffusion process. A shared conditional classifier feeds low-level trajectory feedback back to the high level, so sub-goals self-correct during sampling. CHD thereby unifies high-level semantic goal reasoning and low-level motion trajectory modeling through backward gradient guidance and collaborative optimization. Evaluated on long-horizon tasks (maze navigation, tabletop manipulation, and household environments), CHD outperforms flat and conventional hierarchical diffusion baselines, with a reported average improvement of 23.6% in task success rate and 31.4% in trajectory consistency. The work addresses the scalability bottleneck of diffusion models in complex, long-horizon planning.
📝 Abstract
Diffusion-based planners have shown strong performance in short-horizon tasks but often fail in complex, long-horizon settings. We trace the failure to loose coupling between high-level (HL) sub-goal selection and low-level (LL) trajectory generation, which leads to incoherent plans and degraded performance. We propose Coupled Hierarchical Diffusion (CHD), a framework that models HL sub-goals and LL trajectories jointly within a unified diffusion process. A shared classifier passes LL feedback upstream so that sub-goals self-correct while sampling proceeds. This tight HL-LL coupling improves trajectory coherence and enables scalable long-horizon diffusion planning. Experiments across maze navigation, tabletop manipulation, and household environments show that CHD consistently outperforms both flat and hierarchical diffusion baselines.
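The coupling idea in the abstract, where low-level feedback flows upstream so that sub-goals self-correct during sampling, can be illustrated with a toy sketch. This is not the paper's implementation: the quadratic score below is a hypothetical stand-in for the shared classifier, annealed noisy gradient ascent stands in for the reverse diffusion process, states are 1-D scalars, and every name (`shared_score_grads`, `joint_sample`, `coupled`, `guidance`) is an assumption made for illustration.

```python
import random


def shared_score_grads(subgoal, traj, target, coupled=True):
    """Gradients of a hypothetical shared score
        J = -(traj[-1] - subgoal)^2 - (subgoal - target)^2
            - sum_t (traj[t+1] - traj[t])^2
    with respect to the sub-goal and every trajectory state."""
    end_gap = traj[-1] - subgoal
    # High-level preference: pull the sub-goal toward the task target.
    grad_subgoal = -2.0 * (subgoal - target)
    if coupled:
        # Low-level feedback: pull the sub-goal toward where the trajectory ends.
        grad_subgoal += 2.0 * end_gap
    grad_traj = [0.0] * len(traj)
    for t in range(len(traj) - 1):
        step = traj[t + 1] - traj[t]
        grad_traj[t] += 2.0 * step      # smoothness term
        grad_traj[t + 1] -= 2.0 * step
    grad_traj[-1] -= 2.0 * end_gap      # trajectory end is pulled toward the sub-goal
    return grad_subgoal, grad_traj


def joint_sample(start, target, horizon=8, steps=400, guidance=0.1,
                 coupled=True, seed=0):
    """Jointly refine a sub-goal and a trajectory from random noise.
    Toy stand-in for guided reverse diffusion (assumption, not the paper's sampler)."""
    rng = random.Random(seed)
    subgoal = rng.gauss(0.0, 1.0)
    traj = [rng.gauss(0.0, 1.0) for _ in range(horizon)]
    traj[0] = start
    for k in range(steps):
        noise = 0.05 * (1.0 - (k + 1) / steps)  # noise level annealed to zero
        g_sub, g_traj = shared_score_grads(subgoal, traj, target, coupled)
        subgoal += guidance * g_sub + noise * rng.gauss(0.0, 1.0)
        traj = [x + guidance * g + noise * rng.gauss(0.0, 1.0)
                for x, g in zip(traj, g_traj)]
        traj[0] = start  # the first state is fixed to the current position
    return subgoal, traj


if __name__ == "__main__":
    for mode in (True, False):
        sub, traj = joint_sample(0.0, 9.0, coupled=mode)
        print("coupled" if mode else "decoupled", round(sub, 2), round(traj[-1], 2))
```

With `coupled=False` the sub-goal follows only the high-level target and ignores what the trajectory can reach. With `coupled=True` the same score also pulls the sub-goal back toward the trajectory endpoint, which narrows the gap between the two. In a learned system the hand-written gradients would presumably come from a trained classifier and denoiser.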