🤖 AI Summary
This work proposes a hybrid framework for efficient, safe, and precise motion planning in cluttered environments that integrates object-centric diffusion priors with model predictive control (MPC). The approach uses slot attention to construct a compact obstacle representation and conditions a diffusion transformer on this object-centric scene description, together with the system state and task, to generate initial trajectories. MPC then refines these trajectories by solving an optimal control problem that enforces rigid-body dynamics and signed-distance collision constraints, producing feasible motions in real time. On benchmark tasks, the method achieves markedly higher success rates and lower latency than sampling-based planners or either component alone. Real-world validation on a torque-controlled Panda manipulator further confirms safe, reliable, real-time obstacle avoidance.
📝 Abstract
Acting in cluttered environments requires predicting and avoiding collisions while still achieving precise control. Conventional optimization-based controllers can enforce physical constraints, but they struggle to produce feasible solutions quickly when many obstacles are present. Diffusion models can generate diverse trajectories around obstacles, yet prior approaches lacked a general and efficient way to condition them on scene structure. In this paper, we show that combining diffusion-based warm-starting, conditioned on a latent object-centric representation of the scene, with a collision-aware model predictive controller (MPC) yields reliable and efficient motion generation under strict time limits. Our approach conditions a diffusion transformer on the system state, task, and surroundings, using an object-centric slot attention mechanism to provide a compact obstacle representation suitable for control. The sampled trajectories are refined by an optimal control problem that enforces rigid-body dynamics and signed-distance collision constraints, producing feasible motions in real time. On benchmark tasks, this hybrid method achieved markedly higher success rates and lower latency than sampling-based planners or either component alone. Real-robot experiments with a torque-controlled Panda confirm reliable and safe execution with MPC.
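For orientation, the refinement step described in the abstract can be read as a discrete-time optimal control problem of the following generic form. This is only a sketch under assumed notation and is not taken from the paper: the horizon $N$, the costs $\ell$ and $\ell_N$, the discretized rigid-body dynamics $f$, the signed distance $d_j$ between the robot and obstacle $j$, the number of obstacles $M$, the safety margin $\varepsilon$, and the measured state $\hat{x}$ are all illustrative symbols.

```latex
% Generic collision-aware MPC problem (assumed notation, not the paper's).
\begin{aligned}
\min_{x_{0:N},\, u_{0:N-1}} \quad & \sum_{k=0}^{N-1} \ell(x_k, u_k) + \ell_N(x_N) \\
\text{s.t.} \quad & x_0 = \hat{x}, \\
& x_{k+1} = f(x_k, u_k), && k = 0, \dots, N-1, \\
& d_j(x_k) \ge \varepsilon, && k = 0, \dots, N, \quad j = 1, \dots, M.
\end{aligned}
```

In this reading, a trajectory sampled from the diffusion model would serve as the solver's initial guess for $(x_{0:N}, u_{0:N-1})$. Because the signed-distance constraints are generally nonconvex, a good initial guess plausibly matters for finding a feasible solution within the time budget.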