AI Summary
To address the challenge of real-time, collision-free motion planning for robotic arms in dynamic, partially observable environments, this paper proposes a visuo-motor neural policy that operates directly on raw point clouds to generate reactive, low-latency trajectories. Methodologically, the authors pretrain the IMPACT foundation model on 10 million expert trajectories generated across diverse simulation scenarios, combining a point-cloud encoder, a Transformer architecture, and iterative teacher-student fine-tuning. They further introduce a Dynamic Control Point-Reciprocal Motion Planner (DCP-RMP) module to enable online goal re-planning at inference time. Compared with conventional planners, which rely on global environment models and suffer high computational latency, and with existing neural policies, which generalize poorly, the approach achieves significant improvements in success rate, robustness, and cross-environment generalization, both in simulation and on physical hardware. To the authors' knowledge, this is the first end-to-end dynamic obstacle avoidance framework to deliver high-precision control with sub-100 ms latency and strong adaptability to unseen dynamics.
Abstract
Generating collision-free motion in dynamic, partially observable environments is a fundamental challenge for robotic manipulators. Classical motion planners can compute globally optimal trajectories but require full environment knowledge and are typically too slow for dynamic scenes. Neural motion policies offer a promising alternative by operating in closed loop directly on raw sensory inputs, but they often struggle to generalize in complex or dynamic settings. We propose Deep Reactive Policy (DRP), a visuo-motor neural motion policy designed for reactive motion generation in diverse dynamic environments, operating directly on point cloud sensory input. At its core is IMPACT, a transformer-based neural motion policy pretrained on 10 million generated expert trajectories across diverse simulation scenarios. We further improve IMPACT's static obstacle avoidance through iterative student-teacher finetuning. We additionally enhance the policy's dynamic obstacle avoidance at inference time using DCP-RMP, a locally reactive goal-proposal module. We evaluate DRP on challenging tasks featuring cluttered scenes, dynamic moving obstacles, and goal obstructions. DRP achieves strong generalization, outperforming prior classical and neural methods in success rate across both simulated and real-world settings. Video results and code are available at https://deep-reactive-policy.com
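To make the division of labor concrete, the following is a minimal sketch of the closed-loop control pattern the abstract describes: at every step a locally reactive goal-proposal module (standing in for DCP-RMP) may shift the commanded goal away from a nearby dynamic obstacle, and a policy (standing in for IMPACT) takes a bounded reactive step toward that goal. All function names, distances, and the toy dynamics are illustrative assumptions, not the paper's released implementation.

```python
import math

def reactive_goal_proposal(goal, obstacle, ee_pos, safe_dist=0.3):
    """Stand-in for DCP-RMP (illustrative): if a dynamic obstacle is
    within safe_dist of the end-effector, push the intermediate goal
    away from it; otherwise pass the task goal through unchanged."""
    d = math.dist(ee_pos, obstacle)
    if d >= safe_dist:
        return goal
    # Repulsive offset along the obstacle -> end-effector direction,
    # growing as the obstacle gets closer.
    scale = (safe_dist - d) / max(d, 1e-6)
    return [g + scale * (e - o) for g, e, o in zip(goal, ee_pos, obstacle)]

def policy_step(ee_pos, goal, step=0.05):
    """Stand-in for the learned IMPACT policy (illustrative): take one
    bounded step toward the (possibly re-proposed) goal, mimicking a
    reactive low-latency action."""
    delta = [g - e for g, e in zip(goal, ee_pos)]
    norm = math.sqrt(sum(x * x for x in delta)) or 1.0
    s = min(step, norm)
    return [e + s * d / norm for e, d in zip(ee_pos, delta)]

# Closed-loop rollout: fixed task goal, one oscillating obstacle.
ee = [0.0, 0.0, 0.0]
task_goal = [1.0, 0.0, 0.0]
for t in range(40):
    obstacle = [0.5, 0.05 * math.sin(0.3 * t), 0.0]  # moving obstacle
    goal = reactive_goal_proposal(task_goal, obstacle, ee)
    ee = policy_step(ee, goal)
print([round(x, 3) for x in ee])
```

The key design point echoed here is that obstacle avoidance is handled locally and online at inference time, so the policy itself never needs a global environment model.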