AI Summary
This work addresses the limitations of vision-dominated policies in contact-rich manipulation: they struggle to integrate high-frequency force/torque (F/T) feedback and lack an explicit schedule for when, how strongly, and where to apply force across task phases. To overcome these challenges, we propose PhaForce, a framework that explicitly models task phases and contact states through a unified contact/phase schedule. It decouples low-frequency visual-force fused planning from high-frequency residual correction and enables phase-aware micro-adjustments within interpretable corrective subspaces. The system comprises a contact-aware phase predictor (CAP), a dual-gated slow diffusion planner, and a control-rate, phase-routed fast corrector. Evaluated on multiple real-robot contact-rich tasks, PhaForce achieves an average success rate of 86%, outperforming baselines by 40 percentage points, while substantially improving contact quality and remaining robust to out-of-distribution (OOD) geometric variations.
Abstract
Contact-rich manipulation requires not only vision-dominant task semantics but also closed-loop reactions to force/torque (F/T) transients. Yet generative visuomotor policies are typically constrained to low-frequency updates by inference latency and action chunking, so they underutilize F/T for control-rate feedback. Furthermore, existing force-aware methods often inject force continuously and indiscriminately, lacking an explicit mechanism to schedule when, how much, and where to apply force across different task phases. We propose PhaForce, a phase-scheduled visual-force policy that coordinates low-rate chunk-level planning and high-rate residual correction via a unified contact/phase schedule. PhaForce comprises (i) a contact-aware phase predictor (CAP) that estimates contact probability and phase belief, (ii) a Slow diffusion planner that performs dual-gated visual-force fusion with orthogonal residual injection to preserve vision semantics while conditioning on force, and (iii) a Fast corrector that applies control-rate, phase-routed residuals in interpretable corrective subspaces for within-chunk micro-adjustments. Across multiple real-robot contact-rich tasks, PhaForce achieves an average success rate of 86% (+40 pp over baselines), while also substantially improving contact quality by regulating interaction forces and adapting robustly to out-of-distribution (OOD) geometric shifts.
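To make the two-rate structure concrete, the following is a minimal toy sketch of one action chunk: a low-rate planner emits the chunk once, and a control-rate loop adds a phase-routed residual at every step. It is not the authors' implementation. The names `predict_schedule`, `slow_plan`, `fast_residual`, `run_chunk`, and `PHASE_GAINS`, the two-phase split, the force threshold, and the gain values are all assumptions made for illustration. The diffusion planner, dual-gated fusion, and orthogonal residual injection are not modeled; a linear interpolation stands in for the planner.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = List[float]

# Hypothetical per-phase correction gains (phase 0: free space, phase 1: contact).
# Negative gains make the residual yield to the measured force.
PHASE_GAINS: List[Vector] = [[0.0, 0.0, 0.0], [-0.01, -0.01, -0.01]]


@dataclass
class Schedule:
    contact_prob: float        # estimated probability of contact, in [0, 1]
    phase_belief: List[float]  # belief over task phases, sums to 1


def predict_schedule(force: Sequence[float], threshold: float = 2.0) -> Schedule:
    """Toy stand-in for a contact-aware phase predictor (assumed threshold rule)."""
    magnitude = sum(f * f for f in force) ** 0.5
    contact_prob = min(1.0, magnitude / threshold)
    return Schedule(contact_prob, [1.0 - contact_prob, contact_prob])


def slow_plan(start: Vector, goal: Vector, horizon: int) -> List[Vector]:
    """Toy stand-in for the low-rate planner: a linearly interpolated action chunk."""
    return [
        [s + (g - s) * (k + 1) / horizon for s, g in zip(start, goal)]
        for k in range(horizon)
    ]


def fast_residual(schedule: Schedule, force: Sequence[float]) -> Vector:
    """Blend per-phase gains by phase belief, then gate by contact probability."""
    residual = [0.0] * len(force)
    for belief, gains in zip(schedule.phase_belief, PHASE_GAINS):
        for i, (gain, f) in enumerate(zip(gains, force)):
            residual[i] += belief * gain * f
    return [schedule.contact_prob * r for r in residual]


def run_chunk(
    start: Vector,
    goal: Vector,
    read_force: Callable[[int], Sequence[float]],
    horizon: int = 4,
) -> List[Vector]:
    """Plan once at low rate, then correct every step at control rate."""
    chunk = slow_plan(start, goal, horizon)
    executed = []
    for k, action in enumerate(chunk):
        force = read_force(k)
        schedule = predict_schedule(force)
        residual = fast_residual(schedule, force)
        executed.append([a + r for a, r in zip(action, residual)])
    return executed
```

In this sketch, a zero force reading leaves the planned chunk untouched, while a sustained contact force shifts each step slightly against the force direction without replanning the chunk.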