CoFreeVLA: Collision-Free Dual-Arm Manipulation via Vision-Language-Action Model and Risk Estimation

📅 2026-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses safety risks in dual-arm manipulation under vision-language instructions, where unmodeled arm interactions can lead to self-collisions between the arms or collisions with grasped objects. The authors propose a risk-aware, end-to-end Vision-Language-Action (VLA) framework that integrates, for the first time, a short-horizon self-collision risk estimator with a VLA model. By fusing proprioceptive and visual embeddings, the system predicts high-risk commands and halts their execution in real time, and it additionally incorporates a risk-guided state-recovery and policy-optimization mechanism. The method is pre-trained in simulation using collision labels and fine-tuned on a real PiPER dual-arm robot. Experiments across five dual-arm tasks demonstrate a significant reduction in self-collision rates and higher task success compared to existing approaches such as RDT and APEX.
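The risk-gating idea in the summary can be sketched in a few lines: fuse proprioceptive, visual, and action features, score short-horizon collision risk, and halt or recover when the score is high. This is a minimal illustrative sketch only, not the paper's implementation; `RISK_THRESHOLD`, `estimate_risk`, and `recover_fn` are hypothetical names, and the logistic scorer stands in for the learned estimator.

```python
import numpy as np

RISK_THRESHOLD = 0.5  # hypothetical gating threshold


def estimate_risk(proprio, visual, action, w):
    """Toy short-horizon self-collision risk score: a logistic
    function over the fused feature vector (stand-in for the
    paper's learned risk estimator)."""
    features = np.concatenate([proprio, visual, action])
    return 1.0 / (1.0 + np.exp(-features @ w))


def gate_action(proprio, visual, action, w, recover_fn):
    """Execute the planned action only if predicted risk is low;
    otherwise halt it and fall back to a risk-guided recovery
    action produced by recover_fn."""
    risk = estimate_risk(proprio, visual, action, w)
    if risk >= RISK_THRESHOLD:
        return recover_fn(proprio), risk  # risky command blocked
    return action, risk  # safe command passes through
```

In this toy version the recovery policy is just a callback; in the paper the same risk signal also shapes policy refinement so that fewer commands need gating over time.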

📝 Abstract
Vision-Language-Action (VLA) models enable instruction-following manipulation, yet dual-arm deployment remains unsafe due to under-modeled self-collisions between arms and grasped objects. We introduce CoFreeVLA, which augments an end-to-end VLA with a short-horizon self-collision risk estimator that predicts collision likelihood from proprioception, visual embeddings, and planned actions. The estimator gates risky commands, recovers to safe states via risk-guided adjustments, and shapes policy refinement for safer rollouts. It is pre-trained with model-based collision labels and post-trained on real-robot rollouts for calibration. On five bimanual tasks with the PiPER dual-arm robot, CoFreeVLA reduces self-collisions and improves success rates versus RDT and APEX.
Problem

Research questions and friction points this paper is trying to address.

dual-arm manipulation
self-collision
Vision-Language-Action model
collision avoidance
bimanual tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision-Language-Action Model
Self-collision Avoidance
Risk Estimation
Dual-arm Manipulation
Policy Refinement
Xuanran Zhai
National University of Singapore
Binkai Ou
BoardWare Information System Co. Ltd
Yemin Wang
Xiamen University
Hui Yi Leong
University of Chicago
Qiaojun Yu
Shanghai Jiao Tong University, Shanghai AI Lab
robotic learning · 3D vision · VLA
Ce Hao
National University of Singapore
Yaohua Liu
Oak Ridge National Laboratory
Condensed Matter and Materials Physics · Neutron Instrumentation