🤖 AI Summary
This work addresses the lack of real-time safety guarantees in deformable object manipulation, where existing approaches relying on reward shaping often fail to enforce safety constraints during deployment. The authors propose an online safety filter that applies minimal corrections to any nominal policy via lightweight quadratic programming, thereby enforcing task-level safety constraints in real time. By integrating horizon-agnostic neural operators with boundary control barrier functions, the method certifies safety at the output level and generalizes across variable trajectory lengths without requiring retraining. Evaluated on FluidLab fluid manipulation tasks, the approach increases the rate of safe trajectories by up to 22% compared to unfiltered policies and achieves faster convergence to the safe set, demonstrating its efficiency and reliability.
📝 Abstract
Safety-critical control of robotic manipulation tasks involving deformable media such as fluids, cloth, and soft objects remains challenging because existing learning-based approaches encode safety indirectly through reward shaping, which provides no guarantee of constraint satisfaction at deployment. We present a constraint-driven online safety filter for deformable object manipulation that enforces explicit task-level safety constraints in real time by minimally modifying any nominal control policy. Our approach combines two key components: a horizon-agnostic neural operator that learns the boundary input-output mapping of the underlying PDE dynamics and generalizes across variable rollout lengths without retraining, and a boundary control barrier function that certifies safety at the task-relevant output level via a lightweight quadratic program. The resulting safety constraint is affine in the boundary input rate, enabling real-time online filtering. We evaluate the proposed method on fluid manipulation tasks in FluidLab, where the filter improves safe trajectory rates by up to 22% over unfiltered base policies while also reducing the number of steps required to reach the safe set, demonstrating that constraint-driven safety enforcement is both more reliable and more efficient than reward-shaping approaches.
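To illustrate why a constraint that is affine in the boundary input rate keeps the filter lightweight, the sketch below solves the minimal-modification problem for a single constraint of the form `a @ v + b >= 0`. This is not the authors' implementation: the function name `filter_input_rate` and the coefficients `a` and `b` are hypothetical placeholders. In the method described above, those coefficients would come from the boundary control barrier function and the learned neural operator; here they are simply passed in. With one affine constraint, the quadratic program reduces to a projection onto a half-space and has a closed-form solution.

```python
import numpy as np


def filter_input_rate(v_nom, a, b, eps=1e-12):
    """Return the input rate closest to v_nom that satisfies a @ v + b >= 0.

    Solves  min_v ||v - v_nom||^2  s.t.  a @ v + b >= 0  in closed form.
    `a` and `b` are assumed to be supplied by a barrier condition
    (hypothetical inputs in this sketch).
    """
    v_nom = np.asarray(v_nom, dtype=float)
    a = np.asarray(a, dtype=float)

    # Constraint value at the nominal input rate.
    residual = float(a @ v_nom + b)
    if residual >= 0.0:
        # Nominal command is already safe: leave it unchanged.
        return v_nom.copy()

    norm_sq = float(a @ a)
    if norm_sq < eps:
        # a is (numerically) zero and b < 0: no input rate can satisfy it.
        raise ValueError("safety constraint is infeasible")

    # Project onto the boundary of the half-space {v : a @ v + b >= 0}.
    return v_nom - (residual / norm_sq) * a


if __name__ == "__main__":
    v_safe = filter_input_rate(v_nom=[1.0, 0.0], a=[-1.0, 0.0], b=0.5)
    print(v_safe)  # the first component is reduced until the constraint is tight
```

If the nominal command already satisfies the constraint, it passes through unchanged, which matches the "minimal modification" behavior described in the abstract. With several simultaneous constraints or input bounds, the closed form no longer applies and a general QP solver would be needed.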