Semantically Safe Robot Manipulation: From Semantic Scene Understanding to Motion Safeguards

📅 2024-10-19
🏛️ IEEE Robotics and Automation Letters
📈 Citations: 3
Influential: 0
🤖 AI Summary
In human-robot cohabitation environments, robots lacking commonsense reasoning pose semantic-level safety hazards—e.g., placing a water cup above a laptop. Method: We propose a Semantic Safety Filtering framework that, for the first time, integrates large language models’ (LLMs) contextual commonsense reasoning into a safety-certified closed loop, mapping human-specified semantic constraints (e.g., “liquid containers must not be suspended above fragile devices”) to verifiable control barrier functions (CBFs). Our approach jointly leverages 3D semantic scene reconstruction, LLM-driven constraint generation, CBF-based safety certification, and diffusion-policy fine-tuning. Results: Evaluated in real kitchen settings for teleoperated and learning-based manipulation tasks, the framework reduces semantic violations significantly, increases safe action adoption by 37%, and achieves zero semantic safety incidents—surpassing conventional safety paradigms reliant solely on geometric collision checking.
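The core certification step described above, mapping a semantic constraint to a control barrier function (CBF) and filtering the robot's input, can be sketched as a minimal quadratic program. The sketch below assumes a single scalar constraint h(x) ≥ 0 with control-affine dynamics ẋ = f(x) + g(x)u, for which the QP has a closed-form halfspace projection; all names and the single-integrator example are illustrative, not the paper's implementation.

```python
import numpy as np

def cbf_safety_filter(u_nom, h, grad_h_f, grad_h_g, alpha=1.0):
    """Minimally modify a nominal input so the CBF condition
    d/dt h(x) >= -alpha * h(x) holds, i.e.
    grad_h . f(x) + grad_h . g(x) u >= -alpha * h(x).
    For one scalar constraint this QP reduces to projecting
    u_nom onto the halfspace {u : a . u >= b}."""
    a = np.atleast_1d(np.asarray(grad_h_g, dtype=float))
    u_nom = np.atleast_1d(np.asarray(u_nom, dtype=float))
    b = -alpha * h - grad_h_f
    if a @ u_nom >= b:
        return u_nom  # nominal input already certified safe
    # closed-form QP solution: smallest correction onto the boundary
    return u_nom + ((b - a @ u_nom) / (a @ a)) * a

# Illustrative semantic constraint: "cup must stay below z_max"
# (e.g., below laptop height), encoded as h(z) = z_max - z with
# single-integrator dynamics z_dot = u, so grad_h_f = 0, grad_h_g = -1.
z, z_max = 0.9, 1.0
u_safe = cbf_safety_filter(u_nom=0.5, h=z_max - z,
                           grad_h_f=0.0, grad_h_g=-1.0)
# the upward velocity is clipped to alpha * (z_max - z) = 0.1
```

With multiple semantic and geometric constraints, the same idea becomes a QP with one affine inequality per active barrier, typically solved with an off-the-shelf solver at each control step.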

📝 Abstract
Ensuring safe interactions in human-centric environments requires robots to understand and adhere to constraints recognized by humans as “common sense” (e.g., “moving a cup of water above a laptop is unsafe as the water may spill” or “rotating a cup of water is unsafe as it can lead to pouring its content”). Recent advances in computer vision and machine learning have enabled robots to acquire a semantic understanding of and reason about their operating environments. While extensive literature on safe robot decision-making exists, semantic understanding is rarely integrated into these formulations. In this work, we propose a semantic safety filter framework to certify robot inputs with respect to semantically defined constraints (e.g., unsafe spatial relationships, behaviors, and poses) and geometrically defined constraints (e.g., environment-collision and self-collision constraints). In our proposed approach, given perception inputs, we build a semantic map of the 3D environment and leverage the contextual reasoning capabilities of large language models to infer semantically unsafe conditions. These semantically unsafe conditions are then mapped to safe actions through a control barrier certification formulation. We demonstrate the proposed semantic safety filter in teleoperated manipulation tasks and with learned diffusion policies applied in a real-world kitchen environment that further showcases its effectiveness in addressing practical semantic safety constraints. Together, these experiments highlight our approach's capability to integrate semantics into safety certification, enabling safe robot operation beyond traditional collision avoidance.
Problem

Research questions and friction points this paper is trying to address.

How can robots be made to adhere to commonsense safety constraints recognized by humans?
How can semantic understanding be integrated into safe robot decision-making formulations?
How can semantically unsafe conditions be mapped to safe actions via control barrier functions?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic safety filter that certifies robot inputs against both semantic and geometric constraints
LLMs infer semantically unsafe conditions from a 3D semantic map of the environment
Control barrier certification maps unsafe conditions to minimally modified safe actions
Lukas Brunke
PhD Candidate, University of Toronto and Technical University of Munich
Robotics, Control, Machine Learning
Yanni Zhang
Learning Systems and Robotics Lab and the Munich Institute of Robotics and Machine Intelligence, Technical University of Munich, 80333 Munich, Germany
Ralf Römer
Learning Systems and Robotics Lab and the Munich Institute of Robotics and Machine Intelligence, Technical University of Munich, 80333 Munich, Germany
Jack Naimer
University of Toronto
Machine Learning
Nikola Staykov
Learning Systems and Robotics Lab and the Munich Institute of Robotics and Machine Intelligence, Technical University of Munich, 80333 Munich, Germany
Siqi Zhou
Learning Systems and Robotics Lab and the Munich Institute of Robotics and Machine Intelligence, Technical University of Munich, 80333 Munich, Germany
Angela P. Schoellig
Learning Systems and Robotics Lab and the Munich Institute of Robotics and Machine Intelligence, Technical University of Munich, 80333 Munich, Germany; University of Toronto Institute for Aerospace Studies, North York, ON M3H 5T6, Canada; University of Toronto Robotics Institute, Toronto, ON M5S 1A4, Canada; Vector Institute for Artificial Intelligence, Toronto, ON M5G 0C6, Canada