RoboSafe: Safeguarding Embodied Agents via Executable Safety Logic

πŸ“… 2025-12-24
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
Embodied agents operating in dynamic, temporally sensitive, and context-rich environments are vulnerable to implicit hazardous instructions that can lead to unsafe behaviors, a challenge inadequately addressed by static rule-based or prompt-level safety mechanisms. The paper proposes the first runtime safety framework grounded in executable predicate logic, integrating a hybrid long- and short-term safety memory with bidirectional reasoning (backward reflection and forward prediction) to enable dynamic contextual awareness, formal verifiability, and executable safety decisions. The framework unifies vision-language models, multimodal perception fusion, real-time trajectory backtracking, and risk-prediction inference. Evaluated across diverse embodied agents, the framework reduces hazardous behavior incidence by 36.8% while incurring negligible task performance degradation (<2%). It is further validated on a physical robotic arm, demonstrating real-world deployability and robustness. This work establishes a foundation for formally grounded, runtime-enforced safety in embodied AI systems.

πŸ“ Abstract
Embodied agents powered by vision-language models (VLMs) are increasingly capable of executing complex real-world tasks, yet they remain vulnerable to hazardous instructions that may trigger unsafe behaviors. Runtime safety guardrails, which intercept hazardous actions during task execution, offer a promising solution due to their flexibility. However, existing defenses often rely on static rule filters or prompt-level control, which struggle to address implicit risks arising in dynamic, temporally dependent, and context-rich environments. To address this, we propose RoboSafe, a hybrid reasoning runtime safeguard for embodied agents through executable predicate-based safety logic. RoboSafe integrates two complementary reasoning processes on a Hybrid Long-Short Safety Memory. We first propose a Backward Reflective Reasoning module that continuously revisits recent trajectories in short-term memory to infer temporal safety predicates and proactively triggers replanning when violations are detected. We then propose a Forward Predictive Reasoning module that anticipates upcoming risks by generating context-aware safety predicates from the long-term safety memory and the agent's multimodal observations. Together, these components form an adaptive, verifiable safety logic that is both interpretable and executable as code. Extensive experiments across multiple agents demonstrate that RoboSafe substantially reduces hazardous actions (-36.8% risk occurrence) compared with leading baselines, while maintaining near-original task performance. Real-world evaluations on physical robotic arms further confirm its practicality. Code will be released upon acceptance.
Problem

Research questions and friction points this paper is trying to address.

Safeguarding embodied agents from hazardous instructions that trigger unsafe behaviors
Addressing implicit risks in dynamic, temporally dependent, and context-rich environments
Overcoming limitations of static rule filters and prompt-level control for safety
Innovation

Methods, ideas, or system contributions that make the work stand out.

Executable predicate-based safety logic for runtime safeguard
Hybrid reasoning with backward reflection and forward prediction
Adaptive, verifiable safety logic that is both interpretable and executable as code
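The idea of safety logic that is "executable as code" can be illustrated with a minimal sketch: safety predicates are plain callables evaluated over the agent's observed state at runtime, with violations triggering replanning. All names here (`SafetyPredicate`, `SafetyMemory`, `runtime_guard`, and the example rule) are illustrative assumptions, not the paper's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable

# A state is a simple snapshot of what the agent perceives and intends,
# e.g. {"holding": "knife", "near": ["child"], "action": "hand_over"}.
State = dict

@dataclass
class SafetyPredicate:
    name: str
    check: Callable[[State], bool]  # returns True if the state is safe

@dataclass
class SafetyMemory:
    # Persistent rules (long-term) plus rules inferred from the recent
    # trajectory (short-term), mirroring the hybrid memory described above.
    long_term: list = field(default_factory=list)
    short_term: list = field(default_factory=list)

    def predicates(self) -> list:
        return self.long_term + self.short_term

def runtime_guard(state: State, memory: SafetyMemory) -> list[str]:
    """Evaluate all predicates; return the names of violated ones."""
    return [p.name for p in memory.predicates() if not p.check(state)]

# Hypothetical long-term rule: never hand a sharp object to a child.
memory = SafetyMemory(long_term=[
    SafetyPredicate(
        "no_sharp_handover_to_child",
        lambda s: not (s.get("holding") == "knife"
                       and "child" in s.get("near", [])
                       and s.get("action") == "hand_over"),
    )
])

violations = runtime_guard(
    {"holding": "knife", "near": ["child"], "action": "hand_over"}, memory
)
print(violations)  # ['no_sharp_handover_to_child']
```

A non-empty violation list would be the point at which a guard like RoboSafe intercepts the action and triggers replanning; a safe state yields an empty list and the agent proceeds.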
Le Wang
Beihang University
Zonghao Ying
SKLCCSE, BUAA
Trustworthy AI
Xiao Yang
Beijing University of Posts and Telecommunications
Quanchen Zou
360 AI Security Lab
Zhenfei Yin
University of Oxford
Deep Learning, Multimodal, AI Agent, Robotics
Tianlin Li
Nanyang Technological University
AI4SE, SE4AI, Trustworthy AI
Jian Yang
Beihang University
Yaodong Yang
Peking University
Aishan Liu
Beihang University
Xianglong Liu
Beihang University