🤖 AI Summary
This paper addresses the challenge of safety assessment for large language model (LLM)-driven embodied agents operating in physical environments. We propose the first formal safety verification framework spanning three hierarchical levels: semantic understanding, high-level planning, and low-level trajectory execution. Methodologically, we systematically introduce temporal logic (TL) into embodied agent safety modeling, employing natural-language-to-logic alignment detection at the semantic level, TL-based verification at the plan level, and computation-tree-based validation at the trajectory level, enabling automated cross-layer safety checking. Experiments on VirtualHome and ALFRED demonstrate that our framework uncovers safety failures (semantic misinterpretations, planning conflicts, and motion violations) that existing heuristic approaches miss, significantly improving the rigor and interpretability of safety evaluation. Our work establishes a verifiable safety-assurance paradigm for trustworthy embodied intelligence.
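To make the temporal-logic grounding concrete, a household requirement such as "never leave the stove on while away from it" could be written as a linear temporal logic invariant. The encoding below is purely illustrative; the predicate names are ours, not the paper's:

$$
\mathbf{G}\,\lnot\bigl(\mathit{stove\_on} \land \lnot\,\mathit{agent\_near\_stove}\bigr)
$$

Here the $\mathbf{G}$ ("globally") operator requires the bracketed condition to hold at every step of the agent's execution.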
📝 Abstract
We present Sentinel, the first framework for formally evaluating the physical safety of Large Language Model (LLM)-based embodied agents across the semantic, plan, and trajectory levels. Unlike prior methods that rely on heuristic rules or subjective LLM judgments, Sentinel grounds practical safety requirements in formal temporal logic (TL) semantics that can precisely specify state invariants, temporal dependencies, and timing constraints. It then employs a multi-level verification pipeline in which (i) at the semantic level, intuitive natural-language safety requirements are formalized into TL formulas and the LLM agent's understanding of these requirements is probed for alignment with the formulas; (ii) at the plan level, high-level action plans and subgoals generated by the LLM agent are verified against the TL formulas to detect unsafe plans before execution; and (iii) at the trajectory level, multiple execution trajectories are merged into a computation tree and efficiently verified against physically detailed TL specifications for a final safety check. We apply Sentinel in VirtualHome and ALFRED, formally evaluating multiple LLM-based embodied agents against diverse safety requirements. Our experiments show that, by grounding physical safety in temporal logic and applying verification across multiple levels, Sentinel provides a rigorous foundation for systematically evaluating LLM-based embodied agents in physical environments, exposing safety violations overlooked by previous methods and offering insight into their failure modes.
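To illustrate what plan-level checking (step ii) might look like in practice, the sketch below abstracts a plan as a finite trace of symbolic states and checks it against an invariant of the form G(stove_on → in_kitchen), i.e. the stove-safety formula above. This is a toy sketch under our own assumptions: the state encoding, predicate names, and checker are invented for illustration and are not Sentinel's implementation.

```python
# A minimal, hedged sketch of plan-level TL verification in the spirit of
# Sentinel's step (ii). NOT the paper's implementation: the state encoding,
# predicate names, and checker are our own toy finite-trace semantics for
# the common invariant pattern G(p -> q).

Plan = list[set[str]]  # each entry: atomic propositions true after an action


def globally_implies(plan: Plan, p: str, q: str) -> bool:
    """Check G(p -> q) over a finite trace: whenever p holds, q must too."""
    return all(p not in state or q in state for state in plan)


# Toy plan: the agent cooks, turns the stove off, then leaves the kitchen.
safe_plan: Plan = [
    {"in_kitchen", "stove_on"},  # switch on stove
    {"in_kitchen", "stove_on"},  # cook
    {"in_kitchen"},              # switch off stove
    set(),                       # leave the kitchen
]

# Unsafe variant: the agent walks away while the stove is still on.
unsafe_plan: Plan = safe_plan[:2] + [{"stove_on"}]

# Safety requirement: "while the stove is on, the agent stays in the kitchen".
assert globally_implies(safe_plan, "stove_on", "in_kitchen")        # passes
assert not globally_implies(unsafe_plan, "stove_on", "in_kitchen")  # flagged
```

On finite traces this invariant reduces to a per-state check, so a single pass over the plan suffices; richer TL properties involving temporal dependencies and timing constraints, which the paper also covers, call for automata-style checking such as the computation-tree verification Sentinel applies at the trajectory level.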