🤖 AI Summary
Problem: In open environments shared with humans, robots struggle to interpret and enforce the diverse safety constraints embedded in natural language instructions, which span semantic rules (e.g., "avoid elderly people") and geometric limits (e.g., "maintain a 1-meter distance"), especially under real-time operational demands.
Method: This paper proposes the first end-to-end language-conditioned safety filtering framework. It integrates a large language model (LLM) for unstructured instruction parsing, object-level 3D environment representation for joint semantic–geometric modeling, and model predictive control (MPC) for real-time safe action filtering.
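To make the first stage concrete, here is a minimal sketch of what "translating an unstructured instruction into a structured safety specification" could look like. The `SafetySpec` schema, the field names, and the rule-based `parse_instruction` stand-in (a toy replacement for the LLM module) are all illustrative assumptions, not the paper's actual interface.

```python
import re
from dataclasses import dataclass

# Hypothetical structured safety specification; field names are
# illustrative, not the paper's actual schema.
@dataclass
class SafetySpec:
    target_class: str    # semantic category to constrain against
    min_distance: float  # geometric clearance in meters
    hard: bool = True    # hard constraint vs. soft penalty

def parse_instruction(text: str) -> SafetySpec:
    """Toy rule-based stand-in for the LLM parsing module, covering
    one instruction pattern just to show the target output structure."""
    m = re.search(r"maintain (\d+(?:\.\d+)?)[- ]meter", text)
    dist = float(m.group(1)) if m else 1.0  # default clearance
    target = "elderly person" if "elderly" in text else "obstacle"
    return SafetySpec(target_class=target, min_distance=dist)

spec = parse_instruction("maintain 1-meter distance from elderly people")
# spec.target_class == "elderly person", spec.min_distance == 1.0
```

Downstream, a specification in this form can be grounded by the perception module (matching `target_class` against object-level 3D detections) and handed to the controller as a distance constraint.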
Contribution/Results: The framework automates the translation of unstructured safety directives into executable, structured safety specifications, unifying semantic and geometric constraint handling. Evaluated in simulation and on real robotic platforms, it improves the generalization, robustness, and practicality of safe navigation, supporting trustworthy human-robot collaboration in complex, dynamic settings.
📝 Abstract
As robots become increasingly integrated into open-world, human-centered environments, their ability to interpret natural language instructions and adhere to safety constraints is critical for effective and trustworthy interaction. Existing approaches often focus on mapping language to reward functions instead of safety specifications or address only narrow constraint classes (e.g., obstacle avoidance), limiting their robustness and applicability. We propose a modular framework for language-conditioned safety in robot navigation. Our framework is composed of three core components: (1) a large language model (LLM)-based module that translates free-form instructions into structured safety specifications, (2) a perception module that grounds these specifications by maintaining object-level 3D representations of the environment, and (3) a model predictive control (MPC)-based safety filter that enforces both semantic and geometric constraints in real time. We evaluate the effectiveness of the proposed framework through both simulation studies and hardware experiments, demonstrating that it robustly interprets and enforces diverse language-specified constraints across a wide range of environments and scenarios.
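To illustrate the third component, the sketch below shows the core idea of a safety filter: pass the nominal action through unchanged when it satisfies the clearance constraint, and minimally modify it otherwise. This is a one-step, single-obstacle, single-integrator toy (found here by bisecting a scale factor), not the paper's MPC formulation, which optimizes over a prediction horizon with the full constraint set.

```python
import math

def safety_filter(pos, nominal_vel, obstacle, min_dist, dt=0.1):
    """One-step stand-in for an MPC safety filter: if the nominal
    velocity would violate the clearance constraint at the next step,
    bisect for the largest safe scale factor. Assumes a single obstacle
    and single-integrator dynamics (illustrative only)."""
    def dist_after(scale):
        # Distance to the obstacle after applying the scaled velocity.
        nx = pos[0] + dt * scale * nominal_vel[0]
        ny = pos[1] + dt * scale * nominal_vel[1]
        return math.hypot(nx - obstacle[0], ny - obstacle[1])

    if dist_after(1.0) >= min_dist:
        return nominal_vel  # nominal action is already safe
    lo, hi = 0.0, 1.0       # lo = largest known-safe scale
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if dist_after(mid) >= min_dist else (lo, mid)
    return (lo * nominal_vel[0], lo * nominal_vel[1])

# Nominal command would land on the obstacle; the filter scales it back
# so the robot stops at the 0.5 m clearance boundary.
filtered = safety_filter((0.0, 0.0), (10.0, 0.0), (1.0, 0.0), 0.5)
```

A full MPC filter replaces the scalar bisection with a constrained trajectory optimization, but the interface is the same: nominal action in, certified-safe action out, which is what lets the filter sit downstream of any language-conditioned planner.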