Safety Assessment in Reinforcement Learning via Model Predictive Control

📅 2025-10-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Formalizing safety constraints in reinforcement learning remains challenging because explicit safety criteria or accurate system dynamics are rarely available a priori. Method: the paper proposes a model-free safety framework grounded in reversibility, using state reversibility as an implicit, knowledge-free safety criterion. It integrates Model Predictive Path Integral (MPPI) control with real-time reversibility assessment during policy training, dynamically intercepting irreversible (i.e., potentially unsafe) actions via black-box environment queries. A decoupled safety-evaluation architecture keeps safety enforcement orthogonal to policy optimization. Contribution/Results: the approach intercepts 100% of unsafe actions while matching the training efficiency and task performance of PPO baselines. It is the first work to introduce reversibility as a principled foundation for RL safety control, offering an interpretable, lightweight, and general-purpose safety paradigm for implicit safety constraints.

📝 Abstract
Model-free reinforcement learning approaches are promising for control but typically lack formal safety guarantees. Existing methods to shield or otherwise provide these guarantees often rely on detailed knowledge of the safety specifications. Instead, this work's insight is that many difficult-to-specify safety issues are best characterized by invariance. Accordingly, we propose to leverage reversibility as a method for preventing these safety issues throughout the training process. Our method uses model-predictive path integral control to check the safety of an action proposed by a learned policy throughout training. A key advantage of this approach is that it only requires the ability to query the black-box dynamics, not explicit knowledge of the dynamics or safety constraints. Experimental results demonstrate that the proposed algorithm successfully aborts before every unsafe action, while still achieving training progress comparable to a baseline PPO approach that is allowed to violate safety.
Problem

Research questions and friction points this paper is trying to address.

Ensuring safety in reinforcement learning without explicit dynamics knowledge
Preventing unsafe actions through reversibility and invariance principles
Validating policy actions via model predictive control during training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using reversibility to prevent safety issues
Applying model-predictive path integral control
Querying black-box dynamics without explicit knowledge
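The interception mechanism described above can be sketched roughly as follows: before a proposed action is executed, an MPPI-style sampler searches for an action sequence that would return the system to (near) its current state, querying only a black-box dynamics oracle. The interface (`env_query`), the cost function, the annealing schedule, and all parameter values here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def mppi_reversibility_check(env_query, state, action, horizon=20,
                             n_samples=128, n_iters=6, noise_sigma=0.5,
                             temperature=0.05, tol=0.05, seed=0):
    """Decide whether the effect of `action` taken in `state` looks
    reversible: an MPPI-style search tries to find an action sequence
    that steers the system back to (near) `state` after the action is
    applied.  `env_query(s, a) -> s'` is a black-box dynamics oracle
    (hypothetical interface); no analytic model or explicit safety
    specification is needed."""
    rng = np.random.default_rng(seed)
    state = np.asarray(state, dtype=float)
    next_state = np.asarray(env_query(state, np.asarray(action, dtype=float)),
                            dtype=float)
    act_dim = np.atleast_1d(np.asarray(action)).shape[0]
    nominal = np.zeros((horizon, act_dim))  # candidate return trajectory

    def return_cost(seq):
        # Roll the black box forward and measure the distance back to `state`.
        s = next_state
        for a in seq:
            s = np.asarray(env_query(s, a), dtype=float)
        return float(np.linalg.norm(s - state))

    sigma = noise_sigma
    for _ in range(n_iters):
        noise = sigma * rng.standard_normal((n_samples, horizon, act_dim))
        costs = np.array([return_cost(nominal + noise[k])
                          for k in range(n_samples)])
        # Softmax-weighted update over sampled perturbations (MPPI step).
        weights = np.exp(-(costs - costs.min()) / temperature)
        weights /= weights.sum()
        nominal = nominal + np.einsum("k,kta->ta", weights, noise)
        sigma *= 0.5  # anneal exploration noise across iterations

    return return_cost(nominal) < tol
```

During training, a shield built on such a check would execute the policy's action only when the check passes, and otherwise abort or substitute a safe fallback. For a simple integrator (`s' = s + a`) every action is reversible, while for dynamics where the state can only increase (`s' = s + max(a, 0)`) any positive action is not.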