Integrating LTL Constraints into PPO for Safe Reinforcement Learning

📅 2026-03-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of strictly satisfying complex temporal safety constraints in high-stakes reinforcement learning settings. It presents the first systematic integration of Linear Temporal Logic (LTL) formal specifications into the Proximal Policy Optimization (PPO) framework. The approach employs a limit-deterministic Büchi automaton to monitor violations of LTL constraints, translates logical violations into penalty signals via a logic-to-cost conversion mechanism, and leverages Lagrangian multipliers to guide policy optimization toward safe behavior. Evaluated in both Zones and CARLA environments, the method substantially reduces safety violations while maintaining task performance comparable to state-of-the-art algorithms, thereby providing verifiable guarantees for complex temporal safety requirements.
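The monitoring pipeline the summary describes (an automaton watches for LTL violations, and a logic-to-cost step turns them into penalties) can be sketched minimally for the simplest safety formula, "always avoid hazard" (LTL: G ¬hazard). The class and names below are illustrative assumptions, not the paper's actual API.

```python
class SafetyMonitor:
    """Tiny two-state automaton for G !hazard: 'safe' until a hazard
    is observed, then an absorbing 'violated' state."""

    def __init__(self):
        self.state = "safe"

    def step(self, hazard: bool) -> float:
        # Logic-to-cost conversion: any step spent in the violating
        # state yields a unit penalty; safe steps cost nothing.
        if self.state == "safe" and hazard:
            self.state = "violated"
        return 1.0 if self.state == "violated" else 0.0


monitor = SafetyMonitor()
costs = [monitor.step(h) for h in [False, False, True, False]]
# The violating state is absorbing, so cost keeps accruing after the
# first hazard even if later observations are safe.
```

The paper's limit-deterministic Büchi automata handle far richer temporal formulas than this single-invariant example, but the per-step interface (observation in, scalar cost out) is the same shape.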

📝 Abstract
This paper proposes Proximal Policy Optimization with Linear Temporal Logic Constraints (PPO-LTL), a framework that integrates safety constraints written in LTL into PPO for safe reinforcement learning. LTL constraints offer rigorous representations of complex safety requirements, such as the regulations that are pervasive in robotics, and enable their systematic monitoring. Violations of LTL constraints are detected by limit-deterministic Büchi automata and translated into penalty signals by a logic-to-cost mechanism; these signals then guide policy optimization via a Lagrangian scheme. Extensive experiments on the Zones and CARLA environments show that PPO-LTL consistently reduces safety violations while maintaining performance competitive with state-of-the-art methods. The code is available at https://github.com/EVIEHub/PPO-LTL.
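The Lagrangian scheme the abstract mentions can be sketched as follows: a multiplier weights the LTL-derived cost in the policy objective and is updated by gradient ascent on the constraint gap. The hyperparameter names (`cost_limit`, `lr_lam`) and the normalized penalized advantage are common safe-RL conventions assumed here, not values taken from the paper.

```python
def lagrangian_update(lam, episode_cost, cost_limit, lr_lam=0.05):
    # Ascend on the constraint violation: when the episode's LTL cost
    # exceeds the limit, lam grows, weighting safety more heavily.
    lam = lam + lr_lam * (episode_cost - cost_limit)
    return max(0.0, lam)  # the multiplier stays non-negative


def penalized_advantage(adv_reward, adv_cost, lam):
    # PPO then maximizes the reward advantage minus the lam-weighted
    # cost advantage, rescaled so magnitudes stay stable as lam grows.
    return (adv_reward - lam * adv_cost) / (1.0 + lam)


lam = 0.0
for ep_cost in [3.0, 2.0, 0.5]:  # costs from the LTL monitor; limit = 1.0
    lam = lagrangian_update(lam, ep_cost, cost_limit=1.0)
```

After the first two episodes violate the limit, the multiplier has grown; the third, safe episode shrinks it again, which is the self-correcting behavior that lets the policy trade reward against verified safety.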
Problem

Research questions and friction points this paper is trying to address.

Safe Reinforcement Learning
Linear Temporal Logic
Safety Constraints
PPO
LTL
Innovation

Methods, ideas, or system contributions that make the work stand out.

LTL Constraints
Safe Reinforcement Learning
PPO
Büchi Automata
Lagrangian Optimization