DeepLTL: Learning to Efficiently Satisfy Complex LTL Specifications

📅 2024-10-06

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

career value

193K/year

🤖 AI Summary

Existing approaches to zero-shot efficient satisfaction of unknown linear temporal logic (LTL) specifications in multi-task reinforcement learning suffer from limited coverage of LTL fragments, suboptimal solutions, and insufficient guarantees for safety constraints. Method: We propose the first end-to-end differentiable policy network that directly embeds the semantic structure of Büchi automata into the policy learning framework. Our approach integrates automaton-guided state encoding, truth-value trajectory sequence modeling, and LTL-aware reward shaping to uniformly support arbitrary finite- and infinite-horizon LTL specifications while strictly enforcing safety constraints. Contribution/Results: Experiments demonstrate significant improvements in LTL satisfaction rates and convergence speed across both discrete and continuous control tasks. Our method achieves superior zero-shot generalization over state-of-the-art baselines and is publicly available as open-source code.

Technology Category

Application Category

📝 Abstract

Linear temporal logic (LTL) has recently been adopted as a powerful formalism for specifying complex, temporally extended tasks in multi-task reinforcement learning (RL). However, learning policies that efficiently satisfy arbitrary specifications not observed during training remains a challenging problem. Existing approaches suffer from several shortcomings: they are often only applicable to finite-horizon fragments of LTL, are restricted to suboptimal solutions, and do not adequately handle safety constraints. In this work, we propose a novel learning approach to address these concerns. Our method leverages the structure of B""uchi automata, which explicitly represent the semantics of LTL specifications, to learn policies conditioned on sequences of truth assignments that lead to satisfying the desired formulae. Experiments in a variety of discrete and continuous domains demonstrate that our approach is able to zero-shot satisfy a wide range of finite- and infinite-horizon specifications, and outperforms existing methods in terms of both satisfaction probability and efficiency. Code available at: https://deep-ltl.github.io/

Problem

Research questions and friction points this paper is trying to address.

Learning policies for complex LTL specifications in RL

Handling unseen specifications and safety constraints effectively

Improving satisfaction probability and efficiency over existing methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages Büchi automata for LTL semantics

Policies conditioned on truth assignments

Zero-shot satisfies diverse LTL specifications

🔎 Similar Papers

Temporal Logic Specification-Conditioned Decision Transformer for Offline Safe Reinforcement Learning