🤖 AI Summary
This work addresses the challenge of learning logical behavioral rules, i.e., temporal constraints, from demonstrations. We propose Inverse Logic-Constraint Learning (ILCL), a framework that formulates constraint learning as a two-player zero-sum game between a constraint generator and a policy learner, enabling automatic discovery of parameterized truncated linear temporal logic (TLTL) specifications without predefined templates. The method couples genetic-algorithm-driven temporal-logic mining (GA-TL-Mining) with logic-constrained reinforcement learning (Logic-CRL), using syntax-tree search and a constraint-redistribution scheme to capture non-Markovian temporal dependencies. Evaluated on four temporally constrained benchmark tasks, ILCL outperforms existing approaches and transfers successfully to a real-world peg-in-shallow-hole insertion task, demonstrating strong generalization and practical applicability. The core contribution is an end-to-end, template-free framework that jointly learns temporal-logic specifications and optimizes policies under them.
📝 Abstract
We aim to solve the problem of temporal-constraint learning from demonstrations in order to reproduce demonstration-like, logic-constrained behaviors. Learning logic constraints is challenging due to the combinatorially large space of possible specifications and the ill-posed nature of non-Markovian constraints. To address this challenge, we introduce a novel temporal-constraint learning method, which we call inverse logic-constraint learning (ILCL). Our method frames constraint learning as a two-player zero-sum game between 1) genetic-algorithm-based temporal-logic mining (GA-TL-Mining) and 2) logic-constrained reinforcement learning (Logic-CRL). GA-TL-Mining efficiently constructs syntax trees for parameterized truncated linear temporal logic (TLTL) without predefined templates. Logic-CRL then finds a policy that maximizes task reward under the constructed TLTL constraints via a novel constraint-redistribution scheme. Our evaluations show that ILCL outperforms state-of-the-art baselines in learning and transferring TL constraints on four temporally constrained tasks. We also demonstrate successful transfer to a real-world peg-in-shallow-hole task.
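The two-player game described above alternates between mining a constraint that separates demonstrations from the current policy's rollouts, and re-training the policy under that constraint. The toy Python loop below is only a schematic sketch of this alternation: it replaces TLTL syntax trees with a single scalar threshold over a trajectory feature, and every function name and detail here is illustrative, not taken from the paper.

```python
import random

# Toy stand-in for TLTL mining: candidate "constraints" are thresholds over
# a scalar trajectory feature, not real temporal-logic syntax trees.

def feature(traj):
    # Illustrative trajectory statistic (mean of the signal).
    return sum(traj) / len(traj)

def ga_tl_mining(demos, rollouts, pop_size=20, gens=30):
    """Discriminator step: evolve a threshold that the demonstrations
    satisfy but the current policy rollouts violate."""
    def fitness(th):
        demos_satisfy = sum(feature(d) <= th for d in demos)
        rollouts_violate = sum(feature(r) > th for r in rollouts)
        return demos_satisfy + rollouts_violate
    pop = [random.uniform(0.0, 1.0) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # selection
        children = [p + random.gauss(0, 0.05) for p in parents]  # mutation
        pop = parents + children
    return max(pop, key=fitness)

def logic_crl(th, n_rollouts=10, horizon=5):
    """Generator step (stub): pretend the policy now produces rollouts
    that satisfy the mined constraint."""
    return [[random.uniform(0.0, max(th, 0.0)) for _ in range(horizon)]
            for _ in range(n_rollouts)]

def ilcl_loop(demos, iters=5):
    """Alternate mining and policy learning until the constraint stabilizes."""
    rollouts = [[random.random() for _ in range(5)] for _ in range(10)]
    th = 1.0
    for _ in range(iters):
        th = ga_tl_mining(demos, rollouts)  # mine a separating constraint
        rollouts = logic_crl(th)            # re-train policy under it
    return th
```

In the actual method, `ga_tl_mining` would search over parameterized TLTL syntax trees and `logic_crl` would run constrained RL with the paper's constraint-redistribution scheme; the loop structure is the only part this sketch is meant to convey.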