PlatoLTL: Learning to Generalize Across Symbols in LTL Instructions for Multi-Task RL

📅 2026-01-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of generalizing policies to unseen linear temporal logic (LTL) propositional symbols in multi-task reinforcement learning. Existing approaches struggle to transfer knowledge when they encounter atomic propositions not observed during training. To overcome this limitation, the authors treat LTL propositions as parameterized predicates rather than discrete symbols and introduce a neural architecture that embeds and composes these predicates into generalizable representations of LTL specifications. This enables, for the first time, zero-shot generalization under LTL guidance to previously unseen propositions, moving beyond prior methods that generalize only across formula structures. Experiments show that the framework substantially improves policy adaptability and generalization across multiple complex environments, both to new propositions and to new tasks specified by LTL formulas.
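This summary does not spell out the architecture, so the following is a minimal PyTorch-style sketch of the general idea only: each proposition is embedded from a predicate type plus its continuous parameters, and a composition network folds these embeddings along the LTL syntax tree. All class names, the operator set, the tree encoding, and the dimensions are illustrative assumptions, not the authors' actual design.

```python
# Hypothetical sketch of parameterized-predicate embedding for LTL tasks.
# Not the paper's code; names, shapes, and operators are assumptions.
import torch
import torch.nn as nn


class PredicateEmbedder(nn.Module):
    """Embeds a proposition as predicate-type embedding + parameter encoding,
    so related propositions (e.g. near(red_ball), near(blue_ball)) share
    structure instead of being unrelated discrete symbols."""

    def __init__(self, num_predicate_types: int, param_dim: int, embed_dim: int):
        super().__init__()
        self.type_embed = nn.Embedding(num_predicate_types, embed_dim)
        self.param_mlp = nn.Sequential(
            nn.Linear(param_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, pred_type: torch.Tensor, params: torch.Tensor) -> torch.Tensor:
        # Fuse the symbolic predicate identity with its continuous parameters.
        return self.type_embed(pred_type) + self.param_mlp(params)


class FormulaComposer(nn.Module):
    """Recursively composes predicate embeddings along an LTL syntax tree.
    Operators get learned embeddings plus a shared composition MLP; a real
    system might instead use a GNN or a Transformer over the parse tree."""

    OPS = {"not": 0, "and": 1, "or": 2, "next": 3,
           "until": 4, "eventually": 5, "always": 6}

    def __init__(self, embed_dim: int):
        super().__init__()
        self.embed_dim = embed_dim
        self.op_embed = nn.Embedding(len(self.OPS), embed_dim)
        self.compose = nn.Sequential(
            nn.Linear(3 * embed_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, node, pred_embedder: PredicateEmbedder) -> torch.Tensor:
        if node[0] == "pred":
            # Leaf: ("pred", type_id, parameter_vector).
            _, pred_type, params = node
            return pred_embedder(pred_type, params)
        # Internal node: (op_name, child) or (op_name, left, right).
        op, *children = node
        child_embs = [self.forward(c, pred_embedder) for c in children]
        if len(child_embs) == 1:
            child_embs.append(torch.zeros(self.embed_dim))  # pad unary ops
        op_emb = self.op_embed(torch.tensor(self.OPS[op]))
        return self.compose(torch.cat([op_emb] + child_embs, dim=-1))


# "eventually near(x)", where x's parameter vector (e.g. object features)
# was never seen during training -- the hypothetical zero-shot case.
embedder = PredicateEmbedder(num_predicate_types=4, param_dim=8, embed_dim=32)
composer = FormulaComposer(embed_dim=32)
novel_formula = ("eventually", ("pred", torch.tensor(0), torch.randn(8)))
task_embedding = composer(novel_formula, embedder)  # shape: (32,)
```

Because the leaf embedding is a function of continuous parameters rather than a lookup in a fixed symbol table, a formula over an unseen proposition still yields a well-defined task embedding, which is what makes parametric generalization possible in this sketch.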

📝 Abstract
A central challenge in multi-task reinforcement learning (RL) is to train generalist policies capable of performing tasks not seen during training. To facilitate such generalization, linear temporal logic (LTL) has recently emerged as a powerful formalism for specifying structured, temporally extended tasks to RL agents. While existing approaches to LTL-guided multi-task RL demonstrate successful generalization across LTL specifications, they are unable to generalize to unseen vocabularies of propositions (or "symbols"), which describe high-level events in LTL. We present PlatoLTL, a novel approach that enables policies to zero-shot generalize not only compositionally across LTL formula structures, but also parametrically across propositions. We achieve this by treating propositions as instances of parameterized predicates rather than discrete symbols, allowing policies to learn shared structure across related propositions. We propose a novel architecture that embeds and composes predicates to represent LTL specifications, and demonstrate successful zero-shot generalization to novel propositions and tasks across challenging environments.
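To make the policy side of the zero-shot claim concrete, here is an equally hypothetical sketch: the agent conditions on the formula embedding alongside its state, so the embedding of a novel task can be swapped in at test time without retraining. Again, all names and dimensions are assumptions, not the paper's code.

```python
# Hypothetical sketch of an LTL-conditioned policy; not the paper's code.
import torch
import torch.nn as nn


class LTLConditionedPolicy(nn.Module):
    """Policy that concatenates the environment state with the task
    (formula) embedding; any standard RL algorithm could train it."""

    def __init__(self, state_dim: int, formula_dim: int, num_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + formula_dim, 64), nn.ReLU(),
            nn.Linear(64, num_actions),
        )

    def forward(self, state: torch.Tensor, formula_emb: torch.Tensor) -> torch.Tensor:
        # Action logits conditioned on both the state and the task.
        return self.net(torch.cat([state, formula_emb], dim=-1))


policy = LTLConditionedPolicy(state_dim=16, formula_dim=32, num_actions=5)
state = torch.randn(16)
unseen_task = torch.randn(32)  # stand-in for an unseen formula's embedding
action = policy(state, unseen_task).argmax().item()
```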
Problem

Research questions and friction points this paper is trying to address.

multi-task reinforcement learning
linear temporal logic
zero-shot generalization
symbolic generalization
proposition vocabulary
Innovation

Methods, ideas, or system contributions that make the work stand out.

zero-shot generalization
parameterized predicates
linear temporal logic
multi-task reinforcement learning
compositional generalization