Zero-Shot Instruction Following in RL via Structured LTL Representations

📅 2025-12-02
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
This work addresses the limited zero-shot instruction-following ability of agents in multi-task reinforcement learning when faced with unseen linear temporal logic (LTL) tasks. The authors propose a structured task-representation approach that translates an LTL specification into sequences of Boolean formulas derived from its finite automaton. A hierarchical neural network encodes the logical and temporal structure of these formulas, and an attention mechanism lets the policy reason over future subgoals. Conditioning the policy on this representation significantly improves zero-shot generalization and task success rates across multiple complex environments, outperforming existing universal-policy methods.

📝 Abstract
We study instruction following in multi-task reinforcement learning, where an agent must zero-shot execute novel tasks not seen during training. In this setting, linear temporal logic (LTL) has recently been adopted as a powerful framework for specifying structured, temporally extended tasks. While existing approaches successfully train generalist policies, they often struggle to effectively capture the rich logical and temporal structure inherent in LTL specifications. In this work, we address these concerns with a novel approach to learn structured task representations that facilitate training and generalisation. Our method conditions the policy on sequences of Boolean formulae constructed from a finite automaton of the task. We propose a hierarchical neural architecture to encode the logical structure of these formulae, and introduce an attention mechanism that enables the policy to reason about future subgoals. Experiments in a variety of complex environments demonstrate the strong generalisation capabilities and superior performance of our approach.
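To make the task representation concrete, here is a minimal sketch (not the authors' code) of the idea described above: an LTL task such as F(a & F b) ("eventually reach a, then eventually reach b") corresponds to a finite automaton whose progress edges carry Boolean guards, and the sequence of those guards is what the policy can be conditioned on. The automaton, its states, and the guard encoding are illustrative assumptions.

```python
# Task automaton for F(a & F b): state -> list of (boolean_guard, next_state).
# Guards are propositional formulas over atomic propositions {a, b}, here
# encoded as frozensets of propositions that must hold to take the edge.
AUTOMATON = {
    "q0": [(frozenset({"a"}), "q1")],   # first subgoal: satisfy a
    "q1": [(frozenset({"b"}), "acc")],  # then satisfy b
    "acc": [],                          # accepting sink
}

def subgoal_sequence(automaton, start="q0", accept="acc"):
    """Walk the (here deterministic) progress edges from the start state
    to the accepting state, collecting each edge's Boolean guard. The
    resulting sequence is the kind of structured input a policy could
    attend over to reason about future subgoals."""
    seq, state = [], start
    while state != accept:
        guard, state = automaton[state][0]  # one progress edge per state
        seq.append(guard)
    return seq

print(subgoal_sequence(AUTOMATON))  # → [frozenset({'a'}), frozenset({'b'})]
```

In the paper's setting the automaton is constructed from the LTL formula automatically (standard LTL-to-automaton translation), and the guards are full Boolean formulas rather than simple conjunctions; this sketch only illustrates the sequence-of-subgoals structure.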
Problem

Research questions and friction points this paper is trying to address.

zero-shot instruction following
reinforcement learning
linear temporal logic
task generalization
structured representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero-Shot Instruction Following
Linear Temporal Logic (LTL)
Structured Task Representation
Hierarchical Neural Architecture
Attention Mechanism