🤖 AI Summary
Reward design in reinforcement learning (RL) is highly sensitive to specification and generalizes poorly across tasks, which severely limits practical deployment. This paper introduces ARM-FM (Automated Reward Machines via Foundation Models), an automated reward-design framework that integrates foundation models (FMs) with reward machines (RMs), finite-state automata that encode reward logic. The method translates natural language task descriptions end-to-end into structured RMs, and attaches a language embedding to each RM state to enable cross-task semantic alignment and zero-shot reward transfer. Unlike manual reward engineering or supervised reward modeling, ARM-FM substantially improves task success rates across multiple challenging sparse-reward environments (+32.7% on average) without task-specific fine-tuning. Its core contributions are: (i) language-driven, automatic synthesis of reward structure; (ii) a semantic-embedding-based mechanism for generalizing across RM states; and (iii) empirical evidence of effectiveness in diverse environments, including zero-shot generalization.
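The state-embedding mechanism the summary describes can be sketched as follows: each RM state carries a natural-language description of its subgoal, and the policy is conditioned on that description's embedding, so semantically similar subgoals in new tasks map to familiar policy inputs. This is an illustrative sketch, not the paper's implementation; `embed`, the dimensions, and the state names are all assumptions, with a toy deterministic encoder standing in for a real sentence encoder.

```python
import numpy as np

def embed(text, dim=8):
    # Stand-in for a real sentence encoder (illustrative only):
    # a deterministic pseudo-embedding derived from the text's bytes.
    rng = np.random.default_rng(sum(text.encode()))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

# Each RM state carries a natural-language description of its subgoal.
state_descriptions = {
    "u0": "pick up the key",
    "u1": "open the door with the key",
}
state_embeddings = {s: embed(d) for s, d in state_descriptions.items()}

def policy_input(env_obs, rm_state):
    # The policy sees the environment observation concatenated with the
    # embedding of the current RM state; a policy trained this way can be
    # reused when a new task's RM state has a similar description.
    return np.concatenate([env_obs, state_embeddings[rm_state]])

x = policy_input(np.zeros(4), "u0")
assert x.shape == (12,)  # 4 obs dims + 8 embedding dims
```

The design choice here is that generalization lives in the embedding space rather than in discrete state identities: two tasks never seen together can still share behavior if their subgoal descriptions embed nearby.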
📝 Abstract
Reinforcement learning (RL) algorithms are highly sensitive to reward function specification, which remains a central challenge limiting their broad applicability. We present ARM-FM: Automated Reward Machines via Foundation Models, a framework for automated, compositional reward design in RL that leverages the high-level reasoning capabilities of foundation models (FMs). Reward machines (RMs) -- an automata-based formalism for reward specification -- serve as the mechanism for RL objective specification and are constructed automatically by FMs. The structured formalism of RMs yields effective task decompositions, while the use of FMs enables objective specification in natural language. Concretely, we (i) use FMs to automatically generate RMs from natural language specifications; (ii) associate language embeddings with each RM automaton state to enable generalization across tasks; and (iii) provide empirical evidence of ARM-FM's effectiveness in a diverse suite of challenging environments, including evidence of zero-shot generalization.
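To make the RM formalism concrete, here is a minimal sketch of a reward machine: a finite-state automaton whose transitions fire on high-level propositions observed in the environment and emit rewards, decomposing a sparse task into subgoals. All names and reward values are illustrative assumptions, not the paper's implementation.

```python
class RewardMachine:
    """Minimal reward machine: states, proposition-triggered transitions,
    and per-transition rewards (illustrative sketch)."""

    def __init__(self, initial_state, transitions, terminal_states):
        # transitions maps (state, proposition) -> (next_state, reward)
        self.state = initial_state
        self.transitions = transitions
        self.terminal_states = terminal_states

    def step(self, true_props):
        """Advance on the set of propositions true at this env step."""
        for p in true_props:
            key = (self.state, p)
            if key in self.transitions:
                self.state, reward = self.transitions[key]
                return reward
        return 0.0  # no matching transition: stay put, no reward

    @property
    def done(self):
        return self.state in self.terminal_states

# Example task: "pick up the key, then open the door"
rm = RewardMachine(
    initial_state="u0",
    transitions={
        ("u0", "got_key"): ("u1", 0.1),      # subgoal reward
        ("u1", "door_open"): ("u_acc", 1.0),  # task completion
    },
    terminal_states={"u_acc"},
)
assert rm.step({"got_key"}) == 0.1   # subgoal reached
assert rm.step({"door_open"}) == 1.0  # task complete
assert rm.done
```

In the framework described above, an FM would generate a structure like this directly from the natural-language task description, rather than a human writing it by hand.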