🤖 AI Summary
To address the sample inefficiency and poor generalization of relational model learning in large-scale, complex environments with long-horizon tasks, this paper proposes a guided exploration framework based on operator initialization and precondition refinement. Unlike blind random exploration or goal-literal babbling (GLIB) baselines, the approach rests on two principles: (i) *operator initialization*, which leverages oracle demonstrations covering the distinct lifted effects needed for planning, and (ii) *precondition-directed guidance*, which selects informative goal-action pairs and executes plans conditioned on the learned preconditions. This oracle-driven, precondition-aware exploration enables efficient, semantically grounded model learning. Evaluated on the Baking-Large benchmark, the method achieves significant improvements in sample efficiency, relational model accuracy, and cross-task generalization.
📝 Abstract
Efficient exploration is critical for learning relational models in large-scale environments with complex, long-horizon tasks. Random exploration methods often collect redundant or irrelevant data, limiting their ability to learn accurate relational models of the environment. Goal-literal babbling (GLIB) improves upon random exploration by setting and planning to novel goals, but its reliance on random actions and random novel goal selection limits its scalability to larger domains. In this work, we identify the principles underlying efficient exploration in relational domains: (1) operator initialization with demonstrations that cover the distinct lifted effects necessary for planning and (2) refining preconditions to collect maximally informative transitions by selecting informative goal-action pairs and executing plans to them. To demonstrate these principles, we introduce Baking-Large, a challenging domain with extensive state-action spaces and long-horizon tasks. We evaluate methods using oracle-driven demonstrations for operator initialization and precondition-targeting guidance to efficiently gather critical transitions. Experiments show that both the oracle demonstrations and precondition-targeting oracle guidance significantly improve sample efficiency and generalization, paving the way for future methods to use these principles to efficiently learn accurate relational models in complex domains.
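The second principle above — refining preconditions by seeking out maximally informative transitions — can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the `Operator` class, the intersection-based refinement, the `most_informative_goal` heuristic, and the baking literals are all hypothetical stand-ins for a relational domain.

```python
# Sketch (assumed, simplified): precondition refinement via informative probes.
# A learned operator keeps a conjunctive precondition guess; each success that
# omits a guessed literal proves that literal unnecessary.

from dataclasses import dataclass, field

@dataclass
class Operator:
    name: str
    precondition: set = field(default_factory=set)  # current conjunctive guess
    successes: list = field(default_factory=list)   # states where effect fired

def refine_precondition(op: Operator) -> set:
    """Shrink the precondition to literals shared by every observed success."""
    if not op.successes:
        return op.precondition
    return set.intersection(*(set(s) for s in op.successes))

def most_informative_goal(op: Operator, candidate_states):
    """Pick a probe state that tests exactly one unconfirmed precondition
    literal: it satisfies all but one literal of the current guess."""
    for state in candidate_states:
        if len(op.precondition - set(state)) == 1:
            return state
    return None

# Toy run: suppose the true precondition of "bake" is {oven_on}, but a single
# demonstration leaves the learner with an over-specific guess.
bake = Operator("bake", precondition={"oven_on", "pan_greased"})
bake.successes.append({"oven_on", "pan_greased"})          # demo transition
probe = most_informative_goal(bake, [{"oven_on"}, {"pan_greased"}])
# Planning to `probe` and executing "bake" there succeeds, so pan_greased
# is revealed to be unnecessary.
bake.successes.append(probe)
bake.precondition = refine_precondition(bake)
print(bake.precondition)  # {'oven_on'}
```

In the paper's setting the probes are goal-action pairs reached by planning with the current learned model, rather than states handed to the learner directly; the sketch only shows why a transition that isolates one uncertain literal is maximally informative for refinement.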