🤖 AI Summary
This work addresses key limitations of existing neurosymbolic transfer methods, which typically rely on handcrafted task automata, support only a single source task, and cannot adapt to shifts in source-task relevance. To overcome these limitations, the authors propose LANTERN, a framework that achieves, for the first time, fully automated multi-source neurosymbolic transfer. LANTERN uses large language models to generate deterministic finite automata from natural language task descriptions, aggregates policies from multiple source tasks with weights derived from semantic-embedding similarity, and introduces an adaptive teacher-student gating mechanism that adjusts knowledge fusion based on temporal-difference error and semantic uncertainty. Experiments show that LANTERN improves sample efficiency by 40–60% over baseline methods across resource management, navigation, and control tasks while remaining robust to poorly aligned source tasks.
📝 Abstract
Transfer learning in reinforcement learning (RL) seeks to accelerate learning in new tasks by leveraging knowledge from related sources. Existing neurosymbolic transfer methods, however, typically rely on manually specified task automata, assume a single source task, and use fixed knowledge-integration mechanisms that cannot adapt to varying source relevance. We propose LANTERN, a unified framework for multi-source neurosymbolic transfer that addresses these limitations through three components: (i) deterministic finite automata generated from natural language task descriptions using large language models, (ii) semantic embedding-based aggregation of multiple source policies weighted by cross-task similarity, and (iii) adaptive teacher-student gating based on temporal-difference error and semantic uncertainty. Across domains spanning resource management, navigation, and control, LANTERN achieves 40-60% improvements in sample efficiency over existing baselines while remaining robust to poorly aligned sources. These results demonstrate that multi-source, adaptively weighted neurosymbolic transfer can improve scalability and robustness in symbolic RL settings.
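
To make components (ii) and (iii) more concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not give formulas, so everything here is an assumption made for illustration: the function names (`aggregate_teachers`, `teacher_gate`, `blend`), the softmax over cosine similarities, the `temperature` and `scale` parameters, and the specific gate form (teacher weight grows with the magnitude of the TD error and shrinks with semantic uncertainty).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def softmax(xs, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    m = max(xs)
    exps = [math.exp((x - m) / temperature) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_teachers(target_emb, source_embs, source_policies, temperature=0.1):
    """Mix source action distributions, weighted by task similarity.

    ASSUMPTION: weights are a softmax over cosine similarity between the
    target task embedding and each source task embedding.
    """
    sims = [cosine(target_emb, e) for e in source_embs]
    weights = softmax(sims, temperature)
    n_actions = len(source_policies[0])
    teacher = [
        sum(w * p[a] for w, p in zip(weights, source_policies))
        for a in range(n_actions)
    ]
    return teacher, weights


def teacher_gate(td_error, uncertainty, scale=1.0):
    """Hypothetical gate in [0, 1]: how much to trust the teacher.

    ASSUMPTION: a large |TD error| suggests the student is still learning,
    so the teacher gets more weight; high semantic uncertainty (clipped to
    [0, 1]) reduces that weight.
    """
    confidence = 1.0 - min(max(uncertainty, 0.0), 1.0)
    return confidence * math.tanh(scale * abs(td_error))


def blend(student, teacher, gate):
    """Convex combination of student and teacher action distributions."""
    return [(1.0 - gate) * s + gate * t for s, t in zip(student, teacher)]


# Toy usage: two source tasks, two actions; the first source matches the target.
teacher, weights = aggregate_teachers(
    target_emb=[1.0, 0.0],
    source_embs=[[1.0, 0.0], [0.0, 1.0]],
    source_policies=[[0.9, 0.1], [0.1, 0.9]],
)
gate = teacher_gate(td_error=2.0, uncertainty=0.2)
policy = blend(student=[0.5, 0.5], teacher=teacher, gate=gate)
```

In this toy setup the poorly aligned second source receives a near-zero weight, which is one simple way the robustness to irrelevant sources described above could arise; the actual mechanism in LANTERN may differ.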