Growing Through Experience: Scaling Episodic Grounding in Language Models

📅 2025-06-02
📈 Citations: 1
Influential: 0
🤖 AI Summary
This paper addresses the inefficient use of episodic experience by large language models (LLMs) in physical planning tasks. Medium-scale models (e.g., 7B) suffer from weak situational grounding, while large-scale models (70–405B), despite superior abstraction capabilities, exhibit a "scale paradox" that impedes effective integration of sequential experience. To bridge this gap, we propose a cross-scale embodied experience transfer framework, introducing the first scalable "weak-to-strong" episodic learning paradigm. Our method combines MCTS-guided experience acquisition with a memory-distillation step that preserves the original model's capabilities, and adds hierarchical knowledge distillation together with layer-wise probing analysis. Experiments demonstrate that our approach outperforms state-of-the-art closed-source LMs by 3.45% across diverse planning and question-answering benchmarks, significantly improves deep-layer representation alignment, and generalizes more stably to complex, unseen scenarios.
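The paper does not spell out the memory-distillation objective here, but one plausible reading is a loss that pulls the large (student) model toward the small model's episodic behavior while anchoring it to its own original predictions. The sketch below is our illustration under that assumption; the function names, the `alpha`/`temperature` hyperparameters, and the two-KL-term form are ours, not the paper's.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert a list of logits into a probability distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(student_logits, teacher_logits, base_logits,
                      alpha=0.5, temperature=2.0):
    """Illustrative weak-to-strong objective (an assumption, not the
    paper's formulation): match the small episodic teacher while
    staying close to the frozen original large model."""
    student = softmax(student_logits, temperature)
    teacher = softmax(teacher_logits, temperature)  # small model with episodic experience
    base = softmax(base_logits, temperature)        # frozen original large model
    episodic_term = kl_divergence(teacher, student)   # absorb episodic behavior
    preserve_term = kl_divergence(base, student)      # preserve prior capabilities
    return alpha * episodic_term + (1 - alpha) * preserve_term
```

Trading off the two KL terms via `alpha` is one simple way to embed episodic memory without capability loss, which is the property the summary emphasizes.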

📝 Abstract
Language models (LMs) require robust episodic grounding, the capacity to learn from and apply past experiences, to excel at physical planning tasks. Current episodic grounding approaches struggle with scalability and integration, limiting their effectiveness, especially for medium-sized LMs (7B parameters). While larger LMs (70–405B parameters) possess superior hierarchical representations and extensive pre-trained knowledge, they encounter a fundamental scale paradox: despite their advanced abstraction capabilities, they lack efficient mechanisms to leverage experience streams. We propose a scalable weak-to-strong episodic learning framework that effectively transfers episodic behaviors from smaller to larger LMs. This framework integrates Monte Carlo tree search for structured experience collection with a novel distillation method, preserving the inherent LM capabilities while embedding episodic memory. Experiments demonstrate that our method surpasses state-of-the-art proprietary LMs by 3.45% across diverse planning and question-answering tasks. Layer-wise probing further indicates significant improvements in task alignment, especially within deeper LM layers, highlighting stable generalization even for previously unseen scenarios with increased planning complexity, conditions under which baseline methods degrade markedly.
Problem

Research questions and friction points this paper is trying to address.

Scaling episodic grounding in medium-sized language models
Addressing scale paradox in large LMs' experience utilization
Enhancing episodic memory integration without capability loss
Innovation

Methods, ideas, or system contributions that make the work stand out.

Scalable weak-to-strong episodic learning framework
Monte Carlo tree search for experience collection
Novel distillation method preserving LM capabilities
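To make the "Monte Carlo tree search for experience collection" idea concrete, here is a minimal UCT planner on a toy number-line task that returns the visit-greedy action sequence as an "episodic trace" one could train on. Everything here (the `mcts_plan` name, the toy task, all hyperparameters) is an illustrative assumption; the paper's actual pipeline operates over embodied planning environments.

```python
import math
import random

class Node:
    """One search-tree node: a state plus UCT statistics."""
    def __init__(self, state, parent=None, action=None):
        self.state, self.parent, self.action = state, parent, action
        self.children = []
        self.visits = 0
        self.value = 0.0

def mcts_plan(start, goal, actions, transition, n_sim=300, horizon=10, c=1.4, seed=0):
    rng = random.Random(seed)
    root = Node(start)
    for _ in range(n_sim):
        node, depth = root, 0
        # 1. Selection: descend by the UCB1 rule while children exist.
        while node.children and depth < horizon:
            node = max(
                node.children,
                key=lambda ch: ch.value / (ch.visits + 1e-9)
                + c * math.sqrt(math.log(node.visits + 1) / (ch.visits + 1e-9)),
            )
            depth += 1
        # 2. Expansion: add one layer of children at a leaf.
        if depth < horizon and not node.children:
            node.children = [Node(transition(node.state, a), node, a) for a in actions]
            node = rng.choice(node.children)
            depth += 1
        # 3. Rollout: random actions until the goal or the horizon.
        state = node.state
        for _ in range(horizon - depth):
            if state == goal:
                break
            state = transition(state, rng.choice(actions))
        reward = 1.0 if state == goal else 0.0
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Extract the visit-greedy action sequence as the "episodic trace".
    trace, node = [], root
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
        trace.append(node.action)
        if node.state == goal:
            break
    return trace

# Toy usage: walk a number line from 0 to 3 with +1/-1 steps.
plan = mcts_plan(0, 3, [1, -1], lambda s, a: s + a)
```

Traces collected this way from a small model could then feed the distillation step described above, which is how we read the weak-to-strong pipeline.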