Beyond Experience Retrieval: Learning to Generate Utility-Optimized Structured Experience for Frozen LLMs

πŸ“… 2026-01-30
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work addresses two problems: static large language models repeatedly commit the same reasoning errors, and existing retrieval-based experience-reuse methods suffer from noise, high latency, and reliance on similarity matching alone. To overcome these issues, the authors propose SEAM, a lightweight, executor-specific plug-in module that replaces conventional retrieval by internalizing experience into learnable parameters. SEAM generates a structured, instance-tailored experience entry in a single forward pass to guide a frozen large language model toward improved reasoning. It is trained for utility with a GRPO objective over executor rollouts and further refined through supervised fine-tuning on logged successful trajectories, enabling post-deployment performance gains without modifying the main model. Experiments demonstrate significant accuracy improvements across multiple frozen executors on mathematical reasoning benchmarks with minimal computational overhead, confirming the method's effectiveness and robustness.
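The generation-then-guidance flow described in the summary can be sketched as a minimal pipeline. Everything below is a hypothetical stand-in for illustration (the `seam_adapter` and `frozen_executor` stubs and the prompt layout are assumptions, not the paper's implementation): the adapter runs one forward pass to emit a structured experience entry, which is prepended to the prompt of an executor whose weights are never updated.

```python
def seam_adapter(question: str) -> str:
    """Stand-in for the lightweight adapter: a single forward pass that
    emits a structured, instance-tailored experience entry."""
    return (
        "[Experience]\n"
        "- Pitfall: dropping a sign when moving terms across the equality.\n"
        "- Strategy: isolate the variable, then verify by substitution."
    )

def frozen_executor(prompt: str) -> str:
    """Stand-in for the frozen LLM executor; its parameters stay fixed."""
    return f"Answer derived from prompt of length {len(prompt)}"

def solve(question: str) -> str:
    # 1. One adapter forward pass generates the experience entry.
    experience = seam_adapter(question)
    # 2. The entry is prepended to the question; only the prompt changes,
    #    never the executor's weights.
    prompt = f"{experience}\n\n[Question]\n{question}"
    return frozen_executor(prompt)

print(solve("Solve 3x - 7 = 2."))
```

Because the experience lives in the adapter's parameters rather than an external store, there is no retrieval index to query at inference time, which is where the latency saving in the summary comes from.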

πŸ“ Abstract
Large language models (LLMs) are largely static and often redo reasoning or repeat mistakes. Prior experience reuse typically relies on external retrieval, which is similarity-based, can introduce noise, and adds latency. We introduce SEAM (Structured Experience Adapter Module), a lightweight, executor-specific plug-in that stores experience in its parameters and generates a structured, instance-tailored experience entry in a single forward pass to guide a frozen LLM executor. SEAM is trained for utility via executor rollouts and GRPO while keeping the executor frozen, and it can be further improved after deployment with supervised fine-tuning on logged successful trajectories. Experiments on mathematical reasoning benchmarks show consistent accuracy gains across executors with low overhead. Extensive ablations and analyses further elucidate the mechanisms underlying SEAM's effectiveness and robustness.
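The utility-driven GRPO training mentioned in the abstract can be illustrated with a toy group-relative advantage computation. This is a simplified sketch under stated assumptions: for one question the adapter samples a group of experience entries, the frozen executor rolls out once per entry, reward is 1 for a correct final answer and 0 otherwise, and each rollout's reward is normalized against its group's statistics. The paper's exact objective, reward, and group size may differ.

```python
def group_relative_advantages(rewards: list[float]) -> list[float]:
    """GRPO-style normalization: score each rollout's reward relative to
    the mean and standard deviation of its own sampling group."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5 if var > 0 else 1.0  # avoid division by zero
    return [(r - mean) / std for r in rewards]

# Toy correctness rewards from 4 executor rollouts with 4 sampled entries.
print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))  # [1.0, -1.0, 1.0, -1.0]
```

Entries that led the executor to a correct answer get positive advantage and are reinforced in the adapter; the executor itself only supplies rollouts and stays frozen throughout.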
Problem

Research questions and friction points this paper is trying to address.

experience reuse
frozen LLMs
reasoning efficiency
noise in retrieval
latency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structured Experience
Frozen LLMs
Experience Generation
Utility Optimization
Adapter Module