🤖 AI Summary
This work addresses a key limitation of self-reflection in language agents: repetitive reflective outputs that hinder reasoning performance. To overcome this, the authors propose ParamAgent, a novel framework built around ParamMem, a parameterized memory module that encodes cross-sample reflection patterns directly into model parameters. By using temperature-controlled sampling to generate diverse reflection signals, ParamAgent effectively fuses parametric, episodic, and cross-sample memory. Notably, the approach achieves high sample efficiency, supports self-improvement without relying on stronger external models, and enables knowledge transfer from weaker to stronger models. Extensive experiments on code generation, mathematical reasoning, and multi-hop question answering demonstrate that ParamAgent significantly outperforms current state-of-the-art methods, confirming its effectiveness and generalization capability.
📝 Abstract
Self-reflection enables language agents to iteratively refine solutions, yet often produces repetitive outputs that limit reasoning performance. Recent studies have attempted to address this limitation through various approaches, among which increasing reflective diversity has shown promise. Our empirical analysis reveals a strong positive correlation between reflective diversity and task success, further motivating the need for diverse reflection signals. We introduce ParamMem, a parametric memory module that encodes cross-sample reflection patterns into model parameters, enabling diverse reflection generation through temperature-controlled sampling. Building on this module, we propose ParamAgent, a reflection-based agent framework that integrates parametric memory with episodic and cross-sample memory. Extensive experiments on code generation, mathematical reasoning, and multi-hop question answering demonstrate consistent improvements over state-of-the-art baselines. Further analysis reveals that ParamMem is sample-efficient, enables weak-to-strong transfer across model scales, and supports self-improvement without reliance on stronger external models, highlighting the potential of ParamMem as an effective component for enhancing language agents.
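The abstract attributes ParamMem's diverse reflection generation to temperature-controlled sampling. The paper's actual sampling procedure is not shown here; as background, the snippet below is a minimal, self-contained sketch of the generic mechanism: dividing next-token logits by a temperature before softmax, so higher temperatures flatten the distribution and yield more varied samples. The toy logits and function name are illustrative assumptions, not from the paper.

```python
import math
import random


def sample_with_temperature(logits, temperature, rng):
    """Sample an index from logits after temperature scaling.

    Lower temperature concentrates probability on the argmax;
    higher temperature flattens the distribution, producing
    more diverse samples (the generic mechanism the abstract
    invokes for diverse reflection signals).
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1


rng = random.Random(0)
logits = [2.0, 1.0, 0.5, 0.1]  # toy next-token scores (hypothetical)

# Low temperature: samples collapse onto the top-scoring token.
low_t = {sample_with_temperature(logits, 0.2, rng) for _ in range(200)}
# High temperature: samples spread across many distinct tokens.
high_t = {sample_with_temperature(logits, 2.0, rng) for _ in range(200)}
print(sorted(low_t), sorted(high_t))
```

Running this shows the high-temperature set covering far more distinct indices than the low-temperature one, illustrating why temperature is a natural knob for diversifying reflection outputs.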