🤖 AI Summary
This work addresses the inefficiency in parallel agent systems caused by redundant computation when multiple teams independently reason over similar sub-problems. To mitigate this waste, the authors propose LTS, a learnable shared-memory mechanism that enables selective reuse of intermediate results across teams through a global memory bank and a lightweight controller. LTS incorporates reinforcement learning-based memory admission control and a usage-aware credit assignment scheme to automatically identify and share high-value information, curbing unbounded context growth while improving system efficiency. Experiments on the AssistantBench and GAIA benchmarks show that LTS significantly reduces runtime compared to memoryless parallel baselines while matching or improving task performance.
📄 Abstract
Agentic systems solve complex tasks by coordinating multiple agents that iteratively reason, invoke tools, and exchange intermediate results. To improve robustness and solution quality, recent approaches deploy multiple agent teams running in parallel to explore diverse reasoning trajectories. However, parallel execution comes at a significant computational cost: when different teams independently reason about similar sub-problems or execute analogous steps, they repeatedly perform substantial overlapping computation. To address this inefficiency, we propose Learning to Share (LTS), a learned shared-memory mechanism for parallel agentic frameworks that enables selective cross-team information reuse while controlling context growth. LTS introduces a global memory bank accessible to all teams and a lightweight controller that decides whether each intermediate agent step should be added to memory. The controller is trained with stepwise reinforcement learning and usage-aware credit assignment, allowing it to identify information that is globally useful across parallel executions. Experiments on the AssistantBench and GAIA benchmarks show that LTS significantly reduces overall runtime while matching or improving task performance compared to memory-free parallel baselines, demonstrating that learned memory admission is an effective strategy for improving the efficiency of parallel agentic systems. Project page: https://joefioresi718.github.io/LTS_webpage/
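The mechanism described in the abstract — a bounded global memory bank plus a lightweight learned gate that decides which intermediate steps to admit — can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the class names (`SharedMemoryBank`, `AdmissionController`), the feature vector, the logistic gate, and the REINFORCE-style update are not the paper's actual implementation, which uses stepwise RL with usage-aware credit assignment over agent trajectories.

```python
import math


class SharedMemoryBank:
    """Hypothetical global memory bank shared across parallel agent teams."""

    def __init__(self, capacity=64):
        self.capacity = capacity
        self.entries = []  # (key, value) summaries of admitted agent steps

    def add(self, key, value):
        # Evict the oldest entry to bound context growth.
        if len(self.entries) >= self.capacity:
            self.entries.pop(0)
        self.entries.append((key, value))

    def lookup(self, key):
        # Naive exact-match retrieval; a real system would use
        # embedding similarity over step summaries.
        for k, v in self.entries:
            if k == key:
                return v
        return None


class AdmissionController:
    """Toy logistic gate scoring whether a step is worth sharing."""

    def __init__(self, n_features=3, lr=0.1):
        self.w = [0.0] * n_features
        self.lr = lr

    def score(self, features):
        z = sum(w * f for w, f in zip(self.w, features))
        return 1.0 / (1.0 + math.exp(-z))  # admission probability

    def decide(self, features):
        return self.score(features) > 0.5

    def update(self, features, admitted, reward):
        # REINFORCE-style update: increase the probability of the
        # taken action when reward is positive. Usage-aware credit
        # assignment would derive the reward from how often other
        # teams later reused the admitted entry.
        p = self.score(features)
        grad = (1.0 - p) if admitted else -p  # d log pi / dz
        for i, f in enumerate(features):
            self.w[i] += self.lr * reward * grad * f
```

With this toy reward signal (positive when admitting a reusable step, negative when admitting a useless one), the gate learns to admit only the former:

```python
ctrl = AdmissionController()
for _ in range(100):
    ctrl.update([1.0, 0.0, 0.0], admitted=True, reward=1.0)   # reused later
    ctrl.update([-1.0, 0.0, 0.0], admitted=True, reward=-1.0)  # context bloat
ctrl.decide([1.0, 0.0, 0.0])   # now True
ctrl.decide([-1.0, 0.0, 0.0])  # now False
```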