🤖 AI Summary
Existing memory-augmented agents passively rely on pre-existing information and struggle to proactively acquire external knowledge under uncertainty, limiting both the accuracy and scalability of their memory. This work proposes an autonomous memory agent that actively and efficiently retrieves and verifies knowledge through a cost-increasing knowledge-extraction cascade, escalating from cheap self- and teacher-generated signals to tool invocation and, only when needed, expert feedback, together with a semantic-aware Thompson sampling strategy that balances exploration and exploitation while mitigating cold-start bias. Evaluated on HotpotQA using Qwen2.5-7B and AIME25 using Gemini-2.5-flash, the method achieves gains of 14.6 and 7.33 points, respectively, substantially outperforming current memory-based baselines and reinforcement-learning-optimized approaches.
📝 Abstract
Recent memory agents improve LLMs by distilling experiences and conversation history into external storage. This enables low-overhead context assembly and online memory updates without expensive LLM training. However, existing solutions remain passive and reactive: memory growth is bounded by whatever information happens to be available, and memory agents seldom seek external input under uncertainty. We propose autonomous memory agents that actively acquire, validate, and curate knowledge at minimal cost. U-Mem materializes this idea via (i) a cost-aware knowledge-extraction cascade that escalates from cheap self/teacher signals to tool-verified research and, only when needed, expert feedback, and (ii) semantic-aware Thompson sampling that balances exploration and exploitation over memories and mitigates cold-start bias. On both verifiable and non-verifiable benchmarks, U-Mem consistently outperforms prior memory baselines and can surpass RL-based optimization, improving HotpotQA (Qwen2.5-7B) by 14.6 points and AIME25 (Gemini-2.5-flash) by 7.33 points.
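The cost-aware cascade can be pictured as trying knowledge sources in order of increasing cost and stopping at the first one that yields a sufficiently verified answer. A minimal sketch, where the stage names, the `(answer, confidence)` interface, and the acceptance threshold are all hypothetical illustrations rather than the paper's actual implementation:

```python
def cascade_extract(query, stages, threshold=0.8):
    """Cost-increasing knowledge-extraction cascade (illustrative sketch).

    `stages` is a list of (name, extract_fn, cost) tuples sorted by ascending
    cost; each extract_fn returns (answer, confidence). We escalate to the
    next, more expensive stage only when the current one is not confident.
    """
    total_cost = 0.0
    for name, extract_fn, cost in stages:
        total_cost += cost
        answer, confidence = extract_fn(query)
        if confidence >= threshold:
            # Accept the cheapest stage that produces a verified answer.
            return {"answer": answer, "stage": name, "cost": total_cost}
    # No stage was confident enough: flag the query for expert feedback.
    return {"answer": None, "stage": "expert_feedback", "cost": total_cost}
```

For example, if a cheap self-generated answer has low confidence, the cascade pays the extra cost of a tool-verified stage instead of returning the guess.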
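The exploration/exploitation component can be illustrated with plain Beta-Bernoulli Thompson sampling over memory entries: each memory keeps a posterior over how often retrieving it helps, a draw from each posterior decides which memory to use, and binary feedback updates the posterior. This is a minimal sketch only; the paper's semantic-aware variant and its cold-start handling are not reproduced here, and all names are hypothetical.

```python
import random

class MemoryBandit:
    """Thompson sampling over memory entries (illustrative sketch)."""

    def __init__(self, memory_ids):
        # One Beta(alpha, beta) posterior per memory; Beta(1, 1) is uniform.
        self.posterior = {m: [1.0, 1.0] for m in memory_ids}

    def select(self):
        # Sample a plausible usefulness for each memory from its posterior
        # and retrieve the memory with the highest draw.
        draws = {m: random.betavariate(a, b)
                 for m, (a, b) in self.posterior.items()}
        return max(draws, key=draws.get)

    def update(self, memory_id, helpful):
        # Binary feedback: did the retrieved memory improve the answer?
        if helpful:
            self.posterior[memory_id][0] += 1.0  # success count
        else:
            self.posterior[memory_id][1] += 1.0  # failure count
```

Because draws come from the posterior rather than a point estimate, rarely used memories with wide posteriors still get sampled occasionally, which is what balances exploration against exploitation.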