MemCoT: Test-Time Scaling through Memory-Driven Chain-of-Thought

📅 2026-04-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenges of hallucination and catastrophic forgetting in large language models when processing fragmented long contexts, which are exacerbated by existing static, single-step retrieval mechanisms that cause semantic dilution and contextual disconnection. To overcome these limitations, the authors propose the MemCoT framework, which formulates long-context reasoning as an iterative, stateful information-seeking process. MemCoT introduces a novel Zoom-In/Zoom-Out multi-perspective long-term memory mechanism and a task-conditioned dual short-term memory system that integrates semantic and episodic components to dynamically guide query decomposition and pruning. This approach enables test-time memory-driven chain-of-thought reasoning, achieving state-of-the-art performance across multiple open- and closed-source models on the LoCoMo and LongMemEval-S benchmarks.
📝 Abstract
Large Language Models (LLMs) still suffer from severe hallucinations and catastrophic forgetting during causal reasoning over massive, fragmented long contexts. Existing memory mechanisms typically treat retrieval as a static, single-step passive matching process, leading to severe semantic dilution and contextual fragmentation. To overcome these fundamental bottlenecks, we propose MemCoT, a test-time memory scaling framework that transforms long-context reasoning into an iterative, stateful information search. MemCoT introduces a multi-view long-term memory perception module that enables Zoom-In evidence localization and Zoom-Out contextual expansion, allowing the model to first identify where relevant evidence resides and then reconstruct the surrounding causal structure necessary for reasoning. In addition, MemCoT employs a task-conditioned dual short-term memory system composed of semantic state memory and episodic trajectory memory; this short-term memory records historical search decisions and dynamically guides query decomposition and pruning across iterations. Empirical evaluations show that MemCoT establishes state-of-the-art performance: empowered by MemCoT, several open- and closed-source models achieve SOTA results on the LoCoMo and LongMemEval-S benchmarks.
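The iterative loop the abstract describes can be pictured roughly as follows. This is a minimal conceptual sketch, not the paper's implementation: the function names (`zoom_in`, `zoom_out`, `memcot_loop`), the toy word-overlap retriever, and the `ShortTermMemory` structure are all illustrative assumptions standing in for the paper's learned components.

```python
# Hypothetical sketch of a memory-driven CoT search loop in the spirit of
# MemCoT. All names and data structures here are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class ShortTermMemory:
    # Semantic state memory: evidence resolved so far, keyed by subquery.
    semantic_state: dict = field(default_factory=dict)
    # Episodic trajectory memory: the history of search decisions.
    episodic_trajectory: list = field(default_factory=list)

def zoom_in(corpus, query):
    """Locate the chunk most relevant to the query (toy word-overlap scorer)."""
    scored = [(len(set(query.lower().split()) & set(c.lower().split())), i)
              for i, c in enumerate(corpus)]
    score, idx = max(scored)
    return idx if score > 0 else None

def zoom_out(corpus, idx, window=1):
    """Expand around the located chunk to recover surrounding context."""
    return corpus[max(0, idx - window): idx + window + 1]

def memcot_loop(corpus, subqueries, max_steps=4):
    """Iterate over decomposed subqueries, pruning ones already resolved."""
    stm = ShortTermMemory()
    for sub in subqueries[:max_steps]:
        if sub in stm.semantic_state:      # pruning guided by semantic state
            continue
        idx = zoom_in(corpus, sub)         # Zoom-In: evidence localization
        stm.episodic_trajectory.append((sub, idx))
        if idx is None:
            continue
        evidence = zoom_out(corpus, idx)   # Zoom-Out: contextual expansion
        stm.semantic_state[sub] = " ".join(evidence)
    return stm

corpus = [
    "Alice moved to Paris in 2019.",
    "She started a bakery the next year.",
    "The bakery specializes in sourdough bread.",
]
stm = memcot_loop(corpus, ["Where did Alice move", "bakery specializes"])
```

In this toy run, each subquery first localizes a chunk, then expands to its neighbors so causally linked sentences travel together; the trajectory memory records which searches were already attempted so repeated subqueries are pruned on later iterations.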
Problem

Research questions and friction points this paper is trying to address.

hallucination
catastrophic forgetting
long-context reasoning
memory mechanism
contextual fragmentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Memory-Driven Chain-of-Thought
Test-Time Scaling
Long-Context Reasoning
Multi-View Memory Perception
Dual Short-Term Memory
Haodong Lei
Southeast University, Nanjing, Jiangsu, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China
Junming Liu
Shanghai Artificial Intelligence Laboratory, Shanghai, China
Yirong Chen
Stanford University
Ding Wang
Shanghai AI Lab
Artificial Intelligence · Agentic System · Digital Twin
Hongsong Wang
Southeast University, Nanjing, Jiangsu, China