A Training-Free Regeneration Paradigm: Contrastive Reflection Memory Guided Self-Verification and Self-Improvement

📅 2026-03-20
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work proposes a training-free regeneration paradigm for the self-improvement of large language models, addressing the long-standing trade-off between efficiency and accuracy in existing approaches: iterative refinement incurs high computational cost and risks entrenching erroneous reasoning, while multi-sampling strategies fail to correct the model's inherent flaws. The proposed approach introduces, for the first time, a Contrastive Reflection Memory that integrates a self-verification mechanism with a from-scratch regeneration strategy, enabling the model to escape erroneous reasoning paths within a single inference pass. Evaluated across nine benchmarks spanning algorithmic, reasoning, symbolic, and domain-specific tasks, the method significantly outperforms state-of-the-art techniques while maintaining low computational overhead.

📝 Abstract
Verification-guided self-improvement has recently emerged as a promising approach to improving the accuracy of large language model (LLM) outputs. However, existing approaches face a trade-off between inference efficiency and accuracy: iterative verification-rectification is computationally expensive and prone to becoming trapped in faulty reasoning, while best-of-N selection requires extensive sampling without addressing internal model flaws. We propose a training-free regeneration paradigm that leverages an offline-curated contrastive Reflection Memory (RM) to provide corrective guidance, while regenerating from scratch helps the model break out of faulty reasoning. At inference time, the method performs RM-guided self-verification followed by a single RM-guided regeneration, avoiding both iterative correction and multi-sample selection. We evaluate our method on nine benchmarks spanning algorithmic, reasoning, symbolic, and domain-specific tasks, using both small- and large-scale LLMs. Experimental results show that our method outperforms prior methods while maintaining low computational cost.
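The inference-time flow the abstract describes (retrieve reflections, verify the draft, then regenerate once from scratch under the same guidance) can be sketched as below. This is a minimal illustrative sketch, not the authors' implementation: the `ReflectionMemory` entries, keyword-overlap retrieval, prompt wording, and the `llm` callable are all hypothetical assumptions.

```python
# Hypothetical sketch of RM-guided self-verification + single from-scratch
# regeneration. `llm` is any callable mapping a prompt string to a string;
# `memory` is an offline-curated list of {"error": ..., "reflection": ...}
# contrastive pairs. Names and prompts are illustrative, not from the paper.

def retrieve(memory, question, k=2):
    """Rank reflection-memory entries by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        memory,
        key=lambda m: -len(q_words & set(m["error"].lower().split())),
    )
    return scored[:k]

def solve(llm, memory, question):
    """One pass: draft -> RM-guided verification -> at most one regeneration."""
    draft = llm(f"Solve: {question}")
    guidance = "\n".join(m["reflection"] for m in retrieve(memory, question))
    verdict = llm(
        f"Check this answer against known pitfalls:\n{guidance}\n"
        f"Answer: {draft}\nVerdict:"
    )
    if verdict.strip().lower().startswith("correct"):
        return draft
    # Regenerate from scratch under the same corrective guidance --
    # no iterative refinement of the faulty draft, no best-of-N sampling.
    return llm(f"Avoid these pitfalls:\n{guidance}\nSolve from scratch: {question}")
```

Regenerating from a clean prompt, rather than patching the flagged draft, is what lets the model leave the faulty reasoning path entirely while keeping the cost at a fixed small number of LLM calls.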
Problem

Research questions and friction points this paper is trying to address.

self-improvement
verification
large language models
inference efficiency
reasoning errors
Innovation

Methods, ideas, or system contributions that make the work stand out.

training-free
contrastive reflection memory
self-verification
self-improvement
regeneration
🔎 Similar Papers
No similar papers found.