ReTreVal: Reasoning Tree with Validation - A Hybrid Framework for Enhanced LLM Multi-Step Reasoning

📅 2026-01-06
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited capacity of large language models (LLMs) to perform structured, multi-step reasoning on complex mathematical and creative writing tasks, particularly their difficulty in exploring alternative reasoning paths and transferring knowledge across problems. To overcome this, we propose ReTreVal, a framework that integrates Tree-of-Thoughts reasoning, self-refinement, LLM-generated critique scoring, and persistent reflexion memory to construct a structured reasoning tree with a dual validation mechanism. The framework supports dynamic pruning via top-k retention and adaptive reasoning depth. Experimental results on 500 mathematical and creative writing tasks show that ReTreVal, built on Qwen 2.5 7B, outperforms ReAct, Reflexion, and Self-Refine, improving both the quality of exploratory reasoning and the efficiency of cross-task knowledge reuse.

📝 Abstract
Multi-step reasoning remains a key challenge for Large Language Models (LLMs), particularly in complex domains such as mathematics and creative writing. While recent approaches including ReAct, Reflexion, and Self-Refine improve reasoning through iterative refinement and reflection, they often lack structured exploration of alternative solution paths and persistent learning across problems. We propose ReTreVal (Reasoning Tree with Validation), a hybrid framework that integrates Tree-of-Thoughts exploration, self-refinement, LLM-based critique scoring, and reflexion memory to enable bounded and validated multi-step reasoning. ReTreVal constructs a structured reasoning tree whose depth adapts to problem complexity, where each node undergoes iterative self-critique and refinement guided by explicit LLM-generated feedback. A dual validation mechanism evaluates reasoning quality, coherence, and correctness at each node, while insights from successful reasoning paths and failure patterns are persistently stored in a reflexion memory buffer, enabling cross-problem learning. Critique-based pruning retains only the top-k highest-scoring nodes at each level, controlling computational cost while preserving high-quality solution paths. We evaluate ReTreVal against ReAct, Reflexion, and Self-Refine on 500 mathematical problems and creative writing tasks, using Qwen 2.5 7B as the underlying LLM. ReTreVal consistently outperforms these baselines through its combination of structured exploration, critique-driven refinement, and cross-problem memory, making it particularly effective for tasks that require exploratory reasoning, rigorous verification, and knowledge transfer.
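The tree construction and critique-based top-k pruning described in the abstract can be sketched as a minimal level-by-level search. This is a hypothetical illustration, not the paper's implementation: `expand` and `critique_score` stand in for LLM calls, and the reflexion memory is reduced to a simple list of scored thoughts.

```python
import heapq

def expand(thought):
    """Stub for LLM thought expansion: returns candidate next reasoning steps."""
    return [f"{thought}->a", f"{thought}->b", f"{thought}->c"]

def critique_score(thought):
    """Stub for LLM-generated critique scoring in [0, 1]."""
    return (hash(thought) % 100) / 100.0

def reasoning_tree_search(root, max_depth=3, top_k=2):
    """Expand a reasoning tree level by level, keeping the top-k nodes per level
    and recording every scored thought in a reflexion-style memory buffer."""
    frontier = [root]
    memory = []  # stand-in for the persistent reflexion memory
    for _ in range(max_depth):
        candidates = [c for t in frontier for c in expand(t)]
        scored = [(critique_score(c), c) for c in candidates]
        # critique-based pruning: retain only the top-k highest-scoring nodes
        frontier = [c for _, c in heapq.nlargest(top_k, scored)]
        memory.extend((c, s) for s, c in scored)
    return frontier, memory

paths, mem = reasoning_tree_search("problem")
```

In the paper's framework the adaptive depth, per-node self-refinement loop, and dual validation would replace the fixed `max_depth` and single scoring pass used here.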
Problem

Research questions and friction points this paper is trying to address.

multi-step reasoning
Large Language Models
structured exploration
cross-problem learning
reasoning validation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reasoning Tree
Self-Critique
Validation Mechanism
Reflexion Memory
Multi-Step Reasoning
Abhishek HS
QpiAI, Bengaluru, India
Pavan C. Shekar
QpiAI, Bengaluru, India
Arpit Jain
GE Global Research, University of Maryland College Park
Ashwanth Krishnan
QpiAI, Bengaluru, India