Pheromone-based Learning of Optimal Reasoning Paths

📅 2025-01-31
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) struggle to efficiently search for optimal intermediate reasoning steps in complex reasoning tasks. Method: This paper proposes Ant Colony Optimization-guided Tree-of-Thought (ACO-ToT), the first framework integrating biologically inspired ant pheromone mechanisms with Hebbian learning principles into LLM-based reasoning path search. It employs multiple expert-fine-tuned LLM "ants" that collaboratively explore the reasoning space, dynamically update path scores, and synergistically leverage Tree-of-Thought (ToT), multi-LLM collaborative fine-tuning, and a hybrid scoring function. Contribution/Results: ACO-ToT achieves significant performance gains over state-of-the-art chain-of-thought optimization methods on the GSM8K, ARC-Challenge, and MATH benchmarks, demonstrating that bio-inspired collective search substantially enhances LLMs' complex reasoning capabilities.

Technology Category

Application Category

πŸ“ Abstract
Large Language Models (LLMs) have demonstrated remarkable reasoning capabilities through chain-of-thought prompting, yet discovering effective reasoning methods for complex problems remains challenging due to the vast space of possible intermediate steps. We introduce Ant Colony Optimization-guided Tree of Thought (ACO-ToT), a novel algorithm that combines ACO with LLMs to discover optimal reasoning paths for complex problems efficiently. Drawing inspiration from Hebbian learning in neurological systems, our method employs a collection of distinctly fine-tuned LLM "ants" to traverse and lay pheromone trails through a centralized tree of thought, with each ant's movement governed by a weighted combination of existing pheromone trails and its own specialized expertise. The algorithm evaluates complete reasoning paths using a mixture-of-experts-based scoring function, with pheromones reinforcing productive reasoning paths across iterations. Experiments on three challenging reasoning tasks (GSM8K, ARC-Challenge, and MATH) demonstrate that ACO-ToT performs significantly better than existing chain-of-thought optimization approaches, suggesting that incorporating biologically inspired collective search mechanisms into LLM inference can substantially enhance reasoning capabilities.
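The search loop the abstract describes can be sketched in miniature: ants pick each next thought by a weighted mix of shared pheromone and their own heuristic, complete paths are scored, and pheromone is deposited and evaporated across iterations. This is a minimal illustrative sketch only, not the paper's implementation; the names (`alpha`, `rho`, `score_path`) and the random per-ant heuristic standing in for an expert LLM's preferences are assumptions.

```python
import random

def choose_child(children, pheromone, heuristic, alpha=0.7):
    # An ant's transition rule: weighted combination of shared pheromone
    # and its own (expert-specific) heuristic preference, sampled
    # proportionally (illustrative parameterization, not the paper's).
    weights = [alpha * pheromone[c] + (1 - alpha) * heuristic[c] for c in children]
    r = random.random() * sum(weights)
    acc = 0.0
    for child, w in zip(children, weights):
        acc += w
        if acc >= r:
            return child
    return children[-1]

def aco_search(tree, root, score_path, n_ants=5, n_iters=10, rho=0.1):
    # tree: dict mapping each node to its list of children (leaves map to []).
    # score_path: stands in for the mixture-of-experts scoring function.
    pheromone = {node: 1.0 for node in tree}
    best_path, best_score = None, float("-inf")
    for _ in range(n_iters):
        for _ in range(n_ants):
            # Each "ant" gets its own heuristic, standing in for a
            # distinctly fine-tuned expert model's preferences.
            heuristic = {node: random.random() for node in tree}
            path, node = [root], root
            while tree[node]:
                node = choose_child(tree[node], pheromone, heuristic)
                path.append(node)
            s = score_path(path)
            if s > best_score:
                best_path, best_score = path, s
            # Reinforce the traversed path in proportion to its quality.
            for visited in path:
                pheromone[visited] += s
        # Evaporation keeps early trails from dominating the search.
        for node in pheromone:
            pheromone[node] *= (1 - rho)
    return best_path, best_score
```

On a toy tree with one high-scoring leaf, repeated iterations concentrate pheromone along the path to that leaf, which is the collective-search effect the paper attributes to its LLM ants.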
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Complex Problem Solving
Performance Enhancement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ant Colony Optimization
Tree-of-Thought Strategy
Large Language Models Optimization
Anirudh Chari
Massachusetts Institute of Technology, Cambridge, MA, USA; a37.ai, San Francisco, CA, USA
Aditya Tiwari
Massachusetts Institute of Technology, Cambridge, MA, USA; a37.ai, San Francisco, CA, USA
Richard Lian
Massachusetts Institute of Technology, Cambridge, MA, USA; a37.ai, San Francisco, CA, USA
Suraj Reddy
Massachusetts Institute of Technology, Cambridge, MA, USA; a37.ai, San Francisco, CA, USA
Brian Zhou
Statistics and Computer Science, Harvard University
Applied Statistics
AI
Economics and Computation
Grand Strategy
Multi-Agent Systems