Pheromone-based Learning of Optimal Reasoning Paths

📅 2025-01-31
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) struggle to efficiently search for optimal intermediate reasoning steps in complex reasoning tasks. Method: This paper proposes Ant Colony Optimization-guided Tree-of-Thought (ACO-ToT), the first framework integrating biologically inspired ant pheromone mechanisms with Hebbian learning principles into LLM-based reasoning path search. It employs multiple expert-fine-tuned LLM "ants" that collaboratively explore the reasoning space, dynamically update path scores, and synergistically leverage Tree-of-Thought (ToT), multi-LLM collaborative fine-tuning, and a hybrid scoring function. Contribution/Results: ACO-ToT achieves significant performance gains over state-of-the-art chain-of-thought optimization methods on the GSM8K, ARC-Challenge, and MATH benchmarks, demonstrating that bio-inspired collective search substantially enhances LLMs' complex reasoning capabilities.

Technology Category

Application Category

πŸ“ Abstract
Large Language Models (LLMs) have demonstrated remarkable reasoning capabilities through chain-of-thought prompting, yet discovering effective reasoning methods for complex problems remains challenging due to the vast space of possible intermediate steps. We introduce Ant Colony Optimization-guided Tree of Thought (ACO-ToT), a novel algorithm that combines ACO with LLMs to discover optimal reasoning paths for complex problems efficiently. Drawing inspiration from Hebbian learning in neurological systems, our method employs a collection of distinctly fine-tuned LLM "ants" to traverse and lay pheromone trails through a centralized tree of thought, with each ant's movement governed by a weighted combination of existing pheromone trails and its own specialized expertise. The algorithm evaluates complete reasoning paths using a mixture-of-experts-based scoring function, with pheromones reinforcing productive reasoning paths across iterations. Experiments on three challenging reasoning tasks (GSM8K, ARC-Challenge, and MATH) demonstrate that ACO-ToT performs significantly better than existing chain-of-thought optimization approaches, suggesting that incorporating biologically inspired collective search mechanisms into LLM inference can substantially enhance reasoning capabilities.
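The search loop the abstract describes can be sketched in miniature: ants pick each next thought by a weighted mix of shared pheromone and their own heuristic, complete paths are scored, and pheromone is deposited and evaporated across iterations. This is a minimal illustrative sketch only, not the paper's implementation; the names (`alpha`, `rho`, `score_path`) and the random per-ant heuristic standing in for an expert LLM's preferences are assumptions.

```python
import random

def choose_child(children, pheromone, heuristic, alpha=0.7):
    # An ant's transition rule: weighted combination of shared pheromone
    # and its own (expert-specific) heuristic preference, sampled
    # proportionally (illustrative parameterization, not the paper's).
    weights = [alpha * pheromone[c] + (1 - alpha) * heuristic[c] for c in children]
    r = random.random() * sum(weights)
    acc = 0.0
    for child, w in zip(children, weights):
        acc += w
        if acc >= r:
            return child
    return children[-1]

def aco_search(tree, root, score_path, n_ants=5, n_iters=10, rho=0.1):
    # tree: dict mapping each node to its list of children (leaves map to []).
    # score_path: stands in for the mixture-of-experts scoring function.
    pheromone = {node: 1.0 for node in tree}
    best_path, best_score = None, float("-inf")
    for _ in range(n_iters):
        for _ in range(n_ants):
            # Each "ant" gets its own heuristic, standing in for a
            # distinctly fine-tuned expert model's preferences.
            heuristic = {node: random.random() for node in tree}
            path, node = [root], root
            while tree[node]:
                node = choose_child(tree[node], pheromone, heuristic)
                path.append(node)
            s = score_path(path)
            if s > best_score:
                best_path, best_score = path, s
            # Reinforce the traversed path in proportion to its quality.
            for visited in path:
                pheromone[visited] += s
        # Evaporation keeps early trails from dominating the search.
        for node in pheromone:
            pheromone[node] *= (1 - rho)
    return best_path, best_score
```

On a toy tree with one high-scoring leaf, repeated iterations concentrate pheromone along the path to that leaf, which is the collective-search effect the paper attributes to its LLM ants.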
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Complex Problem Solving
Performance Enhancement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ant Colony Optimization
Tree-of-Thought Strategy
Large Language Models Optimization
Anirudh Chari
Massachusetts Institute of Technology, Cambridge, MA, USA; a37.ai, San Francisco, CA, USA
Aditya Tiwari
Massachusetts Institute of Technology, Cambridge, MA, USA; a37.ai, San Francisco, CA, USA
Richard Lian
Massachusetts Institute of Technology, Cambridge, MA, USA; a37.ai, San Francisco, CA, USA
Suraj Reddy
Massachusetts Institute of Technology, Cambridge, MA, USA; a37.ai, San Francisco, CA, USA
Brian Zhou
Statistics and Computer Science, Harvard University
Applied Statistics
AI
Economics and Computation
Grand Strategy
Multi-Agent Systems