LLM-First Search: Self-Guided Exploration of the Solution Space

📅 2025-06-05
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing tree search methods—such as Monte Carlo Tree Search (MCTS) and Tree-of-Thought (ToT)—rely on manually tuned exploration hyperparameters or external heuristics, rendering them inflexible to varying task difficulty and often leading to suboptimal efficiency or excessive computational cost. To address this, we propose LLM-First Search (LFS), the first large language model (LLM)-driven self-guided search paradigm. In LFS, the LLM autonomously performs path evaluation, branching decisions, and backtracking control based solely on contextual reasoning—eliminating the need for predefined hyperparameters, manual tuning, or task-specific adaptation. By internalizing tree search principles into the LLM's intrinsic self-assessment and iterative reasoning mechanisms, LFS achieves difficulty-aware, context-sensitive search at inference time. Empirical evaluation on Countdown and Sudoku demonstrates that LFS significantly outperforms ToT-BFS, Best-First Search, and MCTS. Moreover, its performance scales consistently with stronger LLMs and increased compute, while maintaining superior computational efficiency.

📝 Abstract
Large Language Models (LLMs) have demonstrated remarkable improvements in reasoning and planning through increased test-time compute, often by framing problem-solving as a search process. While methods like Monte Carlo Tree Search (MCTS) have proven effective in some domains, their reliance on fixed exploration hyperparameters limits their adaptability across tasks of varying difficulty, rendering them impractical or expensive in certain settings. In this paper, we propose LLM-First Search (LFS), a novel LLM Self-Guided Search method that removes the need for pre-defined search strategies by empowering the LLM to autonomously control the search process via self-guided exploration. Rather than relying on external heuristics or hardcoded policies, the LLM evaluates whether to pursue the current search path or explore alternative branches based on its internal scoring mechanisms. This enables more flexible and context-sensitive reasoning without requiring manual tuning or task-specific adaptation. We evaluate LFS on Countdown and Sudoku against three classic, widely used search algorithms—Tree-of-Thoughts' Breadth First Search (ToT-BFS), Best First Search (BestFS), and MCTS—each of which has been used to achieve SotA results on a range of challenging reasoning tasks. We found that LFS (1) performs better on more challenging tasks without additional tuning, (2) is more computationally efficient compared to the other methods, especially when powered by a stronger model, (3) scales better with stronger models, due to its LLM-First design, and (4) scales better with increased compute budget. Our code is publicly available at https://github.com/NathanHerr/LLM-First-Search.
Problem

Research questions and friction points this paper is trying to address.

How to enable LLMs to autonomously control search processes without predefined strategies or tuned hyperparameters
How to adapt search effort to tasks of varying difficulty without sacrificing computational efficiency
How to outperform traditional search methods (MCTS, ToT-BFS, BestFS) on challenging reasoning tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM autonomously controls search process
Self-guided exploration without fixed heuristics
Internal scoring for flexible path selection
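The core loop described above—the LLM scoring frontier paths and deciding whether to pursue the current branch or backtrack—can be sketched as a best-first search whose priority and continuation decisions are delegated to the model. The sketch below is a toy illustration, not the paper's implementation: `llm_score` and `llm_continue` are hypothetical stand-ins for prompted LLM judgments, here replaced by deterministic heuristics on a toy Countdown-style task (pick numbers summing to a target).

```python
import heapq
from itertools import count

TARGET = 10

def llm_score(state):
    """Stand-in for an LLM call rating a partial solution in (0, 1].
    (Hypothetical; LFS would prompt the model for this judgment.)
    Here: prefer partial sums closer to the target."""
    return 1.0 / (1.0 + abs(TARGET - sum(state)))

def llm_continue(state):
    """Stand-in for the LLM's 'pursue this path or backtrack?' decision.
    Here: keep expanding while the partial sum has not overshot."""
    return sum(state) <= TARGET

def expand(state, numbers):
    """Child states: append one number not yet used on this path."""
    return [state + [n] for n in numbers if n not in state]

def lfs_search(numbers):
    """LLM-guided best-first loop: the model's score orders the frontier,
    and its continue/backtrack judgment gates further expansion."""
    tie = count()  # tie-breaker so heapq never compares list payloads
    frontier = [(-llm_score([]), next(tie), [])]
    while frontier:
        _, _, state = heapq.heappop(frontier)  # most promising path first
        if state and sum(state) == TARGET:
            return state
        if not llm_continue(state):
            continue  # 'backtrack': abandon this path, pop the next best
        for child in expand(state, numbers):
            heapq.heappush(frontier, (-llm_score(child), next(tie), child))
    return None

print(lfs_search([2, 3, 5, 8]))  # → [8, 2]
```

In the actual method, each heuristic call would be an LLM prompt over the search context, so the exploration/exploitation trade-off is made in-context rather than by a fixed hyperparameter such as MCTS's exploration constant.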
Nathan Herr
Centre for AI, University College London
Tim Rocktäschel
Centre for AI, University College London
Roberta Raileanu
Research Scientist at Google DeepMind, Honorary Lecturer at UCL
Artificial Intelligence · Reinforcement Learning · Deep Learning · Open-Ended Learning