🤖 AI Summary
Problem: Large language models (LLMs) exhibit low accuracy and unreliable outputs on complex reasoning tasks due to their opaque, end-to-end inference mechanisms.
Method: This paper introduces the "verbalized algorithms" paradigm, which decomposes tasks into atomic, natural-language-specified operations—e.g., binary comparisons—and embeds LLMs as primitive oracle operators within classical algorithmic frameworks (e.g., bitonic sorting networks, hierarchical clustering). By restricting LLMs to simple, verifiable subroutines, the approach confines black-box reasoning to small, checkable steps while preserving the structural guarantees of the host algorithm.
Contribution/Results: The paradigm significantly enhances process interpretability and output stability. Experiments on sorting and clustering tasks demonstrate substantial improvements in result consistency and accuracy over standard LLM-based methods, validating the efficacy and practicality of algorithm-structured guidance for LLM reasoning.
📝 Abstract
Instead of querying LLMs in a one-shot manner and hoping to get the right answer for a reasoning task, we propose a paradigm we call *verbalized algorithms* (VAs), which leverage classical algorithms with established theoretical understanding. VAs decompose a task into simple elementary operations on natural language strings that LLMs should be able to answer reliably, and limit the scope of LLMs to only those simple tasks. For example, to sort a series of natural language strings, *verbalized sorting* uses an LLM as a binary comparison oracle inside a known and well-analyzed sorting algorithm (e.g., a bitonic sorting network). We demonstrate the effectiveness of this approach on sorting and clustering tasks.
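The verbalized sorting idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: a standard iterative bitonic sorting network whose only interaction with the data is through a binary `less` oracle. The `oracle_less` function here is a hypothetical deterministic stand-in (comparing by string length, then alphabetically); in an actual VA it would be a prompted LLM call such as "Is X smaller than Y under criterion C?".

```python
def oracle_less(x: str, y: str) -> bool:
    # Hypothetical stand-in for an LLM binary comparison oracle.
    # A real verbalized algorithm would prompt the model here;
    # we compare by length (ties broken alphabetically) instead.
    return (len(x), x) < (len(y), y)

def bitonic_sort(items, less):
    """Sort a power-of-two-length list with a bitonic sorting network,
    touching the data only through binary `less` comparisons."""
    a = list(items)
    n = len(a)
    assert n and n & (n - 1) == 0, "bitonic network needs 2^m items"
    k = 2
    while k <= n:                      # size of bitonic subsequences
        j = k // 2
        while j >= 1:                  # compare-exchange distance
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    # one oracle call per compare-exchange
                    swap = (less(a[partner], a[i]) if ascending
                            else less(a[i], a[partner]))
                    if swap:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a

words = ["ox", "cat", "zebra", "elephant",
         "ant", "bee", "dog", "rhinoceros"]
print(bitonic_sort(words, oracle_less))
```

Because every compare-exchange in the network is independent within a stage, the oracle calls in the inner loop can be batched into a single parallel round of LLM queries, which is one practical appeal of using a fixed sorting network rather than a data-dependent algorithm like quicksort.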