🤖 AI Summary
Problem: The design of LLM-based algorithms currently relies heavily on empirical trial and error. It lacks a formal theoretical foundation for systematically analyzing how critical design choices, such as task decomposition strategies and prompt engineering, affect accuracy and computational efficiency.
Method: We propose the first formal analytical framework for LLM-based algorithms. It represents an algorithm and its LLM subroutines as a computational graph, establishes structured principles for task decomposition, and introduces an error propagation model that enables provable analysis of both accuracy and computational complexity.
Contribution/Results: Validated empirically across parallel, hierarchical, and recursive task decomposition paradigms, our framework explains observed phenomena, guides prompt design and granularity selection, predicts performance bottlenecks, and inspires novel robust algorithm designs. The implementation is publicly available.
📝 Abstract
We initiate a formal investigation into the design and analysis of LLM-based algorithms, i.e., algorithms that contain one or more calls to large language models (LLMs) as subroutines and critically rely on the capabilities of LLMs. While LLM-based algorithms, ranging from basic LLM calls with prompt engineering to complicated LLM-powered agent systems and compound AI systems, have achieved remarkable empirical success, their design and optimization have mostly relied on heuristics and trial and error, largely because of a lack of formal and analytical study of these algorithms. To fill this gap, we start by identifying the computational-graph representation of LLM-based algorithms, the design principle of task decomposition, and some key abstractions, which then facilitate our formal analysis of the accuracy and efficiency of LLM-based algorithms, despite the black-box nature of LLMs. Through extensive analytical and empirical investigation in a series of case studies, we demonstrate that the proposed framework is broadly applicable to a wide range of scenarios and diverse patterns of LLM-based algorithms, such as parallel, hierarchical, and recursive task decomposition. Our proposed framework holds promise for advancing LLM-based algorithms by revealing the reasons behind curious empirical phenomena, guiding the choice of hyperparameters, predicting the empirical performance of algorithms, and inspiring new algorithm design. To promote further study of LLM-based algorithms, we release our source code at https://github.com/modelscope/agentscope/tree/main/examples/paper_llm_based_algorithm.
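As a rough illustration of the kind of question such an analysis addresses, the sketch below models parallel task decomposition: an input of size `n` is split into chunks of at most `m`, and each chunk is handled by one LLM call. Everything here is an assumption made for illustration and is not taken from the paper or its released code: the function names, the independent per-call error probability `p_err`, and the user-supplied `cost_per_call` function are all hypothetical.

```python
import math
from typing import Callable


def num_subtasks(n: int, m: int) -> int:
    """Number of LLM calls when a size-n input is split into chunks of at most m."""
    return math.ceil(n / m)


def success_probability(n: int, m: int, p_err: float) -> float:
    """Probability that every sub-task is solved correctly.

    Assumption (illustrative only): each call fails independently with
    probability p_err, and one failure makes the final answer wrong.
    """
    return (1.0 - p_err) ** num_subtasks(n, m)


def sequential_cost(n: int, m: int, cost_per_call: Callable[[int], float]) -> float:
    """Total cost when all calls run one after another."""
    return num_subtasks(n, m) * cost_per_call(m)


def parallel_latency(
    n: int, m: int, cost_per_call: Callable[[int], float], workers: int
) -> float:
    """End-to-end latency when up to `workers` calls run concurrently."""
    rounds = math.ceil(num_subtasks(n, m) / workers)
    return rounds * cost_per_call(m)


if __name__ == "__main__":
    # Hypothetical cost model: fixed overhead of 10 plus 1 unit per input item.
    cost = lambda m: 10.0 + m
    for m in (50, 100, 500):
        print(
            m,
            num_subtasks(1000, m),
            round(success_probability(1000, m, 0.01), 4),
            sequential_cost(1000, m, cost),
            parallel_latency(1000, m, cost, workers=4),
        )
```

Under these simplifying assumptions, a smaller chunk size `m` lowers the cost of each call but increases the number of calls, which in turn raises the chance that at least one sub-task fails. A real analysis would replace the fixed `p_err` with an error model that depends on `m`, since per-call accuracy typically changes with sub-task size.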