🤖 AI Summary
This paper investigates the theoretical mechanisms underlying collaborative reasoning between pretrained priors and external information (e.g., retrieval-augmented generation, tool calling) during test-time enhancement of large language models. We propose a knowledge-graph-based modeling framework that formalizes multi-step reasoning as a source-to-target connectivity problem, and introduce sublinear graph algorithms to characterize the relationship between prior knowledge density and oracle query efficiency. We establish, for the first time, a phase-transition phenomenon in knowledge graphs: when prior knowledge density exceeds a critical threshold—inducing a giant connected component—the expected number of queries required for successful reasoning becomes constant; below this threshold, the query complexity lower bound is Ω(√n). This result quantifies the minimal pretrained knowledge volume necessary for efficient test-time reasoning and provides a verifiable theoretical foundation for designing knowledge-aware augmentation strategies.
📝 Abstract
Test-time augmentation, such as Retrieval-Augmented Generation (RAG) or tool use, critically depends on an interplay between a model's parametric knowledge and externally retrieved information. However, the theoretical underpinnings of this relationship remain poorly understood. Specifically, it is not clear how much pre-training knowledge is required to answer queries with a small number of augmentation steps, a desirable property in practice. To address this question, we formulate multi-step reasoning as an $s$-$t$ connectivity problem on a knowledge graph. We represent a model's pre-trained parametric knowledge as a partial, potentially noisy subgraph, and we view augmentation as querying an oracle for true edges that extend this subgraph. We then characterize the number of augmentation steps that is necessary and sufficient for the model to generate an accurate answer given partial prior knowledge. One key result shows a phase transition: if the prior knowledge graph over $n$ vertices is fragmented into small components, then finding a path via augmentation is inefficient and requires $\Omega(\sqrt{n})$ queries. Conversely, once the density of correct knowledge surpasses a threshold, forming a giant component, paths can be found with an expected constant number of queries.
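The giant-component threshold underlying this phase transition can be illustrated with the classic Erdős–Rényi model, which is a simplification of the paper's setting (the exact knowledge-graph model may differ): below mean degree 1 the largest component covers a vanishing fraction of the $n$ vertices, while above it a giant component spans a constant fraction. A minimal sketch using union-find, with the function name and parameters chosen for illustration:

```python
import random

def largest_component_fraction(n, p, seed=0):
    """Sample an Erdos-Renyi graph G(n, p) and return the fraction of
    vertices in its largest connected component, via union-find."""
    rng = random.Random(seed)
    parent = list(range(n))

    def find(x):
        # Find the root of x with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Flip a p-biased coin for each potential edge and merge components.
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                parent[find(u)] = find(v)

    # Tally component sizes by root.
    sizes = {}
    for x in range(n):
        r = find(x)
        sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values()) / n

n = 2000
# Mean degree c = 0.5 < 1: all components stay small (O(log n) vertices),
# so a path between two random vertices is unlikely to exist in the prior.
sub = largest_component_fraction(n, 0.5 / n)
# Mean degree c = 2 > 1: a giant component covers a constant fraction,
# so two random vertices land in it (and are connected) with constant probability.
sup = largest_component_fraction(n, 2.0 / n)
print(f"largest component fraction: c=0.5 -> {sub:.3f}, c=2.0 -> {sup:.3f}")
```

In the paper's terms, the sub-threshold regime corresponds to the $\Omega(\sqrt{n})$ query lower bound, while the giant component is what makes a constant expected number of oracle queries sufficient.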