🤖 AI Summary
This paper theoretically compares two paradigms for adapting large language models (LLMs) to new tasks, supervised fine-tuning (SFT) and Best-of-N (BoN) selection, in the context of bit-string generation. It establishes the first rigorous convergence-rate analysis under both realizable and unrealizable settings, characterizing fundamental trade-offs between response length and sample size. Under realizability, SFT converges faster, with milder dependence on the response length. Under unrealizability, BoN can achieve either a superior convergence rate or greater robustness to increasing response length, depending on the failure mode. The analysis integrates probabilistic modeling, reward modeling, and sequential decision-making theory, yielding a theoretically grounded framework for comparing the convergence properties of LLM adaptation methods.
📝 Abstract
Using the bit-string generation problem as a case study, we theoretically compare two standard methods for adapting large language models to new tasks. The first, supervised fine-tuning (SFT), trains a new next-token predictor on good generations. The second, Best-of-N (BoN), trains a reward model to select good responses from a collection generated by an unaltered base model. If the learning setting is realizable, we find that SFT outperforms BoN through a better dependence on the response length in its rate of convergence. If realizability fails, then depending on the failure mode, BoN can enjoy either a better rate of convergence in the sample size n or a rate of convergence with better dependence on the response length.
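The Best-of-N procedure described above can be illustrated with a minimal sketch. The `generate` and `reward` functions here are toy stand-ins (a uniform sampler over bit strings and a bit-counting score), not the paper's actual base model or learned reward model:

```python
import random

def best_of_n(generate, reward, n):
    """Best-of-N selection: draw n candidate responses from the
    (unaltered) base model and return the one the reward model
    scores highest."""
    candidates = [generate() for _ in range(n)]
    return max(candidates, key=reward)

# Toy instantiation for the bit-string setting (illustrative only):
# the "base model" samples uniform bit strings of length L, and the
# "reward model" simply counts the number of 1-bits.
L = 8
generate = lambda: tuple(random.randint(0, 1) for _ in range(L))
reward = lambda s: sum(s)

best = best_of_n(generate, reward, n=16)
```

As n grows, the selected string concentrates on high-reward responses, which is the mechanism whose convergence rate (in n and in the response length) the paper analyzes.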