🤖 AI Summary
This work addresses a limitation of current large language models in code self-correction: they rely solely on stateless trial-and-error and fail to leverage historical experience effectively. To overcome this, we propose TextBFGS, a novel framework that, for the first time, adapts quasi-Newton optimization principles—specifically, Hessian-like semantic curvature modeling—to textual program repair. TextBFGS constructs a dynamic case repository of "error–operator" pairs and performs case-based reasoning over abstract correction patterns. By integrating error-aware retrieval, operator reuse, and continuous case-base updating, our method substantially improves optimization efficiency. Empirical results on the HumanEval and MBPP benchmarks demonstrate that TextBFGS achieves higher pass rates with fewer model invocations than stateless baselines.
📝 Abstract
Optimizing discrete executable text such as prompts and code has recently been framed as a gradient-based process, effectively translating backpropagation concepts to the semantic space. However, existing methods predominantly operate as first-order optimizers akin to Stochastic Gradient Descent, which suffer from slow convergence and instability because they neglect the semantic curvature of the optimization landscape. To bridge this gap, we introduce TextBFGS, a second-order framework that implements a quasi-Newton optimization method for discrete text. Unlike traditional memory-based approaches that retrieve similar textual instances, TextBFGS approximates the inverse Hessian matrix by retrieving Gradient-Operators from a memory of previously learned successful trajectories. Specifically, given textual gradient feedback, TextBFGS identifies historical correction patterns in the optimization knowledge base and applies these abstract operators to the current variable. This mechanism enables a One-Pass Update, combining feedback generation and second-order correction into a single inference step. Empirical evaluations on code optimization across diverse domains (e.g., HumanEval, MBPP) demonstrate that TextBFGS significantly outperforms first-order baselines: it achieves superior pass rates with fewer model calls and exhibits strong cross-task transferability, thus establishing a mathematically grounded paradigm for efficient, memory-aware text optimization.
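The retrieve-and-apply loop described in the abstract can be sketched in miniature. This is a hypothetical illustration, not the paper's implementation: the class and function names (`CaseBase`, `one_pass_update`), the exact-match retrieval, and the string-rewrite operators are all assumptions standing in for the paper's semantic retrieval over textual gradients and LLM-generated repairs.

```python
# Hypothetical sketch of a TextBFGS-style loop. A case base maps error
# signatures to abstract correction operators; given feedback, an operator
# is retrieved and applied in a single pass instead of stateless retries.
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

Operator = Callable[[str], str]  # an abstract correction pattern

@dataclass
class CaseBase:
    cases: Dict[str, Operator] = field(default_factory=dict)

    def retrieve(self, error_signature: str) -> Optional[Operator]:
        # Error-aware retrieval. Exact match here for simplicity; the
        # paper's version would match by semantic similarity.
        return self.cases.get(error_signature)

    def update(self, error_signature: str, operator: Operator) -> None:
        # Continuous case-base updating from a successful trajectory.
        self.cases[error_signature] = operator

def one_pass_update(code: str, error_signature: str,
                    case_base: CaseBase, fallback: Operator) -> str:
    """Apply a retrieved operator if one exists (memory-aware, second-order
    style correction); otherwise fall back to a stateless repair (e.g. a
    fresh LLM call) and memorize it for reuse."""
    op = case_base.retrieve(error_signature)
    if op is not None:
        return op(code)
    repaired = fallback(code)
    case_base.update(error_signature, fallback)
    return repaired

# Toy usage: an operator encoding an "off-by-one in range()" correction.
cb = CaseBase()
cb.update("off_by_one_range",
          lambda c: c.replace("range(n)", "range(n + 1)"))
fixed = one_pass_update("for i in range(n):", "off_by_one_range", cb,
                        fallback=lambda c: c)
# fixed == "for i in range(n + 1):"
```

The point of the sketch is the control flow: once a correction pattern has been stored, later repairs with the same error signature skip trial-and-error entirely, which is how the method reduces model invocations.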