🤖 AI Summary
Program learning faces a fundamental trade-off between sample efficiency and computational feasibility: exhaustive search is computationally prohibitive, while gradient-based methods (e.g., SGD-trained Transformers) require massive datasets and suffer from severe overfitting.
Method: This paper proposes LLM-ERM, an LLM-guided propose-and-verify framework. It uses a pretrained, reasoning-augmented large language model to generate concise candidate programs; the best hypothesis is selected by compiling, executing, and validating each candidate on a held-out set, entirely without gradients or feedback signals, thereby avoiding both exponential enumeration and the pitfalls of gradient-based optimization.
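To make the pipeline concrete, here is a minimal Python sketch of such a propose-and-verify loop; the prompt format, the `query_llm` callable, the number of draws `k`, and the helper names are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of an LLM-ERM-style propose-and-verify loop; the prompt,
# the `query_llm` callable, and k are illustrative assumptions, not the paper's code.

def try_compile(source: str):
    """Compile a candidate program string; return its `predict` function or None."""
    namespace = {}
    try:
        exec(source, namespace)              # candidate is expected to define predict(x)
        fn = namespace.get("predict")
        return fn if callable(fn) else None
    except Exception:
        return None

def holdout_accuracy(fn, dataset) -> float:
    """Fraction of held-out examples the candidate gets right; runtime errors count as misses."""
    correct = 0
    for x, y in dataset:
        try:
            correct += int(fn(x) == y)
        except Exception:
            pass
    return correct / len(dataset)

def llm_erm(train_set, holdout_set, query_llm, k: int = 32):
    """Draw k candidates from the LLM, verify each, return the best one (no gradients, no feedback)."""
    examples = "\n".join(f"f({x!r}) == {y!r}" for x, y in train_set)
    prompt = "Write a Python function predict(x) consistent with these examples:\n" + examples
    best_fn, best_acc = None, -1.0
    for _ in range(k):
        fn = try_compile(query_llm(prompt))  # each call draws one candidate program
        if fn is None:
            continue                         # discard candidates that fail to compile
        acc = holdout_accuracy(fn, holdout_set)
        if acc > best_acc:
            best_fn, best_acc = fn, acc
    return best_fn, best_acc
```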
Results: On tasks including parity variants, pattern matching, and primality testing, the method achieves high accuracy using only 200 samples. In contrast, SGD-trained Transformers exhibit significant overfitting even with 100,000 samples. This approach substantially improves the practicality and scalability of few-shot program learning.
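For a sense of what a 200-sample budget looks like, here is a hypothetical data-generation sketch for one parity-style task; the exact task definition, dimensions, and train/held-out split are assumptions for illustration, not the paper's benchmark specification.

```python
# Illustrative data generation for a parity-style task; the task parameters and
# split here are assumptions for demonstration, not the paper's benchmark spec.
import random

def make_sparse_parity(n_bits=30, support_size=5, n_samples=200, seed=0):
    """Label each random bit-string by the parity of a hidden subset of coordinates."""
    rng = random.Random(seed)
    support = rng.sample(range(n_bits), support_size)    # hidden coordinates
    data = []
    for _ in range(n_samples):
        x = tuple(rng.randint(0, 1) for _ in range(n_bits))
        y = sum(x[i] for i in support) % 2
        data.append((x, y))
    split = int(0.75 * n_samples)
    return data[:split], data[split:]                     # train / held-out splits

train_set, holdout_set = make_sparse_parity()
```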
📝 Abstract
We seek algorithms for program learning that are both sample-efficient and computationally feasible. Classical results show that targets admitting short program descriptions (e.g., with short "Python code") can be learned with a "small" number of examples (scaling with the size of the code) via length-first program enumeration, but the search is exponential in description length. Gradient-based training avoids this cost yet can require exponentially many samples on certain short-program families.
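To make the classical baseline concrete, below is a toy length-first enumeration over a tiny, assumed expression grammar; the number of candidates grows as |TOKENS|^length, which is the exponential cost in description length referred to above.

```python
# Toy length-first program enumeration (Occam-style ERM); the tiny expression grammar
# is an assumption, used only to illustrate why the search is exponential in length.
from itertools import product

TOKENS = ["x[0]", "x[1]", "x[2]", "^", "&", "|", "1", "0"]  # assumed toy instruction set

def enumerate_consistent(dataset, max_len=5):
    """Return the shortest token sequence that is consistent with all examples."""
    for length in range(1, max_len + 1):                 # shortest programs first
        for prog in product(TOKENS, repeat=length):      # |TOKENS|**length candidates
            source = " ".join(prog)
            try:
                if all(eval(source, {"x": x}) == y for x, y in dataset):
                    return source
            except Exception:
                continue                                  # skip ill-formed candidates
    return None
```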
To address this gap, we introduce LLM-ERM, a propose-and-verify framework that replaces exhaustive enumeration with an LLM-guided search over candidate programs while retaining ERM-style selection on held-out data. Specifically, we draw $k$ candidates with a pretrained reasoning-augmented LLM, compile and check each on the data, and return the best verified hypothesis, with no feedback, adaptivity, or gradients. Theoretically, we show that coordinate-wise online mini-batch SGD requires many samples to learn certain short programs. Empirically, LLM-ERM solves tasks such as parity variants, pattern matching, and primality testing with as few as 200 samples, while SGD-trained transformers overfit even with 100,000 samples. These results indicate that language-guided program synthesis recovers much of the statistical efficiency of finite-class ERM while remaining computationally tractable, offering a practical route to learning succinct hypotheses beyond the reach of gradient-based training.
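Tying the sketches above together, a hedged end-to-end usage in which a stub stands in for the reasoning LLM; a real run would replace `stub_llm` with an actual model call, and the candidate it returns is a made-up guess, not a result from the paper.

```python
# End-to-end usage of the sketches above; the stub stands in for a real reasoning LLM
# and always proposes the same (hypothetical) candidate program.
def stub_llm(prompt: str) -> str:
    return "def predict(x):\n    return sum(x) % 2\n"   # a guess: parity of all bits

train_set, holdout_set = make_sparse_parity()
best_fn, best_acc = llm_erm(train_set, holdout_set, stub_llm, k=4)
print(f"held-out accuracy of best verified candidate: {best_acc:.2f}")
```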