🤖 AI Summary
Existing program synthesis methods exhibit limited robustness under few-shot training and on out-of-distribution (OOD) test inputs. To address this, we propose a test-time transductive program synthesis framework that reformulates synthesis as active learning over a finite hypothesis space. Our approach initializes candidate hypotheses via large language models, selects maximally informative test inputs as queries using a greedy maximin algorithm, and prunes candidate programs whose outputs are inconsistent with the LLM-predicted outputs, enabling efficient search. This design significantly improves synthesis robustness and generalization to edge cases. Empirical evaluation on the Playgol and MBPP+ benchmarks demonstrates superior accuracy and query efficiency compared to state-of-the-art methods.
📝 Abstract
We introduce transductive program synthesis, a new formulation of the program synthesis task that explicitly leverages test inputs during synthesis. While prior approaches to program synthesis, whether based on natural language descriptions or input-output examples, typically aim to generalize from training examples, they often struggle with robustness, especially in real-world settings where training examples are limited and test inputs involve various edge cases. To address this, we propose a novel framework that improves robustness by treating synthesis as active learning over a finite hypothesis class defined by programs' outputs. We use an LLM to predict outputs for selected test inputs and eliminate inconsistent hypotheses, where the inputs are chosen via a greedy maximin algorithm to minimize the number of LLM queries required. We evaluate our approach on two real-world datasets: Playgol, a string transformation benchmark, and MBPP+, a Python code generation benchmark. We demonstrate that our method significantly improves program synthesis in both accuracy and efficiency. We release our code at https://github.com/klee972/SYNTRA.
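To make the active-learning loop concrete, here is a minimal Python sketch of the idea under our own assumptions: `hypotheses` are callable candidate programs, `llm_predict_output` is a hypothetical stand-in for querying the LLM for an output, `run_safely` is a helper we introduce for illustration, and `budget` caps the number of queries. This is a sketch of the greedy maximin principle, not the authors' released implementation (see the repository linked above for that).

```python
from collections import Counter


def run_safely(program, x):
    """Execute a candidate program, mapping crashes to a sentinel output.
    Outputs are assumed hashable so they can be grouped and compared."""
    try:
        return program(x)
    except Exception:
        return "<error>"


def select_query(hypotheses, test_inputs):
    """Greedy maximin selection: pick the test input whose worst-case
    answer still eliminates the most hypotheses, i.e. the input whose
    largest output-equivalence class is smallest."""
    best_input, best_worst_case = None, -1
    for x in test_inputs:
        # Group surviving hypotheses by the output they produce on x.
        counts = Counter(run_safely(h, x) for h in hypotheses)
        # Worst case: the true output matches the largest group, so only
        # hypotheses outside that group get eliminated.
        worst_case_eliminated = len(hypotheses) - max(counts.values())
        if worst_case_eliminated > best_worst_case:
            best_input, best_worst_case = x, worst_case_eliminated
    return best_input


def transductive_synthesis(hypotheses, test_inputs, llm_predict_output, budget):
    """Repeatedly query the LLM on maximally informative test inputs and
    prune hypotheses whose outputs disagree with the predicted output."""
    remaining = list(test_inputs)
    for _ in range(budget):
        if len(hypotheses) <= 1 or not remaining:
            break
        x = select_query(hypotheses, remaining)
        remaining.remove(x)
        y = llm_predict_output(x)  # assumed LLM oracle for the output on x
        hypotheses = [h for h in hypotheses if run_safely(h, x) == y]
    return hypotheses
```

The maximin choice guarantees that even in the worst case each query eliminates as many hypotheses as possible, which is what keeps the number of LLM queries small.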