🤖 AI Summary
This work addresses the problem of intent ambiguity in user-provided input-output examples for program synthesis. It proposes PraX, the first framework to combine listener-speaker self-play with computational pragmatic inference (specifically, a neurally amortized variant of the Rational Speech Act (RSA) model) to generate counterfactual pragmatic examples. PraX requires no human annotations: it automatically constructs highly informative training examples via pragmatically driven generation, substantially improving the zero-shot intent-inference ability of neural program synthesizers. Evaluated on regular-expression synthesis, PraX achieves a 51% relative improvement (23% absolute) over strong baselines and matches the performance of models trained on fully supervised, human-annotated pragmatic data. These results validate the effectiveness and scalability of jointly leveraging pragmatic modeling and self-play to generate training examples for program synthesis.
📝 Abstract
Programming-by-example is the task of synthesizing a program that is consistent with a set of user-provided input-output examples. Because examples often under-specify the user's intent, a good synthesizer must choose the intended program from the many that are consistent with the given examples. Prior work frames program synthesis as a cooperative game between a listener (which synthesizes programs) and a speaker (a user choosing examples), and shows that models of computational pragmatic inference are effective at choosing the user-intended programs. However, these models require counterfactual reasoning over large sets of programs and examples, which is infeasible in realistic program spaces. In this paper, we propose PraX, a novel way to amortize this search with neural networks. We sample pairs of programs and examples via self-play between listener and speaker models, and use pragmatic inference to select informative training examples from this sample. We then train models on the resulting informative dataset to improve the synthesizer's ability to disambiguate user-provided examples, without human supervision. We validate PraX on the challenging task of synthesizing regular expressions from example strings, and find that our method (1) outperforms models trained without pragmatically chosen examples by 23% absolute (a 51% relative increase), and (2) matches the performance of supervised learning on a dataset of human-provided pragmatic examples, despite using no human data in training.
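The pragmatic inference the abstract refers to can be made concrete with a minimal sketch of exact Rational Speech Act (RSA) reasoning on a toy program/example space. This is illustrative only: the toy consistency matrix and all names here are hypothetical, and the point of PraX is precisely to *amortize* this enumeration with neural listener and speaker models, since it is infeasible over realistic program spaces.

```python
# Toy RSA inference: consistency[p][e] = 1.0 if program p is consistent
# with example e. The matrix below is an invented illustration.
consistency = [
    [1.0, 1.0, 0.0],  # program 0
    [1.0, 0.0, 1.0],  # program 1
    [1.0, 1.0, 1.0],  # program 2
]

def norm_cols(m):
    # Normalize each column to a distribution over programs.
    out = [[0.0] * len(m[0]) for _ in m]
    for e in range(len(m[0])):
        s = sum(row[e] for row in m)
        for p in range(len(m)):
            out[p][e] = m[p][e] / s if s else 0.0
    return out

def norm_rows(m):
    # Normalize each row to a distribution over examples.
    out = []
    for row in m:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out

L0 = norm_cols(consistency)  # literal listener:   P(program | example)
S1 = norm_rows(L0)           # pragmatic speaker:  P(example | program)
L1 = norm_cols(S1)           # pragmatic listener: P(program | example)

# Example 1 is literally ambiguous between programs 0 and 2, but a
# speaker intending program 2 had other informative examples available,
# so the pragmatic listener shifts belief toward program 0.
print([L1[p][1] for p in range(3)])  # roughly [0.615, 0.0, 0.385]
```

The same disambiguation pressure is what PraX exploits when scoring sampled program-example pairs: pairs that a pragmatic listener resolves confidently make informative training data.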