Choose, Don't Label: Multiple-Choice Query Synthesis for Program Disambiguation

📅 2026-04-09
📈 Citations: 0
✹ Influential: 0

🤖 AI Summary
This work addresses the ambiguity inherent in high-level specifications of code. Existing interactive clarification methods mostly elicit supervision through labeled examples, which are often error-prone and may fail to capture user intent. To overcome this, the paper introduces an active learning paradigm based on multiple-choice queries: the system presents a small set of candidate behaviors, each expressed as a Hoare triple that characterizes a cluster of semantically similar candidate programs, and the user selects the intended one. The authors implement this approach in Socrates, a tool that uses the Hoare-triple formulation to reason formally about the informativeness and interpretability of queries and to construct optimal queries systematically. Experiments across four domains spanning symbolic and neurosymbolic settings show that Socrates produces intuitive, easy-to-answer queries and converges efficiently, identifying the intended program more reliably than existing methods while maintaining competitive runtime performance.

📝 Abstract
High-level specifications of code are inherently ambiguous, and prior systems have explored interactive techniques to help users clarify their intent and resolve such ambiguities. However, most existing approaches elicit supervision through labeled examples, which are often error-prone and may fail to capture user intent. This paper introduces a new active learning paradigm for program disambiguation based on multiple-choice queries. In this paradigm, the system presents a small set of high-level behaviors as multiple-choice options, and the user simply selects the intended one. Technically, each answer option corresponds to a Hoare triple that characterizes a cluster of semantically similar candidate programs. This formulation enables formal reasoning about the informativeness and interpretability of queries, and supports systematic construction of optimal queries. Building on this insight, we develop a new active learning algorithm and implement it in a tool called Socrates, which automatically synthesizes informative multiple-choice queries for program disambiguation. We evaluate Socrates across four domains spanning both symbolic and neurosymbolic settings and show that it produces intuitive, easy-to-answer queries and achieves efficient convergence. Most importantly, Socrates identifies the intended program more reliably than existing methods, while maintaining competitive runtime performance.
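
The query loop described in the abstract can be illustrated with a small sketch. This is not the Socrates algorithm: the candidate set, the probe inputs, the worst-case selection heuristic, and every function name below are assumptions made for illustration. Each answer option is rendered as a degenerate Hoare triple whose precondition fixes a single input, whereas the paper's triples characterize whole clusters of programs.

```python
import math
from collections import defaultdict

# Hypothetical candidates for the ambiguous spec "round x to an integer".
CANDIDATES = {
    "floor": math.floor,
    "ceil": math.ceil,
    "trunc": math.trunc,
    "half_even": round,
}
PROBES = [2.5, -2.5, 3.5, -1.5]  # assumed pool of inputs to ask about


def triple(x, out):
    """Render one answer option as a Hoare-triple-style string."""
    return f"{{x == {x}}} y = f(x) {{y == {out}}}"


def cluster(candidates, x):
    """Group candidate names by the output they produce on input x."""
    groups = defaultdict(list)
    for name, prog in candidates.items():
        groups[prog(x)].append(name)
    return dict(groups)


def best_query(candidates, probes):
    """Pick the probe whose largest cluster is smallest (worst-case pruning)."""
    return min(
        probes,
        key=lambda x: max(len(g) for g in cluster(candidates, x).values()),
    )


def disambiguate(candidates, probes, choose):
    """Ask multiple-choice queries until one candidate (or one cluster) remains."""
    remaining = dict(candidates)
    while len(remaining) > 1:
        x = best_query(remaining, probes)
        groups = cluster(remaining, x)
        if len(groups) == 1:
            break  # no probe separates the remaining candidates
        outputs = sorted(groups)
        picked = choose(x, [triple(x, out) for out in outputs])
        remaining = {name: remaining[name] for name in groups[outputs[picked]]}
    return sorted(remaining)


def simulated_user(target):
    """Stand-in for a human: selects the option matching the intended program."""
    def choose(x, options):
        return options.index(triple(x, target(x)))
    return choose


print(disambiguate(CANDIDATES, PROBES, simulated_user(math.trunc)))
```

With these four candidates, a simulated user who intends truncation is first asked about `x == 3.5` and then about `x == -2.5`, after which only `trunc` remains. The user never labels an example by hand; each answer is a selection among behaviors the system has already computed.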

Problem

Research questions and friction points this paper is trying to address.

program disambiguation
active learning
multiple-choice queries
user intent
code specification

Innovation

Methods, ideas, or system contributions that make the work stand out.

multiple-choice query
program disambiguation
active learning
Hoare triple
query synthesis