🤖 AI Summary
In semantic parsing, the selection of in-context examples (ICEs) strongly affects abstract syntax tree (AST)-guided program generation, yet existing methods lack adaptive mechanisms for structural alignment. To address this, we propose SCUD4ICL, a framework that introduces fine-grained co-decomposition of programs and sentences, jointly segmenting natural language utterances and ASTs into syntactically consistent fragments. We design an LLM-driven syntactic-constraint remapping mechanism to achieve fragment-level cross-modal alignment. Furthermore, we develop an AST-structure-aware diverse example selection algorithm that enables unified retrieval of both full-program and fragment-level ICEs. Evaluated on mainstream benchmarks, SCUD4ICL substantially improves parsing accuracy, particularly in challenging settings such as small-scale LLMs, large annotated AST pools, and low-resource languages. Our work establishes a new paradigm for example engineering in in-context learning for structured generation tasks.
📝 Abstract
LLMs are increasingly used as seq2seq translators from natural language utterances to structured programs, a process called semantic parsing. Unlike atomic labels or token sequences, programs are naturally represented as abstract syntax trees (ASTs). This structured representation raises novel issues in the design and selection of in-context examples (ICEs) presented to the LLM. We focus on decomposing the pool of available ICE trees into fragments, some of which may be better suited to solving the test instance. Next, we propose how to use (additional invocations of) an LLM with prompted syntax constraints to automatically map the fragments to corresponding utterances. Finally, we adapt and extend a recent method for diverse ICE selection to work with both whole and fragmented ICE instances. We evaluate our system, SCUD4ICL, on popular diverse semantic parsing benchmarks, showing clear accuracy gains from the proposed decomposed diverse demonstration method. Benefits are particularly notable for smaller LLMs, ICE pools with larger labeled trees, and programs in lower-resource languages.
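To make the pipeline concrete, the following is a minimal illustrative sketch of the two structural ideas in the abstract: enumerating subtree fragments of a labeled AST, and greedily selecting a diverse subset of candidates relative to a test instance. All names, the bag-of-labels signature, and the MMR-style scoring are assumptions for illustration, not the paper's actual SCUD4ICL algorithm.

```python
# Hypothetical sketch of AST fragment decomposition plus diverse
# example selection. Not the paper's algorithm; structural details
# (signatures, similarity, scoring) are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)

def fragments(root):
    """Enumerate every subtree of the AST as a candidate fragment."""
    out = [root]
    for child in root.children:
        out.extend(fragments(child))
    return out

def signature(node):
    """Bag-of-labels signature: a crude structural descriptor."""
    sig = {node.label}
    for child in node.children:
        sig |= signature(child)
    return sig

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def select_diverse(candidates, test_sig, k=2, tradeoff=0.5):
    """Greedy MMR-style selection: trade off relevance to the test
    instance against redundancy with already-chosen fragments."""
    chosen, pool = [], list(candidates)
    while pool and len(chosen) < k:
        def score(c):
            rel = jaccard(signature(c), test_sig)
            red = max((jaccard(signature(c), signature(s))
                       for s in chosen), default=0.0)
            return tradeoff * rel - (1 - tradeoff) * red
        best = max(pool, key=score)
        chosen.append(best)
        pool.remove(best)
    return chosen

# Toy labeled program: filter(gt(price, 10), table)
ast = Node("filter", [Node("gt", [Node("price"), Node("10")]),
                      Node("table")])
frags = fragments(ast)
test_sig = {"gt", "price"}          # descriptor of a test utterance
picked = select_diverse(frags, test_sig, k=2)
print([f.label for f in picked])    # → ['gt', 'price']
```

Here the `gt` subtree is picked first because it overlaps most with the test descriptor, and the bare `price` leaf is picked second once redundancy with `gt` is penalized; the full-program root and unrelated leaves score lower, illustrating why fragment-level candidates can beat whole-tree ICEs.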