🤖 AI Summary
High-quality in-context learning (ICL) exemplars are costly to obtain and labor-intensive to annotate. To address this, we propose a two-stage pseudo-demonstration construction framework: (1) cross-task pseudo-label generation, which leverages supervision signals from existing tasks to produce initial pseudo-labels for a small set of target-task instances; and (2) lightweight, LLM-free graph-based label propagation, which automatically expands and refines the demonstration set. Our approach is the first to integrate cross-task supervision with efficient graph-based label propagation, enabling high-quality ICL exemplar construction without updating LLM parameters and with minimal reliance on LLM-generated annotations. Evaluated on five downstream tasks, it reduces annotation cost by 72% on average while maintaining inference accuracy comparable to fully supervised ICL, demonstrating strong scalability and practical applicability.
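Stage 1 amounts to prompting an LLM with demonstrations borrowed from an existing labeled task. The exact prompt format is not given here; the sketch below shows one plausible way to assemble such a cross-task prompt, with all function and variable names hypothetical (the resulting string would then be sent to an LLM of choice):

```python
def build_cross_task_prompt(source_examples, target_input, target_instruction):
    """Assemble a cross-task ICL prompt: labeled (input, output) pairs from an
    existing source task serve as demonstrations, and the prompt ends with an
    unlabeled target-task instance for the LLM to pseudo-label.
    """
    lines = [target_instruction, ""]
    for inp, out in source_examples:
        lines.append(f"Input: {inp}\nOutput: {out}\n")
    # Leave the final "Output:" open so the LLM's completion is the pseudo-label.
    lines.append(f"Input: {target_input}\nOutput:")
    return "\n".join(lines)
```

In this sketch the source-task demonstrations fix the input-output format, while the instruction redirects the LLM to the target task; only the small seed set from stage 1 ever touches the LLM.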
📝 Abstract
In-context learning (ICL) enables large language models (LLMs) to perform novel tasks without parameter updates by conditioning on a few input-output examples. However, collecting high-quality examples for new or challenging tasks can be costly and labor-intensive. In this work, we propose a cost-efficient two-stage pipeline that reduces reliance on LLMs for data labeling. Our approach first leverages readily available cross-task examples to prompt an LLM and pseudo-label a small set of target-task instances. We then introduce a graph-based label propagation method that spreads label information to the remaining target examples without additional LLM queries. The resulting fully pseudo-labeled dataset is used to construct in-task demonstrations for ICL. This pipeline combines the flexibility of cross-task supervision with the scalability of LLM-free propagation. Experiments across five tasks demonstrate that our method achieves strong performance while lowering labeling costs.
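The abstract does not detail the propagation step; a minimal sketch of classic graph-based label propagation (Zhu and Ghahramani-style) over a kNN cosine-similarity graph conveys the idea. All names and parameter choices below are illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def propagate_labels(embeddings, seed_labels, n_classes, k=5, alpha=0.9, n_iters=50):
    """Spread labels from a few LLM-labeled seeds to all remaining examples.

    embeddings:  (n, d) array of example representations.
    seed_labels: length-n int array; class id for seeds, -1 for unlabeled.
    """
    n = len(embeddings)
    # Cosine-similarity graph, keeping only each node's k nearest neighbors.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, 0.0)
    for i in range(n):
        weak = np.argsort(sim[i])[:-k]  # all but the k strongest neighbors
        sim[i, weak] = 0.0
    W = np.maximum(sim, sim.T)             # symmetrize the kNN graph
    W = W / W.sum(axis=1, keepdims=True)   # row-normalize edge weights

    # One-hot matrix for seeds; unlabeled rows start uniform.
    Y = np.full((n, n_classes), 1.0 / n_classes)
    is_seed = seed_labels >= 0
    Y[is_seed] = np.eye(n_classes)[seed_labels[is_seed]]

    F = Y.copy()
    for _ in range(n_iters):
        F = alpha * (W @ F) + (1 - alpha) * Y  # diffuse labels along edges
        F[is_seed] = Y[is_seed]                # hard-clamp the seed labels
    return F.argmax(axis=1)
```

No LLM is queried here: the only inputs are embeddings and the small seed set from stage 1, which is what makes the second stage cheap to scale to the full target dataset.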