🤖 AI Summary
High-quality in-context learning (ICL) exemplars are costly to obtain and labor-intensive to annotate. To address this, we propose a two-stage pseudo-demonstration construction framework: (1) cross-task pseudo-label generation, which leverages supervision signals from existing tasks to produce initial pseudo-labels for a small set of target-task instances; and (2) lightweight, LLM-free graph-based label propagation, which automatically expands and refines the demonstration set. Our approach is the first to integrate cross-task supervision with efficient graph-based label propagation, enabling high-quality ICL exemplar construction without updating LLM parameters and with minimal reliance on LLM-generated annotations. Evaluated on five downstream tasks, it reduces annotation cost by 72% on average while maintaining inference accuracy comparable to fully supervised ICL, demonstrating strong scalability and practical applicability.
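Stage 1 amounts to prompting an LLM with demonstrations borrowed from an existing labeled task. The exact prompt format is not given here; the sketch below shows one plausible way to assemble such a cross-task prompt, with all function and variable names hypothetical (the resulting string would then be sent to an LLM of choice):

```python
def build_cross_task_prompt(source_examples, target_input, target_instruction):
    """Assemble a cross-task ICL prompt: labeled (input, output) pairs from an
    existing source task serve as demonstrations, and the prompt ends with an
    unlabeled target-task instance for the LLM to pseudo-label.
    """
    lines = [target_instruction, ""]
    for inp, out in source_examples:
        lines.append(f"Input: {inp}\nOutput: {out}\n")
    # Leave the final "Output:" open so the LLM's completion is the pseudo-label.
    lines.append(f"Input: {target_input}\nOutput:")
    return "\n".join(lines)
```

In this sketch the source-task demonstrations fix the input-output format, while the instruction redirects the LLM to the target task; only the small seed set from stage 1 ever touches the LLM.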
📝 Abstract
In-context learning (ICL) enables large language models (LLMs) to perform novel tasks without parameter updates by conditioning on a few input-output examples. However, collecting high-quality examples for new or challenging tasks can be costly and labor-intensive. In this work, we propose a cost-efficient two-stage pipeline that reduces reliance on LLMs for data labeling. Our approach first leverages readily available cross-task examples to prompt an LLM and pseudo-label a small set of target-task instances. We then introduce a graph-based label propagation method that spreads label information to the remaining target examples without additional LLM queries. The resulting fully pseudo-labeled dataset is used to construct in-task demonstrations for ICL. This pipeline combines the flexibility of cross-task supervision with the scalability of LLM-free propagation. Experiments across five tasks demonstrate that our method achieves strong performance while lowering labeling costs.
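The abstract does not detail the propagation step; a minimal sketch of classic graph-based label propagation (Zhu and Ghahramani-style) over a kNN cosine-similarity graph conveys the idea. All names and parameter choices below are illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def propagate_labels(embeddings, seed_labels, n_classes, k=5, alpha=0.9, n_iters=50):
    """Spread labels from a few LLM-labeled seeds to all remaining examples.

    embeddings:  (n, d) array of example representations.
    seed_labels: length-n int array; class id for seeds, -1 for unlabeled.
    """
    n = len(embeddings)
    # Cosine-similarity graph, keeping only each node's k nearest neighbors.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, 0.0)
    for i in range(n):
        weak = np.argsort(sim[i])[:-k]  # all but the k strongest neighbors
        sim[i, weak] = 0.0
    W = np.maximum(sim, sim.T)             # symmetrize the kNN graph
    W = W / W.sum(axis=1, keepdims=True)   # row-normalize edge weights

    # One-hot matrix for seeds; unlabeled rows start uniform.
    Y = np.full((n, n_classes), 1.0 / n_classes)
    is_seed = seed_labels >= 0
    Y[is_seed] = np.eye(n_classes)[seed_labels[is_seed]]

    F = Y.copy()
    for _ in range(n_iters):
        F = alpha * (W @ F) + (1 - alpha) * Y  # diffuse labels along edges
        F[is_seed] = Y[is_seed]                # hard-clamp the seed labels
    return F.argmax(axis=1)
```

No LLM is queried here: the only inputs are embeddings and the small seed set from stage 1, which is what makes the second stage cheap to scale to the full target dataset.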