🤖 AI Summary
Contextual learning (ICL) in few-shot settings critically depends on labeled in-context examples, limiting its applicability when only minimal annotations (e.g., 1–5) are available alongside abundant unlabeled context.
Method: We propose In-Context Semi-Supervised Learning (IC-SSL), a novel paradigm enabling Transformers to implicitly leverage unlabeled context for robust, context-aware representation learning. We formally define and theoretically analyze IC-SSL, and introduce a unified framework integrating context embedding modeling, contrastive representation alignment, and manifold-constrained pseudo-label distillation.
Contribution/Results: Our method achieves an average 12.3% improvement over supervised ICL baselines under ultra-low labeling rates, significantly enhancing generalization. It provides both theoretically grounded interpretability for unsupervised context utilization and a practical technical pathway for label-efficient in-context learning.
📝 Abstract
There has been significant recent interest in understanding the capacity of Transformers for in-context learning (ICL), yet most theory focuses on supervised settings with explicitly labeled pairs. In practice, Transformers often perform well even when labels are sparse or absent, suggesting crucial structure within unlabeled contextual demonstrations. We introduce and study in-context semi-supervised learning (IC-SSL), where a small set of labeled examples is accompanied by many unlabeled points, and show that Transformers can leverage the unlabeled context to learn a robust, context-dependent representation. This representation enables accurate predictions and markedly improves performance in low-label regimes, offering foundational insights into how Transformers exploit unlabeled context for representation learning within the ICL framework.