🤖 AI Summary
This study addresses the challenge of modeling human annotation variability in natural language processing, which arises from subjectivity, ambiguity, and legitimate annotator disagreement. We propose a two-stage contextual meta-learning framework that combines general-purpose post-training with meta-learning specialized to the task distribution of interest. Crucially, annotator-specific examples are incorporated into the in-context prompt to model individual preferences, and larger models are used to improve generalization. Our key contribution is the empirical finding, validated via ablation studies, that annotator examples play a decisive role in in-context learning for subjective tasks. Evaluated on two highly subjective tasks in the LeWiDi-2025 competition, our method achieves first place in both, significantly outperforming all baselines. The results demonstrate its effectiveness in capturing the diversity of human judgments and in improving model robustness and adaptability to subjective linguistic phenomena.
📝 Abstract
Many natural language processing (NLP) tasks involve subjectivity, ambiguity, or legitimate disagreement between annotators. In this paper, we describe our system for modeling human variation. Our system leverages large language models' (LLMs') in-context learning abilities, together with a two-step meta-learning training procedure: (1) post-training on many datasets that require in-context learning, and (2) specializing the model via in-context meta-learning to the particular data distribution of interest. We evaluate our system submission to the Learning With Disagreements (LeWiDi) competition, where it was the overall winner on both tasks. We also perform an ablation study to measure the importance of each system component. We find that including rater examples in-context is crucial for our system's performance, that dataset-specific fine-tuning helps on the larger datasets, that post-training on other in-context datasets helps on one of the competition datasets, and that performance improves with model scale.
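To make the in-context setup concrete, here is a minimal sketch of how a prompt with rater-specific examples might be assembled. This is an illustrative assumption, not the authors' actual pipeline: the function name `build_prompt`, the plain-text prompt template, and the example labels are all hypothetical.

```python
def build_prompt(rater_examples, target_item):
    """Assemble an in-context prompt for one annotator (hypothetical format).

    The rater's past (item, label) pairs precede the target item, so the
    model can condition on that individual's labeling preferences before
    predicting a label for the new item.
    """
    lines = ["Predict this annotator's label for the final item."]
    for item, label in rater_examples:
        lines.append(f"Item: {item}\nLabel: {label}")
    # The target item ends with an empty label slot for the model to fill.
    lines.append(f"Item: {target_item}\nLabel:")
    return "\n\n".join(lines)


# Example usage with made-up irony annotations:
prompt = build_prompt(
    rater_examples=[("Great, more rain.", "ironic"),
                    ("The show starts at 8pm.", "not ironic")],
    target_item="Oh wonderful, another meeting.",
)
print(prompt)
```

The key design point the abstract emphasizes is that the examples come from the *same annotator* as the target item, which is what lets the model capture individual variation rather than an aggregate label.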