🤖 AI Summary
This study addresses two key challenges in subjective NLP tasks: the difficulty of modeling annotator disagreement and insufficient coverage of demographic information. To this end, we propose the Demographic-Aware Mixture of Experts (DEM-MoE) model. Our method explicitly incorporates annotator demographic attributes into a mixture-of-experts architecture to better capture group-level behavioral differences; leverages zero-shot persona prompting with large language models to generate synthetic judgments annotated with demographic metadata, thereby mitigating annotation sparsity; and introduces a structured training strategy that jointly optimizes on both real and synthetic data. Experiments demonstrate that DEM-MoE achieves significant performance gains on high-disagreement datasets, yields more equitable outcomes across demographic subgroups, and attains moderate agreement (ρ ≈ 0.45) between synthetic and human annotations. Overall, the approach enhances dataset representativeness and improves model fairness without compromising accuracy.
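To make the persona-prompting idea concrete, here is a minimal sketch of how a zero-shot persona prompt might be assembled from demographic metadata. The field names (`age`, `gender`, `country`) and the rating task wording are illustrative assumptions, not the prompts actually used in the study:

```python
def build_persona_prompt(persona: dict, text: str) -> str:
    """Compose a zero-shot persona prompt from demographic attributes.

    The persona fields and task phrasing below are hypothetical stand-ins
    for whatever demographic metadata and annotation task a study uses.
    """
    # Describe the annotator persona the LLM should adopt.
    description = (
        f"You are a {persona['age']}-year-old {persona['gender']} "
        f"from {persona['country']}."
    )
    # An example subjective annotation task (assumed, for illustration).
    task = ("Rate how offensive the following text is, "
            "from 1 (not offensive) to 5 (very offensive).")
    return f"{description}\n{task}\nText: \"{text}\"\nAnswer with a single number."


prompt = build_persona_prompt(
    {"age": 35, "gender": "woman", "country": "Brazil"},
    "example text to annotate",
)
print(prompt)
```

Sending such a prompt to an LLM and parsing the numeric reply would yield one synthetic judgment tagged with the persona's demographic metadata, which is the kind of record used to fill sparse demographic coverage.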
📝 Abstract
We present an approach to modeling annotator disagreement in subjective NLP tasks through both architectural and data-centric innovations. Our model, DEM-MoE (Demographic-Aware Mixture of Experts), routes inputs to expert subnetworks based on annotator demographics, enabling it to better represent structured, group-level variation than prior models. DEM-MoE performs competitively across demographic groups and shows especially strong results on datasets with high annotator disagreement. To address sparse demographic coverage, we test whether LLM-generated synthetic annotations produced via zero-shot persona prompting can be used for data imputation. We show that these synthetic judgments align moderately well with human annotations on our data and offer a scalable way to enrich training data. We then propose and evaluate strategies for blending real and synthetic data, finding that the optimal strategy depends on dataset structure. Together, these contributions improve the representation of diverse perspectives.
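The core architectural idea, routing by annotator demographics rather than by the input text, can be sketched in a few lines. The sketch below is a toy illustration under stated assumptions (random linear "experts", a one-hot demographic vector, and a softmax gate conditioned only on demographics); the paper's actual networks, dimensions, and gating function are not specified here:

```python
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS = 3   # assumed number of expert subnetworks
D_TEXT = 8      # toy text-embedding dimension
D_DEMO = 4      # toy demographic-feature dimension (e.g. one-hot age bucket)

# Toy "experts": one linear map per expert (random weights as stand-ins).
expert_weights = rng.normal(size=(N_EXPERTS, D_TEXT))
# Gate conditions on demographics only, which is the demographic-aware twist:
# the same text can be routed differently for different annotator groups.
gate_weights = rng.normal(size=(D_DEMO, N_EXPERTS))


def softmax(z: np.ndarray) -> np.ndarray:
    z = z - z.max()          # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum()


def dem_moe_forward(text_vec: np.ndarray, demo_vec: np.ndarray):
    """Weight expert outputs by a gate computed from demographic features."""
    gate = softmax(demo_vec @ gate_weights)            # shape (N_EXPERTS,)
    expert_out = expert_weights @ text_vec             # shape (N_EXPERTS,)
    return float(gate @ expert_out), gate


text_vec = rng.normal(size=D_TEXT)
demo_a = np.array([1.0, 0.0, 0.0, 0.0])  # annotator from group A
demo_b = np.array([0.0, 0.0, 0.0, 1.0])  # annotator from group B

y_a, gate_a = dem_moe_forward(text_vec, demo_a)
y_b, gate_b = dem_moe_forward(text_vec, demo_b)
# Different demographic vectors yield different gate distributions,
# so the same text can receive different predicted judgments.
```

The design choice worth noting is that the gate sees only `demo_vec`, so group-level variation is captured in the routing while the experts specialize in how each group tends to judge the text.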