AI Summary
This work addresses the failure of large language models (LLMs) to fairly represent diverse population perspectives in subjective text annotation tasks, a failure rooted in the flawed assumption of a single ground truth. The authors propose Perspective-Driven Inference, a framework that explicitly models the distribution of annotations across demographic or social groups, abandoning the notion of a unique true label. They pair this with an adaptive human annotation sampling strategy that, under a fixed annotation budget, prioritizes groups the LLM models poorly. Evaluated on politeness and offensiveness rating tasks, the approach improves annotation quality for underrepresented or hard-to-model groups while maintaining broad population coverage, outperforming uniform sampling baselines.
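To make the core idea concrete, here is a minimal sketch of estimating a group-level rating distribution by blending abundant LLM proxy ratings with scarce human ratings. The 1-5 rating scale, the function name, and the fixed-weight blending rule are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def group_rating_distribution(llm_ratings, human_ratings, n_classes=5, alpha=0.5):
    """Blend an LLM proxy's rating histogram with a (small) human one
    for a single demographic group. `alpha`, the weight on the human
    histogram, is an illustrative choice, not the paper's estimator."""
    def hist(ratings):
        # Ratings are assumed to be integers on a 1..n_classes scale.
        counts = np.bincount(np.asarray(ratings) - 1, minlength=n_classes)
        return counts / max(counts.sum(), 1)

    llm_dist = hist(llm_ratings)
    if len(human_ratings) == 0:
        return llm_dist  # no human labels yet: fall back to the proxy
    return alpha * hist(human_ratings) + (1 - alpha) * llm_dist

# Example: the quantity of interest is the full distribution, not one label.
print(group_rating_distribution([1, 2, 2, 3, 2], [2, 3]))
```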
Abstract
Large language models are increasingly used to annotate texts, but their outputs reflect some human perspectives better than others. Existing methods for correcting LLM annotation errors assume a single ground truth. However, this assumption fails in subjective tasks where disagreement across demographic groups is meaningful. Here we introduce Perspective-Driven Inference, a method that treats the distribution of annotations across groups as the quantity of interest and estimates it using a small human annotation budget. We contribute an adaptive sampling strategy that concentrates human annotation effort on the groups where LLM proxies are least accurate. We evaluate on politeness and offensiveness rating tasks, showing targeted improvements for harder-to-model demographic groups relative to uniform sampling baselines, while maintaining population-wide coverage.
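As one way to picture the adaptive sampling step, the sketch below allocates a fixed human-annotation budget across groups in proportion to the LLM proxy's estimated per-group error (e.g., measured on a small pilot of human labels). Proportional-to-error allocation and the `min_per_group` floor are assumptions for illustration; the paper's adaptive strategy may differ in its exact rule.

```python
import numpy as np

def allocate_budget(proxy_errors, budget, min_per_group=1):
    """Split `budget` human annotations across groups, weighting each
    group by the LLM proxy's estimated error there. Proportional
    allocation is an illustrative rule, not the paper's exact one."""
    errors = np.asarray(proxy_errors, dtype=float)
    alloc = np.floor(errors / errors.sum() * budget).astype(int)
    alloc = np.maximum(alloc, min_per_group)  # keep every group covered
    while alloc.sum() < budget:               # spend rounding leftovers
        alloc[np.argmax(errors / alloc)] += 1  # on the worst-covered group
    return alloc

# Example: three groups; the proxy is weakest on group 2,
# so group 2 receives about two thirds of the budget.
print(allocate_budget(proxy_errors=[0.1, 0.2, 0.6], budget=100))
```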