🤖 AI Summary
This work investigates whether large language models (LLMs) can systematically model associations between demographic attributes and subjective text annotation behavior. Using datasets from five standardized subjective annotation tasks, the study employs fine-grained supervised fine-tuning, demographic-conditioned prompting, cross-task generalization evaluation, and ablation analysis. Results show that performance gains stem primarily from memorizing and fitting individual annotator behavior, rather than from learning generalizable demographic regularities. Crucially, across all five tasks, LLMs exhibit no robust or interpretable group-level patterns in response to demographic prompts. This is the first systematic examination challenging the feasibility of treating LLMs as "demographically aware annotators." It reveals a critical risk of individual-level overfitting in current paradigms for subjective annotation modeling and provides both methodological caution and an empirical benchmark for trustworthy subjective modeling.
📝 Abstract
People naturally vary in their annotations for subjective questions, and some of this variation is thought to be due to annotators' sociodemographic characteristics. LLMs have also been used to label data, but recent work has shown that models perform poorly when prompted with sociodemographic attributes, suggesting limited inherent sociodemographic knowledge. Here, we ask whether LLMs can be trained to be accurate sociodemographic models of annotator variation. Using a curated dataset of five tasks with standardized sociodemographics, we show that models do improve at sociodemographic prompting when trained, but that this performance gain is largely due to models learning annotator-specific behaviour rather than sociodemographic patterns. Across all tasks, our results suggest that models learn little meaningful connection between sociodemographics and annotation, raising doubts about the current use of LLMs for simulating sociodemographic variation and behaviour.
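The "sociodemographic prompting" evaluated above amounts to prepending an annotator's demographic profile to the annotation instruction before querying the model. The paper does not specify its prompt templates, so the function, field names, and wording below are purely illustrative assumptions, a minimal sketch of the general technique:

```python
# Hypothetical sketch of sociodemographic prompting: the model is asked to
# answer a subjective annotation question *as if* it were an annotator with
# a given demographic profile. All names and templates here are assumptions,
# not the paper's actual prompts.

def build_sociodemographic_prompt(profile: dict, text: str, question: str) -> str:
    """Condition an annotation request on a sociodemographic profile."""
    persona = ", ".join(f"{key}: {value}" for key, value in profile.items())
    return (
        f"You are an annotator with the following profile: {persona}.\n"
        f"Question: {question}\n"
        f"Text: {text}\n"
        f"Answer with a single label."
    )

prompt = build_sociodemographic_prompt(
    {"age": "25-34", "gender": "female", "education": "college"},
    "That movie was a complete waste of time.",
    "Is this text offensive? (yes/no)",
)
print(prompt)
```

The resulting string would then be sent to an LLM; the paper's finding is that even fine-tuned models show no robust, interpretable shift in their labels as this profile is varied.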