Wisdom of the Crowd, Without the Crowd: A Socratic LLM for Asynchronous Deliberation on Perspectivist Data

📅 2025-08-13
🤖 AI Summary
Asynchronous crowdsourced annotation struggles to preserve diverse perspectives, and conventional aggregation methods fail for complex, ambiguous tasks such as sarcasm detection and relation extraction. Method: A Socratic large language model (LLM) framework for asynchronous deliberation. LLM agents simulate multi-turn, reflective dialogues grounded in Socratic questioning, guiding annotators to clarify their positions and reconcile disagreements without real-time coordination. Contribution/Results: On the relation task, where ground truth is available, the framework improves annotation accuracy and confidence. Qualitative analysis shows it elicits reasoned argumentation, enhancing both data diversity and representativeness. By enabling scalable, interpretable, and epistemically rich deliberation, this work offers a new paradigm for complex semantic annotation, bridging human judgment, model reasoning, and crowd intelligence in asynchronous settings.

📝 Abstract
Data annotation underpins the success of modern AI, but the aggregation of crowd-collected datasets can harm the preservation of diverse perspectives in data. Difficult and ambiguous tasks cannot easily be collapsed into unitary labels. Prior work has shown that deliberation and discussion improve data quality and preserve diverse perspectives -- however, synchronous deliberation through crowdsourcing platforms is time-intensive and costly. In this work, we create a Socratic dialog system using Large Language Models (LLMs) to act as a deliberation partner in place of other crowdworkers. Against a benchmark of synchronous deliberation on two tasks (Sarcasm and Relation detection), our Socratic LLM encouraged participants to consider alternate annotation perspectives, update their labels as needed (with higher confidence), and resulted in higher annotation accuracy (for the Relation task where ground truth is available). Qualitative findings show that our agent's Socratic approach was effective at encouraging reasoned arguments from our participants, and that the intervention was well-received. Our methodology lays the groundwork for building scalable systems that preserve individual perspectives in generating more representative datasets.
Problem

Research questions and friction points this paper is trying to address.

Preserving diverse perspectives in AI data annotation
Reducing cost and time of synchronous crowd deliberation
Improving annotation accuracy through Socratic LLM dialogues
Innovation

Methods, ideas, or system contributions that make the work stand out.

Socratic LLM for asynchronous deliberation
Preserves diverse perspectives in data
Improves annotation accuracy and confidence
Malik Khadar
University of Minnesota, Department of Computer Science & Engineering, Minneapolis, USA
Daniel Runningen
University of Minnesota, Department of Computer Science & Engineering, Minneapolis, USA
Julia Tang
University of Minnesota, Department of Computer Science & Engineering, Minneapolis, USA
Stevie Chancellor
Assistant Professor of Computer Science & Engineering, University of Minnesota
Social Computing · HCI · Online Communities · Human-Centered Machine Learning
Harmanpreet Kaur
University of Minnesota
Human-Computer Interaction · Interpretable ML