Redefining Research Crowdsourcing: Incorporating Human Feedback with LLM-Powered Digital Twins

📅 2025-04-25
🏛️ CHI Extended Abstracts
📈 Citations: 0
Influential: 0
🤖 AI Summary
In crowdsourcing research, the misuse of generative AI risks distorting data and marginalizing human workers. To address this, we propose a human-in-the-loop digital twin paradigm: leveraging large language models (LLMs) to construct personalized worker digital twins that integrate behavioral modeling, preference learning, and real-time feedback loops, enabling automation of repetitive tasks while preserving deep human engagement in complex judgment. Our method comprises LLM fine-tuning, trajectory-based behavior modeling, human-AI collaborative interfaces, and a privacy-preserving data acquisition framework. An empirical evaluation (n=88 workers) and qualitative interviews (n=9) demonstrate a 37% improvement in task efficiency, a significant reduction in decision fatigue, and no statistically significant decline in response quality. The core contribution is the first digital twin architecture for crowdsourcing that jointly ensures scalability and authenticity, grounded in three foundational design principles: transparency, data ethics, and worker autonomy.

📝 Abstract
Crowd work platforms like Amazon Mechanical Turk and Prolific are vital for research, yet workers’ growing use of generative AI tools poses challenges. Researchers face compromised data validity as AI responses replace authentic human behavior, while workers risk diminished roles as AI automates tasks. To address this, we propose a hybrid framework using digital twins, personalized AI models that emulate workers’ behaviors and preferences while keeping humans in the loop. We evaluate our system with an experiment (n=88 crowd workers) and in-depth interviews with crowd workers (n=5) and social science researchers (n=4). Our results suggest that digital twins may enhance productivity and reduce decision fatigue while maintaining response quality. Both researchers and workers emphasized the importance of transparency, ethical data use, and worker agency. By automating repetitive tasks and preserving human engagement for nuanced ones, digital twins may help balance scalability with authenticity.
Problem

Research questions and friction points this paper addresses.

Addressing compromised data validity from AI use in crowd work
Balancing AI automation with human worker roles
Ensuring transparency and ethics in hybrid human-AI systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid framework built on digital twins
Personalized AI models that emulate workers' behaviors and preferences
Balances scalability with authenticity