🤖 AI Summary
This study addresses the limited understanding of how domain experts can effectively engage in the design and evaluation of large language models (LLMs) in complex professional domains. Through a 12-week ethnographic investigation involving field observations, semi-structured interviews, and qualitative analysis, the research examines collaboration between domain experts and developers on a team building a pedagogical chatbot. It identifies four key practices: adaptive data collection strategies, knowledge augmentation under constrained expert input, co-constructed evaluation criteria, and a hybrid assessment framework integrating expert, developer, and LLM perspectives. The study also highlights core challenges in expert involvement, including insufficient motivation, lack of trust, and ambiguous collaboration structures, and proposes future workflow designs that support evolving expert roles through enhanced AI literacy and transparent consent mechanisms, underscoring the indispensable value of expert knowledge in LLM development.
📝 Abstract
Large Language Models (LLMs) are increasingly developed for use in complex professional domains, yet little is known about how teams design and evaluate these systems in practice. This paper examines the challenges and trade-offs in LLM development through a 12-week ethnographic study of a team building a pedagogical chatbot. We observed design and evaluation activities and conducted interviews with both developers and domain experts. Analysis revealed four key practices: creating workarounds for data collection, turning to augmentation when expert input was limited, co-developing evaluation criteria with experts, and adopting hybrid expert-developer-LLM evaluation strategies. These practices show how teams made strategic decisions under constraints and demonstrate the central role of domain expertise in shaping the system. Challenges included expert motivation and trust, difficulties in structuring participatory design, and questions around the ownership and integration of expert knowledge. We propose design opportunities for future LLM development workflows that emphasize AI literacy, transparent consent, and frameworks recognizing evolving expert roles.