🤖 AI Summary
This study addresses the lack of empirical evidence regarding the effectiveness and acceptance of large language model (LLM)-driven virtual teaching assistants (VTAs) in authentic classroom settings. We conducted the first large-scale classroom deployment of a VTA in an introductory AI programming course with 477 graduate students. Our evaluation integrated three rounds of student surveys, analysis of 3,869 student–VTA interaction pairs, and a comparison with traditional student–human instructor interactions, augmented by multi-turn dialogue modeling, educational question categorization, and behavioral pattern mining. Key contributions include: (1) empirical identification of increasing student sensitivity to response accuracy over time and observation of engagement decay; (2) validation of VTA feasibility for foundational programming Q&A, along with characterization of high-frequency question types; and (3) open-sourcing of a fully reproducible VTA system, which supports a shift in educational AI research from technical proof-of-concept to evidence-driven, classroom-validated inquiry.
📝 Abstract
Virtual Teaching Assistants (VTAs) powered by Large Language Models (LLMs) have the potential to enhance student learning by providing instant feedback and facilitating multi-turn interactions. However, empirical studies on their effectiveness and acceptance in real-world classrooms are limited, leaving their practical impact uncertain. In this study, we develop an LLM-based VTA and deploy it in an introductory AI programming course with 477 graduate students. To assess how student perceptions of the VTA's performance evolve over time, we conduct three rounds of comprehensive surveys at different stages of the course. Additionally, we analyze 3,869 student–VTA interaction pairs to identify common question types and engagement patterns. We then compare these interactions with traditional student–human instructor interactions to evaluate the VTA's role in the learning process. Through a large-scale empirical study and interaction analysis, we assess the feasibility of deploying VTAs in real-world classrooms and identify key challenges for broader adoption. Finally, we release the source code of our VTA system, fostering future advancements in AI-driven education: https://github.com/sean0042/VTA.