Bringing Pedagogy into Focus: Evaluating Virtual Teaching Assistants' Question-Answering in Asynchronous Learning Environments

📅 2025-09-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Evaluations of current virtual teaching assistants (VTAs) lack educationally grounded frameworks for assessing question-answering quality in asynchronous learning environments, hindering rigorous measurement and cross-system comparison of pedagogical effectiveness. To address this, the authors propose a learning-science–informed framework for evaluating VTA question answering, designed for asynchronous forum discussions. The framework defines multidimensional pedagogical metrics grounded in established educational theory, including cognitive scaffolding, feedback appropriateness, and reflection promotion. Using expert-annotated data, the authors train supervised classifiers for automated assessment. Experiments confirm classifier efficacy, identify key accuracy determinants (e.g., discourse context modeling), and reveal generalization bottlenecks across domains and task types. By systematically integrating educational theory into VTA evaluation, the work improves interpretability, comparability, and pedagogical relevance, establishing a reproducible, theory-driven assessment paradigm for AI-enabled educational technologies.

📝 Abstract
Asynchronous learning environments (ALEs) are widely adopted for formal and informal learning, but timely and personalized support is often limited. In this context, Virtual Teaching Assistants (VTAs) can potentially reduce the workload of instructors, but rigorous and pedagogically sound evaluation is essential. Existing assessments often rely on surface-level metrics and lack sufficient grounding in educational theories, making it difficult to meaningfully compare the pedagogical effectiveness of different VTA systems. To bridge this gap, we propose an evaluation framework rooted in learning sciences and tailored to asynchronous forum discussions, a common VTA deployment context in ALEs. We construct classifiers using expert annotations of VTA responses on a diverse set of forum posts. We evaluate the effectiveness of our classifiers, identifying approaches that improve accuracy as well as challenges that hinder generalization. Our work establishes a foundation for theory-driven evaluation of VTA systems, paving the way for more pedagogically effective AI in education.
Problem

Research questions and friction points this paper is trying to address.

Evaluating Virtual Teaching Assistants' pedagogical effectiveness in asynchronous learning
Addressing lack of educational theory grounding in current VTA assessments
Developing theory-driven evaluation framework for AI teaching assistants
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposed evaluation framework grounded in learning sciences
Constructed classifiers using expert annotations of responses
Identified approaches that improve classifier accuracy, along with challenges that hinder generalization
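The classifier idea above can be sketched as follows. This is a minimal, stdlib-only illustration of training one per-dimension classifier on expert-annotated VTA responses; the example texts, the binary labels, and the bag-of-words nearest-centroid model are all illustrative assumptions, not the authors' actual data or method (the paper does not specify this implementation).

```python
# Illustrative sketch: one binary classifier per pedagogical dimension
# (e.g., "cognitive scaffolding"), trained on expert-annotated responses.
# The model here (bag-of-words + nearest centroid) is an assumption for
# demonstration, not the paper's method.
from collections import Counter

def featurize(text):
    """Lowercased bag-of-words counts for one response."""
    return Counter(text.lower().split())

def centroid(examples):
    """Average word counts over a list of feature Counters."""
    total = Counter()
    for ex in examples:
        total.update(ex)
    n = len(examples)
    return {w: c / n for w, c in total.items()}

def similarity(feats, cent):
    """Unnormalized dot product between a response and a class centroid."""
    return sum(count * cent.get(w, 0.0) for w, count in feats.items())

def train(annotated):
    """annotated: list of (response_text, label) pairs, label in {0, 1}."""
    pos = [featurize(t) for t, y in annotated if y == 1]
    neg = [featurize(t) for t, y in annotated if y == 0]
    return centroid(pos), centroid(neg)

def predict(model, text):
    """Assign the class whose centroid is closer to the response."""
    pos_c, neg_c = model
    f = featurize(text)
    return 1 if similarity(f, pos_c) >= similarity(f, neg_c) else 0

# Toy expert annotations for one dimension (hypothetical examples):
data = [
    ("what do you think happens if you vary the learning rate", 1),
    ("try breaking the problem into smaller steps first", 1),
    ("the answer is 42", 0),
    ("see the textbook", 0),
]
model = train(data)
print(predict(model, "what steps would you try first"))  # prints 1
```

In practice one such classifier would be trained per metric (scaffolding, feedback appropriateness, reflection promotion), which is what makes cross-system comparison along each dimension possible.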
Li Siyan
Columbia University
Zhen Xu
Columbia University
Vethavikashini Chithrra Raghuram
Columbia University
Xuanming Zhang
Columbia University
Renzhe Yu
Assistant Professor, Columbia University
Educational Data Science · Learning Analytics · Computational Social Science · Responsible AI
Zhou Yu
Columbia University