🤖 AI Summary
Current slide-based teaching evaluation methods are limited in scalability, contextual modeling, and fairness, motivating an automated, multimodal, knowledge-aware assessment framework. This paper introduces LecEval, the first learnable evaluation metric grounded in Mayer's Cognitive Theory of Multimedia Learning, which quantifies instructional effectiveness along four rubrics: content relevance, expressive clarity, logical structure, and audience engagement. The key novelty is encoding these cognitive principles explicitly in a multi-task supervised learning framework that jointly models the textual and visual features of slides. The authors also construct the first large-scale, fine-grained, human-annotated dataset of its kind (2,000+ slides across 50+ courses). Experiments show that LecEval substantially outperforms conventional metrics (e.g., BLEU, BERTScore) and LLM-based evaluators, achieving strong agreement with human judgments (average Pearson r > 0.82). Both the code and the dataset are publicly released.
📝 Abstract
Evaluating the quality of slide-based multimedia instruction is challenging. Existing approaches, such as manual assessment, reference-based metrics, and large language model evaluators, are limited by poor scalability, weak context capture, or bias. In this paper, we introduce LecEval, an automated metric grounded in Mayer's Cognitive Theory of Multimedia Learning, to evaluate multimodal knowledge acquisition in slide-based learning. LecEval assesses effectiveness using four rubrics: Content Relevance (CR), Expressive Clarity (EC), Logical Structure (LS), and Audience Engagement (AE). We curate a large-scale dataset of over 2,000 slides from more than 50 online course videos, annotated with fine-grained human ratings across these rubrics. A model trained on this dataset demonstrates superior accuracy and adaptability compared to existing metrics, bridging the gap between automated and human assessments. We release our dataset and toolkits at https://github.com/JoylimJY/LecEval.
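To make the evaluation protocol concrete, the sketch below shows how per-rubric metric scores can be compared against human ratings with Pearson correlation, the agreement measure cited above. This is a minimal illustration, not LecEval's implementation: only the rubric names (CR, EC, LS, AE) come from the paper, and all scores are invented placeholder data.

```python
import math

# Rubric abbreviations from the paper: Content Relevance, Expressive
# Clarity, Logical Structure, Audience Engagement.
RUBRICS = ["CR", "EC", "LS", "AE"]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical scores for five slides: one list per rubric, where each
# entry is a 1-5 rating for one slide.
model_scores = {"CR": [4, 3, 5, 2, 4], "EC": [3, 3, 4, 2, 5],
                "LS": [5, 4, 4, 3, 4], "AE": [2, 3, 4, 2, 3]}
human_ratings = {"CR": [5, 3, 4, 2, 4], "EC": [3, 2, 4, 3, 5],
                 "LS": [4, 4, 5, 3, 4], "AE": [2, 4, 4, 1, 3]}

# Agreement per rubric, then averaged across rubrics (as in the
# reported "average Pearson r" figure of merit).
per_rubric_r = {r: pearson(model_scores[r], human_ratings[r]) for r in RUBRICS}
average_r = sum(per_rubric_r.values()) / len(per_rubric_r)
```

In this setup a learned metric like LecEval supplies `model_scores`; reference-based metrics such as BLEU or BERTScore can be slotted into the same comparison, which is how the different evaluators are put on equal footing against human judgment.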