🤖 AI Summary
This work addresses a gap in how video generation models are evaluated: existing benchmarks focus predominantly on perceptual quality or general safety and do not systematically measure educational efficacy. The authors propose EduVideoBench, the first benchmark for evaluating educational video generation grounded in the Knowledge-Skills-Attitude (KSA) framework from educational theory. By integrating KSA into generative model assessment, they establish a multidimensional, structured evaluation system for pedagogical adequacy and educational safety. Through expert review and qualitative analysis, five state-of-the-art models are systematically assessed, revealing significant deficiencies in knowledge accuracy, skill demonstration, and attitudinal appropriateness. The findings indicate that a single misaligned element, such as pacing, legibility, or notation, can render otherwise correct content educationally invalid, underscoring a substantial gap between current models and practical classroom deployment.
📝 Abstract
Video generation models (VGMs) are rapidly entering classrooms, yet existing benchmarks evaluate only perceptual quality, intrinsic faithfulness, generic safety, or video as a reasoning medium, and none assesses whether the outputs are educationally valid. In this work, we present EduVideoBench, the first balanced benchmark in the education domain, grounded in the Knowledge-Skills-Attitude (KSA) framework so that pedagogical adequacy and educational safety are evaluated jointly rather than as ad-hoc quality dimensions. Across five frontier VGMs, our results show substantial room for improvement in knowledge, skills, and attitude before these models are classroom-ready. We complement this with a qualitative analysis of expert comments, finding that educational validity is multi-component: a single misaligned element such as pacing, legibility, or notation can invalidate an otherwise correct video. We hope EduVideoBench will guide the development of VGMs that are pedagogically grounded and safe for the classroom.