🤖 AI Summary
This study addresses a limitation of traditional lecture videos, namely their unidirectional, non-interactive nature, by proposing an in-video bidirectional interaction paradigm that turns static videos into personalized, interactive pedagogical media. Methodologically, it integrates an AI-cloned instructor (HeyGen), high-fidelity text-to-speech synthesis (ElevenLabs), and multimodal video anchoring with context-aware generation (GPT-5) to support eight real-time functionalities, including on-demand questioning at arbitrary timestamps, dynamic answer generation, just-in-time clarification, adaptive quizzing, and intelligent summarization. Its key contribution is a “video-native interaction” architecture that removes the need for external interfaces or manual video editing. A user study (N=12) and expert feedback (N=5) report statistically significant improvements in learning engagement (+42%) and depth of conceptual understanding (p<0.01), supporting the efficacy and feasibility of AI-augmented bidirectional instructional interaction.
📝 Abstract
We introduce Generative Lecture, a concept that makes existing lecture videos interactive through generative AI and AI clone instructors. By leveraging interactive avatars powered by HeyGen, ElevenLabs, and GPT-5, we embed an AI instructor into the video and augment the video content in response to students' questions. This allows students to personalize the lecture material, directly ask questions in the video, and receive tailored explanations generated and delivered by the AI-cloned instructor. From a design elicitation study (N=8), we identified four goals that guided the development of eight system features: 1) on-demand clarification, 2) enhanced visuals, 3) interactive example, 4) personalized explanation, 5) adaptive quiz, 6) study summary, 7) automatic highlight, and 8) adaptive break. We then conducted a user study (N=12) to evaluate the usability and effectiveness of the system and collected expert feedback (N=5). The results suggest that our system enables effective two-way communication and supports personalized learning.