🤖 AI Summary
In educational e-book platforms, reading content data (e.g., lecture slides) is often overlooked and poorly integrated with behavioral data. To address this, we propose LECTOR, the first model to leverage semantically structured summaries of instructional content for educational prediction. Guided by a course knowledge graph, LECTOR employs a multi-granularity summarization framework that combines topic modeling and sequence labeling, enabling interpretable joint modeling of content and behavioral data and filling a critical gap in content-aware educational data mining. On a dataset of 2,255 lecture slides, LECTOR improves F1-score over baseline models by an average of 5% in automated evaluation and 21% in human evaluation. When its extracted reading preferences are added as features, LECTOR tends to improve the prediction of low-performing students, and it shows potential for personalized learning analytics and intervention design.
📝 Abstract
Educational e-book platforms provide valuable information to teachers and researchers through two main sources: reading activity data and reading content data. While reading activity data is commonly used to analyze learning strategies and predict low-performing students, reading content data is often overlooked in these analyses. To address this gap, this study proposes LECTOR (Lecture slides and Topic Relationships), a model that summarizes information from reading content in a format that can be easily integrated with reading activity data. Our first experiment compared LECTOR to representative Natural Language Processing (NLP) models in extracting key information from 2,255 lecture slides, showing an average improvement of 5% in F1-score. These results were further validated through a human evaluation involving 28 students, which showed an average improvement of 21% in F1-score over a model predominantly used in current educational tools. Our second experiment compared reading preferences extracted by LECTOR with traditional reading activity data in predicting low-performing students, using 600,712 logs from 218 students. The results showed that integrating LECTOR tended to improve predictive performance. Finally, we present examples showing how the reading preferences extracted by LECTOR could be applied to designing personalized interventions for students.
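For readers unfamiliar with the metric, the F1-score reported in both evaluations is the harmonic mean of precision and recall. The abstract does not say how matches between extracted and reference key information are counted, so the sketch below assumes a simple exact-match, set-based comparison of keywords for a single slide. The function name `f1_score` and the example keywords are hypothetical and are not taken from the study.

```python
def f1_score(predicted, reference):
    """Set-based F1 between predicted and reference keywords.

    Assumption: exact string matching on keywords; the study's actual
    matching criterion is not specified in the abstract.
    """
    predicted, reference = set(predicted), set(reference)
    if not predicted or not reference:
        return 0.0
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of extracted keywords that are correct
    recall = true_positives / len(reference)     # share of reference keywords that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical keywords for one slide (illustrative only, not the study's data).
reference = ["gradient", "loss", "learning rate", "epoch"]
predicted = ["gradient", "loss", "optimizer"]

score = f1_score(predicted, reference)
print(f"F1 = {score:.3f}")
```

Here two of the three extracted keywords are correct (precision 2/3) and two of the four reference keywords are recovered (recall 1/2), so the harmonic mean penalizes the weaker of the two values rather than averaging them evenly.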