Early Prediction of Student Performance Using Bayesian Updating with Informative Priors Across Cohorts

📅 2026-04-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the instability of single-cohort student models when they are applied to a new cohort, where they often fail to identify at-risk students reliably early in the semester. The authors evaluate a Bayesian updating framework that uses the posterior distribution from the previous cohort as an informative prior. Using six weekly self-regulated learning (SRL) indicators derived from digital traces, they fit Bayesian linear, logistic, and ordinal regression models to generate weekly predictions. In the target cohort, informative priors improved early classification relative to uninformative default priors: ordinal models reduced misclassification by 42% in week 2, and logistic models reduced misclassification by 22% and false negatives by 38% in week 3. Ordinal models reached an accuracy of .77 by week 4, while linear models showed little benefit from prior information.

📝 Abstract

Early identification of at-risk students in higher education depends on predictive models that maintain accuracy across successive cohorts, a requirement that single-cohort modeling approaches fail to meet. This study evaluates Bayesian updating with informative priors from a previous cohort to improve cross-cohort prediction robustness using digital trace data. We fit weekly Bayesian linear, logistic, and ordinal regression models with either uninformative default priors or informative priors derived from posterior distributions of a preceding cohort. Models were applied to six weekly self-regulated learning (SRL)-aligned engagement indicators from two consecutive cohorts of students in a blended first-year mathematics course (N1 = 307; N2 = 323). Outcomes were exam points, final grades, and a binary at-risk indicator. The models were evaluated weekly based on accuracy, sensitivity, and RMSE. In the source cohort, performance was already substantial by week 6. In the target cohort, informative priors improved early classification: logistic models with priors reduced misclassification by 22% and false negatives by 38% in week 3 relative to the uninformative default. Ordinal models with priors similarly showed the strongest improvements in early weeks, reducing misclassification by 42% in week 2 and reaching an accuracy of .77 by week 4. Linear models showed little benefit from prior information. These findings demonstrate that Bayesian updating is a viable method for improving early classification performance across cohorts, with gains concentrated in the early weeks of the semester when current-cohort data are scarce.
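
The core idea of carrying a posterior forward as the next cohort's prior can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single standardized predictor, no intercept, a known noise variance, and a conjugate Gaussian prior on the slope, whereas the study fits Bayesian linear, logistic, and ordinal models on six indicators. The names `NormalBelief` and `update_slope` and all data values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NormalBelief:
    """Gaussian belief about a single regression slope."""
    mean: float
    var: float


def update_slope(prior, xs, ys, noise_var=1.0):
    """Conjugate update for y = slope * x + noise, with known noise variance."""
    precision = 1.0 / prior.var + sum(x * x for x in xs) / noise_var
    weighted = prior.mean / prior.var + sum(x * y for x, y in zip(xs, ys)) / noise_var
    return NormalBelief(mean=weighted / precision, var=1.0 / precision)


# Hypothetical toy data: a standardized engagement indicator (x) and
# centered exam points (y). These numbers are invented for illustration.
cohort1_x = [-1.2, -0.5, 0.0, 0.4, 0.9, 1.5]
cohort1_y = [-2.1, -1.2, 0.3, 0.6, 2.0, 2.8]
cohort2_early_x = [-0.8, 0.7]  # few observations early in the semester
cohort2_early_y = [-1.9, 1.0]

# Stand-in for an uninformative default prior (an assumption, not the paper's choice).
weak_prior = NormalBelief(mean=0.0, var=100.0)

# Step 1: fit the source cohort starting from the weak prior.
cohort1_posterior = update_slope(weak_prior, cohort1_x, cohort1_y)

# Step 2: fit the target cohort's early data twice, once with the weak prior
# and once with the source-cohort posterior acting as an informative prior.
default_fit = update_slope(weak_prior, cohort2_early_x, cohort2_early_y)
informed_fit = update_slope(cohort1_posterior, cohort2_early_x, cohort2_early_y)

print("default prior :", default_fit)
print("informed prior:", informed_fit)
```

In this toy setting the informed fit has a smaller posterior variance than the default fit, because it effectively pools the source cohort's evidence with the scarce early data. That is consistent with the abstract's observation that gains are concentrated in the early weeks, although the sketch says nothing about classification accuracy.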

Problem

Research questions and friction points this paper is trying to address.

early prediction

student performance

cross-cohort generalization

at-risk students

predictive modeling

Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian updating

informative priors

cross-cohort prediction