🤖 AI Summary
This study addresses student academic performance prediction with a lightweight, interpretable modeling framework based on a multilayer perceptron classifier (MLPC). It combines heterogeneous features (behavioral, academic, and demographic attributes), applies recursive feature elimination (RFE) for feature selection, and evaluates model stability with 10-fold cross-validation. It also applies eXplainable AI (XAI) techniques, specifically SHAP and LIME, to examine MLPC’s interpretability and to validate the selected features. The MLPC reaches a maximum test accuracy of 86.46% (10-fold CV mean: 79.58%), outperforming the other machine learning models evaluated. The authors take these results as evidence that lightweight neural networks can be data-efficient in small-sample educational settings, and they present the framework as a way to balance predictive performance with model transparency.
📝 Abstract
This research investigates the use of machine learning methods to forecast students’ academic performance in a school setting. Student data with behavioral, academic, and demographic details were used to train standard classical machine learning models, including a multi-layer perceptron classifier (MLPC). MLPC obtained a maximum test-set accuracy of 86.46% across all implementations, while its train-set accuracy was 99.45%. Under 10-fold cross-validation, MLPC obtained an average test-set accuracy of 79.58% and an average train-set accuracy of 99.65%. MLPC’s better performance over the other machine learning models strongly suggests the potential of neural networks as data-efficient models. The feature selection approach played a crucial role in improving performance, and multiple evaluation approaches were used to allow comparison with existing literature. Explainable machine learning methods were used to demystify the black-box models and to validate the feature selection approach.
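The evaluation protocol described above (feature selection, an MLP classifier, and 10-fold cross-validation with both train and test accuracy) could be sketched roughly as follows with scikit-learn. This is an illustration under stated assumptions, not the authors' implementation: the data is synthetic, the hyperparameters and the number of selected features are hypothetical, and logistic regression is used as the ranking estimator for RFE because `MLPClassifier` exposes no coefficients or feature importances for RFE to rank on.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the student data (assumption: the real dataset,
# its size, and its feature count are not reproduced here).
X, y = make_classification(
    n_samples=480, n_features=16, n_informative=8, random_state=0
)

# Scaling and feature selection live inside the pipeline so that they are
# re-fitted on the training portion of every fold.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    # Assumption: RFE ranks features with logistic regression coefficients.
    ("select", RFE(LogisticRegression(max_iter=1000), n_features_to_select=8)),
    # Assumption: a single small hidden layer; the paper's architecture may differ.
    ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)),
])

# 10-fold stratified cross-validation, recording train and test accuracy per fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_validate(
    pipeline, X, y, cv=cv, scoring="accuracy", return_train_score=True
)

mean_train = scores["train_score"].mean()
mean_test = scores["test_score"].mean()
print(f"mean train accuracy: {mean_train:.4f}")
print(f"mean test accuracy:  {mean_test:.4f}")
print(f"train-test gap:      {mean_train - mean_test:.4f}")
```

Keeping the selector inside the pipeline prevents the held-out fold from influencing which features are kept. The printed gap is the same quantity as the difference between the abstract's cross-validated train and test averages (99.65% versus 79.58%, about 20 percentage points).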