🤖 AI Summary
This study addresses the lack of a systematic benchmark for Burmese handwritten digit recognition, which has hindered related AI research. The authors establish the first cross-paradigm benchmark on the myMNIST dataset, evaluating eleven models: CNNs, MLPs, LSTMs, GRUs, Transformers, FastKAN, EfficientKAN, JEM, and PETNN variants with different activation functions. Experimental results show that the CNN achieves the highest performance (accuracy 0.9970, F1-score 0.9959), closely followed by PETNN with GELU activation (accuracy 0.9966); both clearly outperform the KAN-based models and the Transformer, while JEM is also competitive. This work fills a gap in Burmese script recognition benchmarks, highlights the potential of physics-inspired PETNN models for regional script recognition, and quantifies the performance gap between energy-inspired and true energy-based models.
📝 Abstract
We present the first systematic benchmark on myMNIST (formerly BHDD), a publicly available Burmese handwritten digit dataset important for Myanmar NLP/AI research. We evaluate eleven architectures spanning classical deep learning models (Multi-Layer Perceptron, Convolutional Neural Network, Long Short-Term Memory, Gated Recurrent Unit, Transformer), recent alternatives (FastKAN, EfficientKAN), an energy-based model (JEM), and physics-inspired PETNN variants (Sigmoid, GELU, SiLU). Using Precision, Recall, F1-Score, and Accuracy as evaluation metrics, our results show that the CNN remains a strong baseline, achieving the best overall scores (F1 = 0.9959, Accuracy = 0.9970). The PETNN (GELU) model closely follows (F1 = 0.9955, Accuracy = 0.9966), outperforming LSTM, GRU, Transformer, and KAN variants. JEM, representing energy-based modeling, performs competitively (F1 = 0.9944, Accuracy = 0.9958). KAN-based models (FastKAN, EfficientKAN) trail the top performers but provide a meaningful alternative baseline (Accuracy ~0.992). These findings (i) establish reproducible baselines for myMNIST across diverse modeling paradigms, (ii) highlight PETNN's strong performance relative to classical and Transformer-based models, and (iii) quantify the gap between energy-inspired PETNNs and a true energy-based model (JEM). We release this benchmark to facilitate future research on Myanmar digit recognition and to encourage broader evaluation of emerging architectures on regional scripts.
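For readers unfamiliar with the reported metrics: F1 on a ten-class digit task is typically a macro-average of per-class F1 scores (the abstract does not state the averaging mode, so macro-averaging is an assumption here). A minimal pure-Python sketch of how such scores are computed from predictions:

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: per-class F1 from TP/FP/FN counts, then the unweighted mean.

    Illustrative only; in practice libraries such as scikit-learn
    (f1_score(..., average="macro")) compute the same quantity.
    """
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if (tp + fp) else 0.0
        rec = tp / (tp + fn) if (tp + fn) else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if (prec + rec) else 0.0)
    return sum(f1s) / len(f1s)


def accuracy(y_true, y_pred):
    """Fraction of exactly matching predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Accuracy treats every sample equally, whereas macro-F1 weights every class equally, which is why papers commonly report both.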