🤖 AI Summary
In low-resource Burmese news classification, the common practice of freezing the pretrained encoder and fine-tuning only an MLP classification head limits both representational capacity and computational efficiency.
Method: This work introduces Kolmogorov–Arnold Networks (KANs) to this task for the first time, replacing fixed nonlinearities with learnable one-dimensional activation functions to enhance classifier expressivity. We systematically evaluate three KAN variants—FourierKAN, EfficientKAN, and FasterKAN—combined with TF-IDF, fastText, and mBERT embeddings.
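The core idea, replacing a fixed nonlinearity with learnable one-dimensional activation functions on each edge, can be sketched as a minimal FourierKAN-style layer. This is an illustrative NumPy sketch under assumed dimensions (e.g. 300-d fastText embeddings, 7 news classes), not the paper's implementation; the class and parameter names are hypothetical.

```python
import numpy as np

class FourierKANLayer:
    """Sketch of a FourierKAN-style layer: each input-output edge applies a
    learnable 1-D function phi(x) = sum_k a_k*cos(kx) + b_k*sin(kx), and each
    output unit sums these functions over all inputs."""

    def __init__(self, in_dim, out_dim, grid_size=5, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(in_dim * grid_size)
        # Learnable Fourier coefficients: (2 for cos/sin, out_dim, in_dim, grid_size)
        self.coeffs = rng.normal(0.0, scale, (2, out_dim, in_dim, grid_size))
        self.k = np.arange(1, grid_size + 1)  # frequencies 1..K

    def forward(self, x):
        # x: (batch, in_dim)
        arg = x[:, :, None] * self.k[None, None, :]   # (batch, in_dim, K)
        cos, sin = np.cos(arg), np.sin(arg)
        # Sum over input dims and frequencies for each output unit
        y = np.einsum('bik,oik->bo', cos, self.coeffs[0]) \
          + np.einsum('bik,oik->bo', sin, self.coeffs[1])
        return y  # (batch, out_dim)

layer = FourierKANLayer(in_dim=300, out_dim=7)  # e.g. fastText dim -> 7 classes
logits = layer.forward(np.random.default_rng(1).normal(size=(4, 300)))
print(logits.shape)  # (4, 7)
```

Unlike an MLP, whose expressivity comes from stacking fixed activations, here the activation functions themselves are the trained parameters (the Fourier coefficients), which is what gives KAN heads extra flexibility at a small parameter cost.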
Results: EfficientKAN with fastText achieves the highest F1 score (0.928); FasterKAN offers the best trade-off between accuracy and inference speed; Transformer-based KANs match or slightly surpass traditional MLPs in performance. This study establishes a novel, lightweight, expressive, and efficient classification head paradigm for low-resource NLP tasks.
📝 Abstract
In low-resource languages like Burmese, classification tasks often fine-tune only the final classification layer, keeping pre-trained encoder weights frozen. While Multi-Layer Perceptrons (MLPs) are commonly used, their fixed non-linearity can limit expressiveness and increase computational cost. This work explores Kolmogorov-Arnold Networks (KANs) as alternative classification heads, evaluating Fourier-based FourierKAN, spline-based EfficientKAN, and grid-based FasterKAN across diverse embeddings including TF-IDF, fastText, and multilingual transformers (mBERT, Distil-mBERT). Experimental results show that KAN-based heads are competitive with or superior to MLPs. EfficientKAN with fastText achieved the highest F1-score (0.928), while FasterKAN offered the best trade-off between speed and accuracy. On transformer embeddings, EfficientKAN matched or slightly outperformed MLPs with mBERT (0.917 F1). These findings highlight KANs as expressive, efficient alternatives to MLPs for low-resource language classification.
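The frozen-encoder setup the abstract describes can be sketched as follows: features are computed once by a fixed encoder, and only a small head is trained on top. This is an assumed pipeline for illustration (here TF-IDF features with scikit-learn's MLP head as the baseline; the toy documents and labels are invented), not the authors' exact code. A KAN head would slot into the same position.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier

# Toy corpus standing in for Burmese news texts (hypothetical data)
docs = ["sports news ...", "politics news ...", "sports update ...", "election report ..."]
labels = [0, 1, 0, 1]

# "Frozen encoder": TF-IDF features are fitted once and never updated jointly
vec = TfidfVectorizer().fit(docs)
X = vec.transform(docs)

# Trainable head: an MLP baseline; the paper swaps this for a KAN variant
head = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
head.fit(X, labels)
print(head.predict(vec.transform(["sports result ..."])))
```

Because the encoder is never updated, the comparison between MLP and KAN heads isolates the effect of the head's expressivity, which is exactly the variable the paper's F1 and speed results measure.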