🤖 AI Summary
This paper addresses three sentence classification tasks for Burmese: imbalanced binary hate speech detection, balanced multiclass news classification, and imbalanced multiclass ethnic language identification. It proposes KAConvText, which applies Kolmogorov-Arnold convolution (KA-Conv) to sentence classification for the first time. The model is evaluated with random and fastText embeddings (CBOW and Skip-gram, static and fine-tuned) and with two classifier heads: an MLP head and a Kolmogorov-Arnold Network (KAN) head, the latter supporting better interpretability. Baselines are standard CNNs and CNNs augmented with a KAN (CNN-KAN). With the MLP head and fine-tuned fastText embeddings, KAConvText achieves the best reported results on the three tasks: 91.23% accuracy (F1 = 0.9109), 92.66% accuracy (F1 = 0.9267), and 99.82% accuracy (F1 = 0.9982), respectively. The core contribution is bringing Kolmogorov-Arnold convolution to text classification, with an optional KAN head for cases where interpretability matters.
📝 Abstract
This paper presents the first application of Kolmogorov-Arnold Convolution for Text (KAConvText) in sentence classification, addressing three tasks: imbalanced binary hate speech detection, balanced multiclass news classification, and imbalanced multiclass ethnic language identification. We investigate various embedding configurations, comparing randomly initialized embeddings with fastText embeddings in both static and fine-tuned settings, with embedding dimensions of 100 and 300 and with CBOW and Skip-gram models. Baselines include standard CNNs and CNNs augmented with a Kolmogorov-Arnold Network (CNN-KAN). In addition, we investigate KAConvText with two classification heads, MLP and KAN; the KAN head offers enhanced interpretability. Results show that KAConvText-MLP with fine-tuned fastText embeddings achieves the best performance: 91.23% accuracy (F1-score = 0.9109) for hate speech detection, 92.66% accuracy (F1-score = 0.9267) for news classification, and 99.82% accuracy (F1-score = 0.9982) for language identification.
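The abstract does not spell out how a Kolmogorov-Arnold convolution differs from an ordinary one, so the following is only an illustrative sketch, not the authors' implementation. It assumes the formulation common in the KAN literature: each kernel entry is a learnable univariate function rather than a scalar weight, and the outputs of those functions are summed over the window. The class name `KAConv1d`, the Gaussian basis (a stand-in for the B-splines usually used in KANs), the grid range, and the random initialization are all assumptions made for illustration.

```python
import math
import random


def silu(x):
    """SiLU activation, used here as the fixed 'base' part of each edge function."""
    return x / (1.0 + math.exp(-x))


class KAConv1d:
    """Hypothetical single-output-channel KA convolution over token embeddings.

    An ordinary convolution computes sum(w[i][d] * x[t+i][d]).
    This sketch computes sum(phi[i][d](x[t+i][d])), where each phi is a small
    learnable univariate function: base_w * silu(x) + sum_j coef_j * basis_j(x).
    """

    def __init__(self, embed_dim, kernel_size, num_basis=5, seed=0):
        rng = random.Random(seed)
        self.embed_dim = embed_dim
        self.kernel_size = kernel_size
        # Assumed fixed grid of Gaussian centers on [-2, 2].
        self.centers = [-2.0 + 4.0 * j / (num_basis - 1) for j in range(num_basis)]
        self.width = 4.0 / (num_basis - 1)
        # One base weight and one coefficient vector per (window position, embedding dim).
        self.base_w = [[rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
                       for _ in range(kernel_size)]
        self.coef = [[[rng.uniform(-0.5, 0.5) for _ in range(num_basis)]
                      for _ in range(embed_dim)]
                     for _ in range(kernel_size)]

    def phi(self, i, d, x):
        """Evaluate the univariate function attached to kernel entry (i, d)."""
        basis = [math.exp(-((x - c) / self.width) ** 2) for c in self.centers]
        return self.base_w[i][d] * silu(x) + sum(
            c * b for c, b in zip(self.coef[i][d], basis))

    def __call__(self, seq):
        """Slide the window over a list of embedding vectors (no padding, stride 1)."""
        out = []
        for t in range(len(seq) - self.kernel_size + 1):
            out.append(sum(self.phi(i, d, seq[t + i][d])
                           for i in range(self.kernel_size)
                           for d in range(self.embed_dim)))
        return out


def max_over_time(feature_map):
    """Max-over-time pooling, as commonly used in CNN text classifiers."""
    return max(feature_map)


# Toy "sentence": 6 tokens with 4-dimensional embeddings (random placeholders,
# standing in for fastText vectors).
rng = random.Random(1)
sentence = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(6)]

conv = KAConv1d(embed_dim=4, kernel_size=3)
feature_map = conv(sentence)          # 6 - 3 + 1 = 4 window positions
pooled = max_over_time(feature_map)   # one scalar feature for this channel
```

In a full model, many such channels with different kernel sizes would be pooled and concatenated, then passed to the MLP or KAN head described above; training the coefficients by gradient descent is omitted here.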