🤖 AI Summary
This work proposes a granular-ball classification method that addresses a limitation of existing approaches: they rely on handcrafted quality measures and heuristic rules, which makes local construction decisions less transparent and leaves boundary-sensitive regions without explicit modeling. The method brings the Minimum Description Length (MDL) principle into granular-ball construction and frames construction as a local model selection problem. For each granular ball, it compares the description lengths of three candidate models (single-ball, two-ball, and core-boundary) and, depending on which is shortest, retains the ball, splits it geometrically, or refines it into core and boundary-sensitive child balls. Predictions are made with a class-level mixture coding rule that compares class-wise coding costs. The resulting classifier is boundary-aware, non-parametric, and interpretable. On 18 benchmark datasets, it obtains the best average accuracy, Macro-F1, and average rank among the compared classical classifiers and granular-ball methods.
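The retain/split/refine decision can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: each ball is coded as an isotropic Gaussian with a BIC-style parameter cost, the two-ball candidate splits along the principal axis, and the core-boundary candidate marks a sample as boundary when its nearest other-class sample is closer than the ball center. The function names (`ball_code_length`, `select_local_model`, and so on) are hypothetical.

```python
import math
import numpy as np

def ball_code_length(points: np.ndarray) -> float:
    """Two-part code length in bits for one ball (assumed isotropic Gaussian)."""
    n, d = points.shape
    center = points.mean(axis=0)
    # Per-dimension variance, floored so a degenerate ball does not hit log(0).
    var = max(float(((points - center) ** 2).sum() / (n * d)), 1e-12)
    data_bits = 0.5 * n * d * math.log2(2 * math.pi * math.e * var)
    param_bits = 0.5 * (d + 1) * math.log2(n)  # center (d values) + spread (1 value)
    return data_bits + param_bits

def partition_code_length(parts, min_size: int = 5) -> float:
    """Code length of a multi-ball explanation; invalid if any child is too small."""
    if any(len(p) < min_size for p in parts):
        return math.inf
    n_total = sum(len(p) for p in parts)
    bits = 0.0
    for p in parts:
        bits += ball_code_length(p)
        bits += -len(p) * math.log2(len(p) / n_total)  # bits naming each sample's ball
    return bits

def split_principal_axis(points: np.ndarray):
    """Assumed geometric split: cut at the mean along the direction of largest variance."""
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    side = centered @ vt[0] > 0
    return points[side], points[~side]

def split_core_boundary(points: np.ndarray, negatives: np.ndarray):
    """Assumed rule: a sample is 'boundary' if a negative is nearer than the center."""
    if len(negatives) == 0:
        return points, points[:0]  # no negative evidence, so no boundary child
    d_center = np.linalg.norm(points - points.mean(axis=0), axis=1)
    diffs = points[:, None, :] - negatives[None, :, :]
    d_neg = np.linalg.norm(diffs, axis=2).min(axis=1)
    boundary = d_neg < d_center
    return points[~boundary], points[boundary]

ACTIONS = {"single": "retain", "two_ball": "split", "core_boundary": "refine"}

def select_local_model(points, negatives, min_size: int = 5):
    """Pick the candidate with the shortest description length and its action."""
    lengths = {
        "single": ball_code_length(points),
        "two_ball": partition_code_length(split_principal_axis(points), min_size),
        "core_boundary": partition_code_length(
            split_core_boundary(points, negatives), min_size
        ),
    }
    best = min(lengths, key=lengths.get)
    return best, ACTIONS[best], lengths
```

Under these assumptions, a split is accepted only when the bits saved by tighter child balls exceed the extra bits spent on the second ball's parameters and on naming each sample's ball, which is what keeps a single compact cluster from being split.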
📝 Abstract
Existing granular-ball classification methods are often driven by handcrafted quality measures, neighborhood rules, or heuristic splitting and stopping criteria, which may reduce the transparency of local construction decisions and hinder explicit modeling of boundary-sensitive regions. To address this issue, this paper proposes a Minimum Description Length based Granular-Ball Classifier (MDL-GBC), a boundary-aware, non-parametric, and interpretable granular-ball classifier. MDL-GBC formulates class-conditional granular-ball construction as a local model selection problem under the Minimum Description Length principle. For each class, samples from the target class provide positive class evidence, while samples from the remaining classes provide negative boundary evidence. For each current granular ball, three candidate explanations are compared under a unified description-length criterion: a single-ball model, a two-ball model, and a core-boundary model. The selected model determines whether the ball is retained, geometrically split, or refined into core and boundary-sensitive child balls, so that local construction decisions stay consistent with the MDL-based classification mechanism. During prediction, a class-level mixture coding rule aggregates the stable granular balls of each class and assigns the test sample to the class with the lowest coding cost. Experiments on 18 benchmark datasets show that MDL-GBC achieves competitive classification performance against classical classifiers and representative granular-ball-based methods, obtaining the best average Accuracy, Macro-F1, and average rank. These results indicate that MDL-GBC provides an effective and interpretable alternative to conventional heuristic granular-ball classification strategies.
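The prediction step described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's formulation: each stable ball is treated as an isotropic Gaussian component, a class's coding cost for a sample is the negative base-2 log of the class mixture density, and class priors are ignored. The names `class_coding_cost` and `predict`, and the `(weight, center, variance)` ball representation, are hypothetical.

```python
import math
import numpy as np

def ball_log2_density(x: np.ndarray, center: np.ndarray, var: float) -> float:
    """log2 density of x under an isotropic Gaussian ball (assumed ball model)."""
    d = x.shape[0]
    sq = float(((x - center) ** 2).sum())
    return -0.5 * d * math.log2(2 * math.pi * var) - sq / (2 * var * math.log(2))

def class_coding_cost(x: np.ndarray, balls) -> float:
    """Bits needed to code x with one class's mixture of balls.

    balls: list of (weight, center, variance); weights sum to 1 within the class.
    """
    terms = [math.log2(w) + ball_log2_density(x, c, v) for w, c, v in balls]
    m = max(terms)
    # Base-2 log-sum-exp keeps the mixture sum numerically stable.
    return -(m + math.log2(sum(2.0 ** (t - m) for t in terms)))

def predict(x: np.ndarray, class_balls: dict):
    """Assign x to the class whose balls code it most cheaply."""
    costs = {label: class_coding_cost(x, balls) for label, balls in class_balls.items()}
    return min(costs, key=costs.get), costs

# Hypothetical example: class "A" has two balls, class "B" has one.
class_balls = {
    "A": [(0.5, np.array([0.0, 0.0]), 1.0), (0.5, np.array([4.0, 0.0]), 1.0)],
    "B": [(1.0, np.array([2.0, 5.0]), 1.0)],
}
label, costs = predict(np.array([3.8, 0.2]), class_balls)
```

Aggregating balls at the class level means a sample lying between two balls of the same class is still coded cheaply by that class, rather than being judged against each ball in isolation.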