MDL-GBG: A Non-parametric and Interpretable Granular-Ball Generation Method for Clustering

📅 2026-05-09
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work proposes a granular-ball generation method grounded in the Minimum Description Length (MDL) principle. It targets a weakness of existing approaches, which rely on handcrafted quality metrics and heuristic splitting or stopping rules and therefore make local generation decisions hard to explain. The method reframes granular-ball construction as a local model selection problem: for each ball, it compares the description lengths of three candidate models (a single ball, two balls, and a core ball plus residual points) and accordingly retains the ball, splits it, or peels off its boundary points. A residual reassignment mechanism then globally re-evaluates the peeled-off boundary samples once stable granular-balls have formed. The approach requires no preset parameters and places all three decisions in one coding-theoretic framework, which makes each decision interpretable. In experiments on 20 UCI datasets, MDL-GBG+AC achieves the best overall average ranks among the compared methods in Adjusted Rand Index (ARI), Accuracy (ACC), and Normalized Mutual Information (NMI).
📝 Abstract
Existing granular-ball generation methods are still mainly driven by handcrafted quality measures and heuristic splitting or stopping criteria, which weakens the transparency of local generation decisions in clustering. To address this issue, this paper proposes Minimum Description Length based Granular-Ball Generation (MDL-GBG), a non-parametric and interpretable granular-ball generation method for clustering. MDL-GBG reformulates granular-ball generation as a local model selection problem under the Minimum Description Length principle. For each granular ball, three candidate explanations are compared, namely a single-ball model, a two-ball model, and a core-ball-plus-residual model, and the model with the shortest description length is selected. In this way, ball retention, splitting, and residual peeling are unified within a common coding-theoretic framework. A residual reassignment mechanism is further introduced to globally re-evaluate peeled-off boundary samples after stable granular-balls are formed. Experiments on 20 UCI datasets show that the stable granular-balls generated by MDL-GBG provide a highly competitive upstream representation for clustering, with MDL-GBG+AC achieving the best overall average ranks in ARI, ACC, and NMI among the compared methods. These results demonstrate that MDL-GBG offers an effective and interpretable alternative to conventional heuristic granular-ball generation strategies.
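The local model selection step described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's actual code-length formulas, so everything below is an assumption made for illustration: an isotropic Gaussian code for the points in a ball, a BIC-style parameter cost, an entropy-based cost for membership labels, a farthest-pair rule for proposing a split, and a uniform code over a caller-supplied `domain_volume` for residual points. The function names (`ball_cost`, `split_cost`, `peel_cost`, `choose_model`) are hypothetical and are not taken from MDL-GBG.

```python
import math

def centroid(points):
    n, d = len(points), len(points[0])
    return [sum(p[j] for p in points) / n for j in range(d)]

def ball_cost(points):
    """Code length (nats) of points under one isotropic Gaussian ball (assumed code)."""
    if len(points) < 3:
        return math.inf  # sketch guard: too few points to describe as a ball
    n, d = len(points), len(points[0])
    c = centroid(points)
    var = max(sum(math.dist(p, c) ** 2 for p in points) / (n * d), 1e-12)
    data = 0.5 * n * d * math.log(2 * math.pi * math.e * var)  # -log-likelihood at the MLE
    params = 0.5 * (d + 1) * math.log(n)  # BIC-style cost of centre and variance
    return data + params

def label_cost(n, k):
    """Nats needed to state which k of n points fall in the second group."""
    if k == 0 or k == n:
        return 0.0
    q = k / n
    return -n * (q * math.log(q) + (1 - q) * math.log(1 - q))

def split_cost(points):
    """Two-ball model: seeds are a farthest pair, points go to the nearer seed."""
    c = centroid(points)
    a = max(points, key=lambda p: math.dist(p, c))
    b = max(points, key=lambda p: math.dist(p, a))
    left = [p for p in points if math.dist(p, a) <= math.dist(p, b)]
    right = [p for p in points if math.dist(p, a) > math.dist(p, b)]
    return ball_cost(left) + ball_cost(right) + label_cost(len(points), len(right))

def peel_cost(points, domain_volume):
    """Core-plus-residual model: peel the k farthest points, best k up to n // 2."""
    c = centroid(points)
    ordered = sorted(points, key=lambda p: math.dist(p, c))
    n = len(points)
    best = math.inf
    for k in range(1, n // 2 + 1):
        core = ordered[: n - k]
        # Each residual point is coded uniformly over the data domain (assumption).
        cost = ball_cost(core) + k * math.log(domain_volume) + label_cost(n, k)
        best = min(best, cost)
    return best

def choose_model(points, domain_volume):
    """Return the decision with the shortest description length, plus all costs."""
    costs = {
        "keep": ball_cost(points),
        "split": split_cost(points),
        "peel": peel_cost(points, domain_volume),
    }
    return min(costs, key=costs.get), costs
```

Calling `choose_model(points, domain_volume)` on the points of one ball returns one of `"keep"`, `"split"`, or `"peel"` together with the three code lengths, so the decision can be read directly from the numbers that produced it. Here `domain_volume` stands for the volume of the bounding box of the whole dataset, which is derived from the data rather than tuned.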
Problem

Research questions and friction points this paper is trying to address.

granular-ball generation
clustering
interpretability
non-parametric
Minimum Description Length
Innovation

Methods, ideas, or system contributions that make the work stand out.

Minimum Description Length
Granular-Ball Generation
Non-parametric Clustering
Interpretable AI
Model Selection
🔎 Similar Papers
2024-09-01 | arXiv.org | Citations: 4
Zeqiang Xian
Department of Mathematics and Computer Science, Gannan Normal University, Ganzhou 341000, Jiangxi, China; Key Laboratory of Data Science and Artificial Intelligence of Jiangxi Education Institutes, Gannan Normal University, Ganzhou 341000, Jiangxi, China
Caihui Liu
Department of Mathematics and Computer Science, Gannan Normal University, Ganzhou 341000, Jiangxi, China; Key Laboratory of Data Science and Artificial Intelligence of Jiangxi Education Institutes, Gannan Normal University, Ganzhou 341000, Jiangxi, China
Yong Zhang
Department of Mathematics and Computer Science, Gannan Normal University, Ganzhou 341000, Jiangxi, China; Key Laboratory of Data Science and Artificial Intelligence of Jiangxi Education Institutes, Gannan Normal University, Ganzhou 341000, Jiangxi, China
Wenjing Qiu
Department of Mathematics and Computer Science, Gannan Normal University, Ganzhou 341000, Jiangxi, China; Key Laboratory of Data Science and Artificial Intelligence of Jiangxi Education Institutes, Gannan Normal University, Ganzhou 341000, Jiangxi, China
Duoqian Miao
Department of Computer Science and Technology, Tongji University, 201804 Shanghai, China
Witold Pedrycz
Unknown affiliation