Rethinking Divisive Hierarchical Clustering from a Distributional Perspective

📅 2026-01-27
🤖 AI Summary
Existing objective-based divisive hierarchical clustering methods produce dendrograms that split clusters unnecessarily, fail to group similar clusters into the same subset, and do not correspond to ground-truth labels. This work traces these problems to the set-oriented criterion used to assess each bisection and replaces that criterion with a distributional kernel. The resulting clusters pursue a distribution-oriented objective, maximizing the total similarity of all clusters (TSC), and the theoretical analysis shows that the dendrogram guarantees a lower bound on TSC. Experiments on artificial and Spatial Transcriptomics datasets show the method is effective; on a Spatial Transcriptomics dataset it produces a dendrogram consistent with the biological regions, whereas competing methods fail to do so.

📝 Abstract
We find that current objective-based Divisive Hierarchical Clustering (DHC) methods produce dendrograms that lack three desired properties: no unwarranted splitting, grouping of similar clusters into the same subset, and ground-truth correspondence. This shortcoming has its root cause in the use of a set-oriented bisecting assessment criterion. We show that it can be addressed by using a distributional kernel instead of the set-oriented criterion, and that the resultant clusters achieve a new distribution-oriented objective: maximizing the total similarity of all clusters (TSC). Our theoretical analysis shows that the resultant dendrogram guarantees a lower bound of TSC. The empirical evaluation shows the effectiveness of our proposed method on artificial and Spatial Transcriptomics (bioinformatics) datasets. Our proposed method successfully creates a dendrogram that is consistent with the biological regions in a Spatial Transcriptomics dataset, whereas other contenders fail.
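To make the idea of a distributional kernel concrete, the sketch below measures the similarity between two clusters treated as samples from distributions. It is an illustration only: the abstract does not say which distributional kernel the paper uses, so the sketch assumes the common kernel-mean-embedding form (the average point-to-point kernel value over all cross pairs) with a Gaussian point kernel. The function names, the `bandwidth` parameter, and the toy data are all hypothetical, and the paper's TSC objective is not reproduced here.

```python
import math
from typing import Sequence

Point = Sequence[float]


def gaussian_kernel(x: Point, y: Point, bandwidth: float = 1.0) -> float:
    """Point-to-point Gaussian kernel (an assumed choice, not taken from the paper)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def distributional_kernel(
    p: Sequence[Point], q: Sequence[Point], bandwidth: float = 1.0
) -> float:
    """Similarity between two samples viewed as distributions.

    Computed as the mean kernel value over all cross pairs, i.e. the inner
    product of the two empirical kernel mean embeddings.
    """
    if not p or not q:
        raise ValueError("both samples must be non-empty")
    total = sum(gaussian_kernel(x, y, bandwidth) for x in p for y in q)
    return total / (len(p) * len(q))


# Toy clusters (hypothetical data): `near` overlaps `left`, `far` does not.
left = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
near = [(0.2, 0.1), (0.1, 0.2)]
far = [(5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]

print("left vs near:", distributional_kernel(left, near))
print("left vs far: ", distributional_kernel(left, far))
```

Under these assumptions, clusters drawn from nearby regions score close to 1 and well-separated clusters score close to 0, which is the kind of signal a divisive method could use to keep similar clusters in the same subset.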
Problem

Research questions and friction points this paper is trying to address.

Divisive Hierarchical Clustering
dendrogram
clustering objective
distributional perspective
Spatial Transcriptomics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Divisive Hierarchical Clustering
distributional kernel
Total Similarity of Clusters
Spatial Transcriptomics
dendrogram consistency