🤖 AI Summary
Real-world data often exhibit hierarchical structure, yet existing deep hierarchical clustering methods suffer from poor scalability and limited performance. This paper proposes a lightweight, fine-tuning-free post-hoc framework that directly constructs high-quality hierarchical trees from logits produced by arbitrary pre-trained models, including unsupervised clusterers and ImageNet classifiers. Methodologically, it introduces two key components: (i) a logit-based spectral clustering variant, and (ii) a gradient-free hierarchical agglomerative algorithm that reconstructs feature-space similarity via logit distillation. Crucially, the authors provide the first theoretical and empirical evidence that logit distillation outperforms complex end-to-end hierarchical modeling. The approach surpasses dedicated deep hierarchical clustering models across multiple benchmarks, reduces computational overhead by 10×, and, uniquely, enables general-purpose, semantically consistent hierarchical discovery in both unsupervised and supervised settings.
📝 Abstract
The structure of many real-world datasets is intrinsically hierarchical, making the modeling of such hierarchies a critical objective in both unsupervised and supervised machine learning. Recently, novel approaches for hierarchical clustering with deep architectures have been proposed. In this work, we take a critical perspective on this line of research and demonstrate that many approaches exhibit major limitations when applied to realistic datasets, partly due to their high computational complexity. In particular, we show that a lightweight procedure implemented on top of pre-trained non-hierarchical clustering models outperforms models designed specifically for hierarchical clustering. Our proposed approach is computationally efficient and applicable to any pre-trained clustering model that outputs logits, without requiring any fine-tuning. To highlight the generality of our findings, we illustrate how our method can also be applied in a supervised setup, recovering meaningful hierarchies from a pre-trained ImageNet classifier.
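As a rough illustration of the general idea described above, and not the paper's exact algorithm, a hierarchy can be built post hoc by running standard agglomerative clustering on the class-logit (or class-probability) vectors emitted by any pre-trained model; the random logits below are a hypothetical stand-in for real model outputs:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical stand-in for logits from a pre-trained clustering
# model or classifier: 100 samples, 10 output classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(100, 10))

# Softmax turns logits into class-probability vectors.
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)

# Average-linkage agglomerative clustering on the probability
# vectors yields a full hierarchy (dendrogram) with no training
# or fine-tuning of the underlying model.
Z = linkage(probs, method="average", metric="cosine")

# The tree can be cut at any level to obtain flat clusters.
labels = fcluster(Z, t=5, criterion="maxclust")
```

This sketch uses off-the-shelf SciPy linkage rather than the paper's spectral or distillation-based components, but it conveys the post-hoc, fine-tuning-free workflow: model logits in, hierarchy out.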