🤖 AI Summary
This work addresses a fundamental question in interpretable clustering: when data exhibit well-separated clusters, can the distribution-agnostic worst-case bounds on clustering cost be surpassed and the underlying cluster structure reliably recovered? To answer it, the authors propose a decision-tree-based clustering method for mixture models. They introduce the notion of an explainability-to-noise ratio, formalizing the intuition that well-clustered data can be explained well by a decision tree, and give an algorithm that constructs a suitable tree from the mixture model in data-independent time. Assuming sub-Gaussian mixture components, they prove upper and lower bounds on the error rate of the resulting tree. They further show how Concept Activation Vectors (CAVs) can extend explainable clustering to the representation spaces of neural networks. Experiments on standard tabular and image benchmarks demonstrate the efficacy of the approach.
📝 Abstract
Decision trees are one of the backbones of explainable machine learning and often serve as interpretable alternatives to black-box models. Traditionally used in the supervised setting, they have recently also attracted a surge of interest for unsupervised learning. While several works with worst-case guarantees on the clustering cost have appeared, these results are distribution-agnostic and do not give insight into when decision trees can actually recover the underlying distribution of the data (up to some small error). In this paper, we therefore introduce the notion of an explainability-to-noise ratio for mixture models, formalizing the intuition that well-clustered data can indeed be explained well using a decision tree. We propose an algorithm that takes as input a mixture model and constructs a suitable tree in data-independent time. Assuming sub-Gaussianity of the mixture components, we prove upper and lower bounds on the error rate of the resulting decision tree. In addition, we demonstrate how concept activation vectors can be used to extend explainable clustering to neural networks. We empirically demonstrate the efficacy of our approach on standard tabular and image datasets.
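To make the "data-independent time" claim concrete: since the tree is built from the mixture model's parameters rather than from the samples, its construction cost depends only on the number of components and dimensions. The sketch below is a minimal, hypothetical illustration of this idea (not the paper's actual algorithm): it greedily separates component means with axis-aligned cuts, placing each cut at the largest gap along the coordinate of widest spread, so each leaf ends up containing exactly one component.

```python
import numpy as np

def build_threshold_tree(centers, indices=None):
    """Recursively separate mixture-component means with axis-aligned
    cuts. Runs on the k x d array of means only, so its cost is
    independent of the number of data points. Illustrative heuristic,
    not the algorithm from the paper."""
    if indices is None:
        indices = list(range(len(centers)))
    if len(indices) == 1:
        return {"leaf": indices[0]}          # one component per leaf
    pts = centers[indices]
    # Coordinate along which the remaining means are most spread out.
    dim = int(np.argmax(pts.max(axis=0) - pts.min(axis=0)))
    vals = np.sort(pts[:, dim])
    # Cut at the midpoint of the largest gap between consecutive means.
    g = int(np.argmax(np.diff(vals)))
    cut = (vals[g] + vals[g + 1]) / 2
    left = [i for i in indices if centers[i, dim] <= cut]
    right = [i for i in indices if centers[i, dim] > cut]
    return {"dim": dim, "cut": cut,
            "left": build_threshold_tree(centers, left),
            "right": build_threshold_tree(centers, right)}

def assign(tree, x):
    """Route a point through the threshold tree to a cluster label."""
    while "leaf" not in tree:
        tree = tree["left"] if x[tree["dim"]] <= tree["cut"] else tree["right"]
    return tree["leaf"]
```

Each internal node is a single human-readable rule ("feature `dim` ≤ `cut`"), which is what makes the resulting clustering interpretable; points are then assigned by descending the tree.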