ExDBSCAN: Explaining DBSCAN with Counterfactual Reasoning -- Additional Material

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited interpretability of DBSCAN clustering, which gives no rationale for why a data point is labeled an inlier (cluster member) or an outlier (noise point) and does not indicate whether that label is robust to small perturbations. To bridge this gap, the authors propose ExDBSCAN, a density-aware, post-hoc counterfactual explanation method for DBSCAN with theoretical validity guarantees. It builds a density-connected weighted graph and applies a physics-inspired repulsion-attraction model that increases diversity among counterfactuals while keeping them close to the instance being explained. In experiments on 30 tabular datasets, ExDBSCAN achieves 100% explanation validity and outperforms four baseline methods.
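To make the notion of a valid counterfactual concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the helper names (`is_inlier`, `is_valid_counterfactual`), the toy data, and the choice to evaluate a query point against a fixed reference set are all assumptions made for illustration. It uses the standard DBSCAN definitions: a point is a core point if at least `min_pts` points (itself included) lie within `eps`, and an inlier if it is a core point or lies within `eps` of one.

```python
import math


def neighbors(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def is_inlier(query, data, eps, min_pts):
    """Hypothetical helper: DBSCAN inlier status of `query` when added to `data`."""
    points = list(data) + [query]
    # Core status is recomputed on the augmented set, since adding the
    # query can push nearby points over the min_pts threshold.
    core = [len(neighbors(points, i, eps)) >= min_pts for i in range(len(points))]
    q = len(points) - 1
    return core[q] or any(core[j] for j in neighbors(points, q, eps))


def is_valid_counterfactual(original, candidate, data, eps, min_pts):
    """A candidate is valid here if its inlier/outlier status differs from the original's."""
    return is_inlier(original, data, eps, min_pts) != is_inlier(candidate, data, eps, min_pts)


# Toy reference data (assumed): one dense square of four points.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
outlier = (5.0, 5.0)

# Moving the outlier next to the dense region flips its status.
valid = is_valid_counterfactual(outlier, (1.5, 1.5), data, eps=1.5, min_pts=3)
# Moving it only slightly leaves it an outlier, so this candidate is invalid.
invalid = is_valid_counterfactual(outlier, (4.0, 4.0), data, eps=1.5, min_pts=3)
```

The brute-force neighbor search is quadratic and only suitable for small examples; a real setup would likely rely on a spatial index.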

📝 Abstract

Clustering is an unsupervised technique for grouping data points by similarity. While explainability methods exist for supervised machine learning, they are not directly applicable to clustering, making it challenging to understand cluster assignments. This interpretability gap is particularly evident in the popular density-based method DBSCAN, which assigns points as inliers (cluster members in dense regions) or outliers (noise points in sparse regions). DBSCAN does not provide insight into why a particular point receives its assignment or whether its assignment is robust to small changes in the data. To address the lack of explainability, we introduce ExDBSCAN, a density-aware, post-hoc explanation method. ExDBSCAN offers actionable counterfactual explanations, with theoretical guarantees for validity. It generates multiple counterfactuals using a density-connected weighted graph, adopting a physics-inspired model that repels counterfactual candidates from one another (diversity), while pulling them toward the instance to explain (proximity). Empirical evaluation on 30 tabular datasets comparing against four baselines shows that ExDBSCAN outperforms all baselines while attaining perfect validity and retrieving diverse, proximal counterfactuals.
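The trade-off between diversity and proximity can be illustrated with a small sketch. The abstract does not specify the physics-inspired model, so the energy function below is an assumption, not ExDBSCAN's formulation: attraction is modeled as each candidate's distance to the instance, and repulsion as an inverse-distance penalty between pairs of selected candidates. The names `energy`, `select_counterfactuals`, and the `repulsion` weight are hypothetical, and the candidates are assumed to be already-valid counterfactuals.

```python
import itertools
import math


def energy(subset, instance, repulsion=1.0):
    """Assumed energy: low when candidates are near the instance but far from each other."""
    # Attraction term: total distance from the instance (proximity).
    attract = sum(math.dist(c, instance) for c in subset)
    # Repulsion term: inverse pairwise distance (diversity); guard against duplicates.
    repel = sum(
        repulsion / max(math.dist(a, b), 1e-9)
        for a, b in itertools.combinations(subset, 2)
    )
    return attract + repel


def select_counterfactuals(candidates, instance, k, repulsion=1.0):
    """Brute-force search for the k candidates with the lowest assumed energy."""
    return min(
        itertools.combinations(candidates, k),
        key=lambda s: energy(s, instance, repulsion),
    )


instance = (0.0, 0.0)
candidates = [(1.0, 0.0), (1.05, 0.0), (-1.2, 0.0), (0.0, 3.0)]

# Without repulsion, the two nearest (and nearly identical) candidates win.
closest = select_counterfactuals(candidates, instance, k=2, repulsion=0.0)
# With repulsion, the selection spreads out while staying close to the instance.
diverse = select_counterfactuals(candidates, instance, k=2, repulsion=1.0)
```

Exhaustive search over subsets is used only to keep the example short; it grows combinatorially with the number of candidates.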

Problem

Research questions and friction points this paper is trying to address.

clustering

explainability

DBSCAN

counterfactual reasoning

interpretability

Innovation

Methods, ideas, or system contributions that make the work stand out.

counterfactual reasoning

density-based clustering

explainable AI