🤖 AI Summary
Generating counterfactual explanations for tree ensemble models with provable optimality guarantees remains challenging. This work introduces the concept of a “counterfactual map,” reframing the problem as a search over generalized Voronoi cells within an equivalent hyperrectangular partition induced by the tree structure. The authors devise a branch-and-bound algorithm built on a volume-based KD-tree, combined with hyperrectangle compression and nearest-region queries, that returns globally optimal counterfactuals at amortized sublinear query cost. Evaluated on multiple high-stakes real-world datasets, the method delivers exact counterfactual explanations with millisecond-level latency, answering queries several orders of magnitude faster than existing exact approaches.
📝 Abstract
Counterfactual explanations are a central tool in interpretable machine learning, yet computing them exactly for complex models remains challenging. For tree ensembles, predictions are piecewise constant over a large collection of axis-aligned hyperrectangles, implying that an optimal counterfactual for a point corresponds to its projection onto the nearest rectangle with an alternative label under a chosen metric. Existing methods largely overlook this geometric structure, relying either on heuristics with no optimality guarantees or on mixed-integer programming formulations that do not scale to interactive use. In this work, we revisit counterfactual generation through the lens of nearest-region search and introduce counterfactual maps, a global representation of recourse for tree ensembles. Leveraging the fact that any tree ensemble can be compressed into an equivalent partition of labeled hyperrectangles, we cast counterfactual search as the problem of identifying the generalized Voronoi cell associated with the nearest rectangle of an alternative label. This leads to an exact, amortized algorithm based on volumetric k-dimensional (KD) trees, which performs branch-and-bound nearest-region queries with explicit optimality certificates and sublinear average query time after a one-time preprocessing phase. Our experimental analyses on several real datasets drawn from high-stakes application domains show that this approach delivers globally optimal counterfactual explanations with millisecond-level latency, achieving query times that are orders of magnitude faster than existing exact, cold-start optimization methods.
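To make the geometric picture concrete, here is a minimal sketch (not the authors' implementation; function names are hypothetical) of the key fact the abstract relies on: under the Euclidean metric, the closest point inside an axis-aligned hyperrectangle is the coordinatewise clamp of the query point, so a brute-force exact counterfactual is found by projecting onto every rectangle whose label differs and keeping the nearest projection.

```python
import numpy as np

def project_onto_rectangle(x, lo, hi):
    """Closest point of the axis-aligned box [lo, hi] to x: clamp each coordinate."""
    return np.clip(x, lo, hi)

def nearest_counterfactual(x, rectangles, predicted_label):
    """Brute-force exact baseline: scan all labeled rectangles and return the
    closest projection onto a rectangle whose label differs from the model's
    prediction at x. Each rectangle is a (lo, hi, label) triple."""
    best_point, best_dist = None, np.inf
    for lo, hi, label in rectangles:
        if label == predicted_label:
            continue  # only alternative-label regions qualify as counterfactuals
        p = project_onto_rectangle(x, lo, hi)
        d = np.linalg.norm(p - x)
        if d < best_dist:
            best_point, best_dist = p, d
    return best_point, best_dist
```

This linear scan is what the paper's KD-tree-based branch-and-bound accelerates: the answer is identical, but exhaustive enumeration is replaced by pruned search.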
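The abstract's branch-and-bound nearest-region query can be illustrated with a toy version of the idea (again a sketch under assumed structure, not the paper's volumetric KD-tree): each tree node stores the bounding box of the rectangles beneath it, and a subtree is skipped whenever its bounding box is already farther from the query point than the best alternative-label rectangle found so far. That pruning distance is also the optimality certificate for the returned answer.

```python
import numpy as np

def box_dist(x, lo, hi):
    """Euclidean distance from x to the axis-aligned box [lo, hi]."""
    return np.linalg.norm(np.clip(x, lo, hi) - x)

class Node:
    """KD-tree-like node over labeled rectangles (lo, hi, label)."""
    def __init__(self, rects, leaf_size=2):
        los = np.array([lo for lo, _, _ in rects])
        his = np.array([hi for _, hi, _ in rects])
        self.lo, self.hi = los.min(axis=0), his.max(axis=0)  # node bounding box
        if len(rects) <= leaf_size:
            self.rects, self.left, self.right = rects, None, None
        else:
            # Split by rectangle centers along the widest axis of the bounding box.
            axis = int(np.argmax(self.hi - self.lo))
            rects = sorted(rects, key=lambda r: (r[0][axis] + r[1][axis]) / 2)
            mid = len(rects) // 2
            self.rects = None
            self.left, self.right = Node(rects[:mid]), Node(rects[mid:])

def query(node, x, predicted_label, best=(None, np.inf)):
    """Branch-and-bound: prune any subtree whose bounding box cannot beat the
    best alternative-label rectangle found so far."""
    if box_dist(x, node.lo, node.hi) >= best[1]:
        return best  # certificate: nothing below can improve the incumbent
    if node.rects is not None:
        for lo, hi, label in node.rects:
            if label == predicted_label:
                continue
            d = box_dist(x, lo, hi)
            if d < best[1]:
                best = (np.clip(x, lo, hi), d)
        return best
    # Visit the nearer child first to tighten the bound early.
    for child in sorted((node.left, node.right),
                        key=lambda c: box_dist(x, c.lo, c.hi)):
        best = query(child, x, predicted_label, best)
    return best
```

The result matches the brute-force scan, but subtrees whose boxes lie beyond the incumbent distance are never opened, which is the source of the sublinear average query time after preprocessing.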