PINE: Pruning Boosted Tree Ensembles with Conformal In-Distribution Prediction Equivalence

📅 2026-05-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes PINE, a tree ensemble pruning method that uses conformal prediction to preserve prediction equivalence within an in-distribution region whose size is controlled by a single parameter α. Conventional pruning approaches trade prediction consistency for compression, while existing fidelity-preserving (faithful) methods maintain prediction equivalence over the entire input space at the cost of lower compression. PINE instead uses conformal calibration to construct the in-distribution region, applies a fidelity-aware pruning algorithm, and incorporates an equivalence verification mechanism, so that predictions are preserved inside the region while compression improves. Experiments on 12 public tabular datasets show that PINE achieves up to 30% higher compression ratios than current fidelity-preserving pruning techniques while preserving predictions at a comparable level.
📝 Abstract
Tree ensembles are machine learning models with strong predictive performance and interpretability, and remain widely used for tabular data. Standard pruning methods for tree ensembles typically optimize an accuracy-compression trade-off and may change a subset of predictions, potentially compromising decision consistency. Faithful pruning methods address this issue by preserving prediction equivalence over the entire input space, but this requirement leads to lower compression ratios. We propose PINE, a pruning method that provides strong guarantees within an in-distribution region. PINE preserves prediction equivalence within this region and controls the region size using a single parameter $\alpha$ via conformal calibration. Experiments on 12 public tabular datasets show that PINE improves the compression ratio by up to $30\%$ while preserving predictions at a comparable level to existing faithful pruning methods.
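The role of $\alpha$ can be illustrated with a minimal split-conformal sketch. This is not the paper's algorithm. The abstract does not specify the nonconformity score, so the score used here (largest standardized deviation from the training mean) is an assumption, as are the names `nonconformity`, `conformal_threshold`, `full_predict`, and `pruned_predict`. The two toy predictors stand in for an original and a pruned ensemble, and the sampling-based agreement check is only an empirical illustration, not the equivalence verification mechanism the paper describes.

```python
import numpy as np

def nonconformity(X, center, scale):
    # Hypothetical score: largest per-feature standardized deviation.
    return np.max(np.abs((X - center) / scale), axis=1)

def conformal_threshold(cal_scores, alpha):
    # Split-conformal quantile: the ceil((n + 1) * (1 - alpha))-th smallest
    # calibration score. Under exchangeability, a fresh in-distribution
    # point scores at or below this value with probability >= 1 - alpha.
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return np.sort(cal_scores)[k - 1]

# Toy stand-ins for the original and pruned ensembles (assumptions):
# they differ only where feature 1 is far outside the training range.
def full_predict(X):
    return ((X[:, 0] > 0) | (X[:, 1] > 10)).astype(int)

def pruned_predict(X):
    return (X[:, 0] > 0).astype(int)

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 3))
X_cal = rng.normal(size=(200, 3))
center, scale = X_train.mean(axis=0), X_train.std(axis=0)
cal_scores = nonconformity(X_cal, center, scale)

# A smaller alpha gives a larger threshold, hence a larger region.
thresholds = {a: conformal_threshold(cal_scores, a) for a in (0.05, 0.2)}

# Test set: in-distribution points plus 20 shifted points on which
# the two predictors disagree.
X_test = rng.normal(size=(300, 3))
X_test[:20, 0] = -1.0
X_test[:20, 1] = 12.0

tau = thresholds[0.05]
in_region = nonconformity(X_test, center, scale) <= tau
agree = full_predict(X_test) == pruned_predict(X_test)

global_agreement = agree.mean()
region_agreement = agree[in_region].mean()
print(f"tau={tau:.3f}  global={global_agreement:.3f}  in-region={region_agreement:.3f}")
```

In this toy setup the two predictors disagree only on the shifted points, which fall outside the calibrated region, so agreement inside the region stays complete even though global agreement does not. This is the kind of slack that relaxing global equivalence to in-distribution equivalence makes available for pruning.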
Problem

Research questions and friction points this paper is trying to address.

pruning
tree ensembles
prediction equivalence
compression ratio
conformal prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

pruning
tree ensembles
conformal prediction
in-distribution equivalence
compression ratio