🤖 AI Summary
This work addresses the lack of efficient and reliable statistical inference methods in existing Explainable Boosting Machines (EBMs), which typically rely on computationally expensive bootstrapping to assess feature importance. The authors introduce Boulevard regularization, reformulating the gradient boosting procedure as a per-feature kernel ridge regression. This approach provides end-to-end asymptotic normality and theoretical guarantees for EBMs, enabling confidence and prediction intervals whose construction runs in time independent of the number of datapoints. The method avoids the curse of dimensionality and achieves the minimax-optimal mean squared error rate of \(O(pn^{-2/3})\) for fitting Lipschitz generalized additive models, substantially enhancing both interpretability and statistical reliability.
📝 Abstract
Explainable boosting machines (EBMs) are popular "glass-box" models that learn a set of univariate functions using boosted trees. These models achieve explainability through visualizations of each feature's effect. However, unlike linear model coefficients, uncertainty quantification for the learned univariate functions requires computationally intensive bootstrapping, making it hard to know which features truly matter. We provide an alternative using recent advances in statistical inference for gradient boosting, deriving inference procedures as well as end-to-end theoretical guarantees. Using a moving average instead of a sum of trees (Boulevard regularization) allows the boosting process to converge to a feature-wise kernel ridge regression. This produces asymptotically normal predictions that achieve the minimax-optimal mean squared error for fitting Lipschitz GAMs with $p$ features at rate $O(pn^{-2/3})$, successfully avoiding the curse of dimensionality. We then construct prediction intervals for the response and confidence intervals for each learned univariate function with a runtime independent of the number of datapoints, enabling further explainability within EBMs.
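To make the "moving average instead of a sum of trees" idea concrete, here is a minimal, hedged sketch (not the authors' code) of a Boulevard-style update on a single feature. The standard boosting update $f_b = f_{b-1} + \eta\, t_b$ is replaced by the averaged update $f_b = \frac{b-1}{b} f_{b-1} + \frac{\lambda}{b} t_b$, where each $t_b$ is a weak learner fit to current residuals; the one-split "stump" below and all function names are illustrative stand-ins for the per-feature trees an EBM would cycle through.

```python
import numpy as np

def fit_stump(x, r):
    """Fit a one-split regression stump to residuals r on a 1-D feature x."""
    best_sse, best_rule = np.inf, None
    for s in np.quantile(x, np.linspace(0.1, 0.9, 9)):  # candidate split points
        left, right = r[x <= s], r[x > s]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = left.var() * len(left) + right.var() * len(right)
        if sse < best_sse:
            best_sse, best_rule = sse, (s, left.mean(), right.mean())
    s, lo, hi = best_rule
    return lambda z: np.where(z <= s, lo, hi)

def boulevard_boost(x, y, lam=0.8, n_rounds=200):
    """Boulevard-style update: f_b = ((b-1)/b) f_{b-1} + (lam/b) t_b."""
    f = np.zeros_like(y, dtype=float)
    for b in range(1, n_rounds + 1):
        t = fit_stump(x, y - f)                  # weak learner on residuals
        f = (b - 1) / b * f + lam / b * t(x)     # moving average, not a sum
    return f

# Illustrative use on a Lipschitz target: predictions track a shrunken
# version of |x|, reflecting the kernel-ridge-regression limit.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 500)
y = np.abs(x) + 0.1 * rng.normal(size=500)
f = boulevard_boost(x, y)
```

Note that the averaged update stabilizes the ensemble: each new tree's contribution is damped by $1/b$, which is what drives convergence to a fixed point (the feature-wise kernel ridge regression) rather than the ever-growing sum of classic boosting.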