🤖 AI Summary
This work addresses the challenge of adaptive uncertainty quantification under heteroscedasticity in tabular regression with gradient-boosted trees, where existing conformal prediction methods often rely on auxiliary models or extra data splits at the cost of efficiency. The authors propose LoBoost, a model-native local conformal prediction approach that reuses the prefixes of each input's leaf-index sequence across trees to construct multi-scale calibration groups, eliminating the need for retraining or auxiliary models. By encoding leaf-node sequences, performing multi-scale prefix matching, and applying local residual quantile calibration, LoBoost achieves efficient and adaptive uncertainty quantification using only a standard train/calibration split. Experiments show that LoBoost yields high-quality prediction intervals across multiple datasets, frequently reduces test MSE, and substantially accelerates calibration.
📝 Abstract
Gradient-boosted decision trees are among the strongest off-the-shelf predictors for tabular regression, but point predictions alone do not quantify uncertainty. Conformal prediction provides distribution-free marginal coverage, yet split conformal uses a single global residual quantile and can be poorly adaptive under heteroscedasticity. Methods that improve adaptivity typically fit auxiliary nuisance models or introduce additional data splits/partitions to learn the conformal score, increasing cost and reducing data efficiency. We propose LoBoost, a model-native local conformal method that reuses the fitted ensemble's leaf structure to define multiscale calibration groups. Each input is encoded by its sequence of visited leaves; at resolution level k, we group points by matching prefixes of leaf indices across the first k trees and calibrate residual quantiles within each group. LoBoost requires no retraining, auxiliary models, or extra splitting beyond the standard train/calibration split. Experiments show competitive interval quality, improved test MSE on most datasets, and large calibration speedups.