ITBoost: Information-Theoretic Trust for Robust Boosting

📅 2026-05-06
📈 Citations: 0
Influential: 0

📝 Abstract
Gradient boosting remains a strong and widely used method for learning on tabular data, but its performance often degrades when training labels are noisy. This degradation is largely tied to the way boosting algorithms emphasize samples with large gradients without distinguishing whether those errors come from informative hard cases or from unreliable labels. We address this issue by reconsidering how sample reliability is evaluated during boosting: instead of relying on instantaneous error, we examine how each sample's residuals evolve across iterations. Based on this insight, we propose Information-Theoretic Trust Boosting (ITBoost), which uses the Minimum Description Length principle to measure the complexity of residual trajectories. Samples whose residuals fluctuate irregularly are treated as less trustworthy and are down-weighted during learning. Theoretically, we derive a tighter generalization bound for ITBoost under label noise. Empirical results on a range of tabular benchmarks indicate that ITBoost is more robust in noisy settings than leading boosting and deep tabular models, while retaining the best average performance on clean data.
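To make the idea of scoring residual trajectories concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the abstract does not specify how the description length is computed, so the sketch assumes a crude proxy (the empirical-entropy code length of quantized step-to-step residual changes) and a hypothetical exponential mapping from code length to a trust weight. The function names `description_length` and `trust_weights`, the `resolution` parameter, and the weighting formula are all illustrative assumptions.

```python
import numpy as np


def description_length(trajectory, resolution=0.1):
    """Crude MDL proxy (an assumption, not the paper's definition).

    Quantizes the step-to-step changes of one sample's residual
    trajectory at a fixed resolution and returns the number of bits
    needed to encode them under their own empirical distribution.
    """
    steps = np.diff(np.asarray(trajectory, dtype=float))
    if steps.size == 0:
        return 0.0
    # Map each change to an integer symbol on a shared grid.
    symbols = np.round(steps / resolution).astype(int)
    _, counts = np.unique(symbols, return_counts=True)
    probs = counts / counts.sum()
    entropy_bits = -np.sum(probs * np.log2(probs))
    return float(steps.size * entropy_bits)


def trust_weights(residual_history, resolution=0.1):
    """Hypothetical mapping from code length to a weight in (0, 1].

    residual_history has shape (n_samples, n_iterations). The sample
    with the shortest code gets weight 1; longer codes decay
    exponentially relative to the mean code length.
    """
    lengths = np.array(
        [description_length(row, resolution) for row in residual_history]
    )
    scale = lengths.mean() + 1e-12  # guard against division by zero
    return np.exp(-(lengths - lengths.min()) / scale)


# Toy comparison: a smoothly shrinking residual vs. an erratic one.
rng = np.random.default_rng(0)
smooth = 0.9 ** np.arange(30)
erratic = rng.normal(size=30)
weights = trust_weights(np.stack([smooth, erratic]))
```

In this toy setup the smoothly decaying trajectory compresses well and keeps full weight, while the erratic one receives a smaller weight. Weights of this kind could be passed as per-sample weights to a boosting library that supports them; how ITBoost itself integrates the trust scores into training is described in the paper, not here.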
Problem

Research questions and friction points this paper is trying to address.

gradient boosting
label noise
sample reliability
tabular data
robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Information-Theoretic Trust
Minimum Description Length
Gradient Boosting
Label Noise Robustness
Residual Trajectory
Ye Su
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
Longlong Zhao
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
Diego Garcia-Gil
Department of Software Engineering, Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), University of Granada
Jipeng Guo
College of Information Science and Technology, Beijing University of Chemical Technology
Gangchun Zhang
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
Jinxin Chen
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
Jinsong Chen
Central China Normal University
Graph Representation Learning, Graph Data Mining, AI for Education