Boosting Revisited: Benchmarking and Advancing LP-Based Ensemble Methods

📅 2025-07-24
📈 Citations: 0
✹ Influential: 0
🀖 AI Summary
This paper systematically evaluates six linear programming (LP)-based totally corrective boosting methods, including two newly proposed approaches, NM-Boost and QRLP-Boost, against mainstream heuristic gradient boosting frameworks (e.g., XGBoost, LightGBM) across 20 benchmark datasets. Assessing accuracy, ensemble sparsity, margin distribution, anytime performance, and hyperparameter robustness, the study provides the first large-scale empirical evidence that LP-based boosters can match or outperform state-of-the-art gradient boosting when using shallow decision trees as base learners, while producing ensembles several times smaller. Crucially, these methods can also sparsify pre-trained ensembles without loss of accuracy. The results indicate that the global optimization central to LP-based boosting improves both robustness and interpretability, pointing to a promising direction for trustworthy ensemble learning.
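
For readers unfamiliar with the family of methods being benchmarked, the sketch below shows the classic soft-margin LPBoost primal, a standard LP-based totally corrective formulation, solved over a fixed pool of base learners with `scipy.optimize.linprog`. This is an illustration only, not the paper's NM-Boost or QRLP-Boost (whose formulations are not given here); the function name and the slack penalty `D` are assumed for the example.

```python
# Minimal LPBoost-style sketch: given fixed base-learner predictions, solve
#   max  rho - D * sum(xi)
#   s.t. y_i * (H w)_i >= rho - xi_i,  sum(w) = 1,  w >= 0,  xi >= 0
import numpy as np
from scipy.optimize import linprog

def lpboost_weights(H, y, D=0.1):
    """H : (n_samples, n_learners) base-learner predictions in {-1, +1}
    y : (n_samples,) labels in {-1, +1}
    D : slack penalty (illustrative default; in LPBoost often 1/(n*nu))"""
    n, T = H.shape
    # Variable layout: [w_1..w_T, rho, xi_1..xi_n]; minimize -rho + D*sum(xi)
    c = np.concatenate([np.zeros(T), [-1.0], D * np.ones(n)])
    # Margin constraints rewritten as  -y_i*(H w)_i + rho - xi_i <= 0
    A_ub = np.hstack([-(y[:, None] * H), np.ones((n, 1)), -np.eye(n)])
    b_ub = np.zeros(n)
    # Convexity constraint: ensemble weights sum to one
    A_eq = np.concatenate([np.ones(T), [0.0], np.zeros(n)])[None, :]
    b_eq = np.array([1.0])
    bounds = [(0, None)] * T + [(None, None)] + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:T]  # many weights land exactly at zero, giving sparse ensembles
```

The sparsity claim in the summary follows from LP geometry: optimal basic solutions put nonzero weight on only a small subset of the learner pool.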

📝 Abstract
Despite their theoretical appeal, totally corrective boosting methods based on linear programming have received limited empirical attention. In this paper, we conduct the first large-scale experimental study of six LP-based boosting formulations, including two novel methods, NM-Boost and QRLP-Boost, across 20 diverse datasets. We evaluate the use of both heuristic and optimal base learners within these formulations, and analyze not only accuracy, but also ensemble sparsity, margin distribution, anytime performance, and hyperparameter sensitivity. We show that totally corrective methods can outperform or match state-of-the-art heuristics like XGBoost and LightGBM when using shallow trees, while producing significantly sparser ensembles. We further show that these methods can thin pre-trained ensembles without sacrificing performance, and we highlight both the strengths and limitations of using optimal decision trees in this context.
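
As a rough illustration of the thinning claim in the abstract, one can re-fit LP weights over the trees of an already-trained ensemble and drop members whose weight is driven to zero. The helper below is a hypothetical sketch building on `lpboost_weights` above, not the paper's procedure; the `tol` threshold and the `{-1, +1}` prediction convention are assumptions.

```python
# Hedged sketch of post-hoc ensemble thinning via totally corrective re-weighting.
import numpy as np

def thin_ensemble(trees, X, y, fit_weights, tol=1e-8):
    """trees: fitted base learners whose .predict returns {-1, +1};
    fit_weights: an LP weight solver such as lpboost_weights above."""
    H = np.column_stack([t.predict(X) for t in trees])  # per-tree predictions
    w = fit_weights(H, y)                               # totally corrective re-fit
    keep = w > tol                                      # prune near-zero weights
    return [t for t, k in zip(trees, keep) if k], w[keep]
```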
Problem

Research questions and friction points this paper is trying to address.

Empirically evaluating LP-based boosting methods at scale
Comparing novel LP formulations with state-of-the-art heuristic boosters
Analyzing trade-offs between ensemble sparsity and predictive performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two novel LP-based boosting formulations, NM-Boost and QRLP-Boost
Evaluation of both heuristic and optimal base learners
Significantly sparser ensembles from shallow trees, with lossless thinning of pre-trained ones