RS-ORT: A Reduced-Space Branch-and-Bound Algorithm for Optimal Regression Trees

๐Ÿ“… 2025-10-27
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
Existing mixed-integer programming (MIP)-based regression tree learning methods struggle to jointly model continuous features and scale to large datasets. Method: This paper proposes a two-stage MIP optimization framework. Its core innovations are: (1) restricting branch-and-bound exclusively to tree-structure variables, yielding convergence guarantees independent of sample size; (2) incorporating closed-form leaf predictions, empirical threshold discretization, and exact analytical solutions for depth-1 subtrees to tighten bounds; and (3) integrating decomposition-based upper/lower bound estimation with node-level parallelization for efficient training on million-scale datasets. Results: Experiments on multi-source benchmark datasets with mixed feature types demonstrate substantial improvements over state-of-the-art MIP baselines. The method constructs high-quality regression trees on 2 million samples within four hours, achieving both provable optimality and practical scalability.
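Two of the bound-tightening ideas the summary names, closed-form leaf predictions and empirical threshold discretization, come together in the exact depth-1 subtree solve: since the squared-error-optimal prediction of a leaf is simply the mean of its samples, a depth-1 tree can be solved exactly by enumerating the distinct empirical thresholds of each feature. A minimal sketch of that idea (not the paper's implementation; all names are illustrative):

```python
import numpy as np

def best_depth1_split(X, y):
    """Exact depth-1 regression tree: enumerate the empirical thresholds
    of every feature and score each split with the closed-form (mean)
    leaf predictions. Returns (feature, threshold, sse)."""
    n, d = X.shape
    best = (None, None, np.inf)
    for j in range(d):
        order = np.argsort(X[:, j])
        xs, ys = X[order, j], y[order]
        # prefix sums give O(1) leaf SSE per candidate threshold,
        # using SSE = sum(y^2) - (sum y)^2 / n for a mean-predicting leaf
        csum, csq = np.cumsum(ys), np.cumsum(ys ** 2)
        total, total_sq = csum[-1], csq[-1]
        for i in range(n - 1):
            if xs[i] == xs[i + 1]:  # only distinct empirical thresholds
                continue
            nl, nr = i + 1, n - i - 1
            sse_left = csq[i] - csum[i] ** 2 / nl
            sse_right = (total_sq - csq[i]) - (total - csum[i]) ** 2 / nr
            sse = sse_left + sse_right
            if sse < best[2]:
                best = (j, (xs[i] + xs[i + 1]) / 2.0, sse)
    return best
```

Because only empirical thresholds (midpoints between consecutive sorted values) are scored, the search is finite and exact for this subtree, which is what makes depth-1 nodes cheap to parse analytically inside the branch-and-bound.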

๐Ÿ“ Abstract
Mixed-integer programming (MIP) has emerged as a powerful framework for learning optimal decision trees. Yet, existing MIP approaches for regression tasks are either limited to purely binary features or become computationally intractable when continuous, large-scale data are involved. Naively binarizing continuous features sacrifices global optimality and often yields needlessly deep trees. We recast optimal regression-tree training as a two-stage optimization problem and propose Reduced-Space Optimal Regression Trees (RS-ORT) - a specialized branch-and-bound (BB) algorithm that branches exclusively on tree-structural variables. This design guarantees the algorithm's convergence independently of the number of training samples. Leveraging the model's structure, we introduce several bound-tightening techniques - closed-form leaf prediction, empirical threshold discretization, and exact depth-1 subtree parsing - that combine with decomposable upper and lower bounding strategies to accelerate training. The BB node-wise decomposition enables trivial parallel execution, further alleviating computational intractability even for million-size datasets. In empirical studies on several regression benchmarks containing both binary and continuous features, RS-ORT delivers superior training and testing performance compared with state-of-the-art methods. Notably, on datasets with up to 2,000,000 samples containing continuous features, RS-ORT can obtain guaranteed training performance with a simpler tree structure and better generalization ability in four hours.
Problem

Research questions and friction points this paper is trying to address.

Solving computational intractability in optimal regression tree training
Handling both binary and continuous features without sacrificing optimality
Enabling efficient training on large-scale datasets with millions of samples
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage optimization for optimal regression trees
Branch-and-bound algorithm on tree-structural variables
Parallel decomposition for large-scale datasets
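The third innovation, parallel decomposition, follows from the BB node-wise structure the abstract describes: each open branch-and-bound node can be bounded independently, so a frontier of nodes can be farmed out to workers. A hypothetical sketch of that pattern, with `bound_fn` and `branch_fn` standing in for the paper's bounding and branching rules (all names are assumptions, not RS-ORT's API):

```python
from concurrent.futures import ThreadPoolExecutor

def solve_bb_parallel(root, bound_fn, branch_fn, workers=4):
    """Sketch of node-parallel branch-and-bound for a minimization
    problem. bound_fn(node) -> (lower_bound, upper_bound);
    branch_fn(node) -> list of child nodes (empty when node is a leaf).
    Frontier nodes are bounded concurrently; nodes whose lower bound
    cannot beat the incumbent are pruned."""
    incumbent = float("inf")
    open_nodes = [root]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while open_nodes:
            # bound the whole frontier in parallel (independent subproblems)
            bounds = list(pool.map(bound_fn, open_nodes))
            next_nodes = []
            for node, (lb, ub) in zip(open_nodes, bounds):
                incumbent = min(incumbent, ub)   # any feasible ub updates incumbent
                if lb < incumbent - 1e-9:        # node may still improve: branch
                    next_nodes.extend(branch_fn(node))
                # otherwise the node is pruned by bound
            open_nodes = next_nodes
    return incumbent
```

Because every frontier node is an independent subproblem, the speedup comes for free from the decomposition; the real algorithm's bounds are the decomposable upper/lower estimates the summary describes, not the toy callbacks shown here.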
Cristobal Heredia
Department of Industrial and Management Systems Engineering, University of South Florida, Tampa, FL, USA
Pedro Chumpitaz-Flores
Department of Industrial and Management Systems Engineering, University of South Florida, Tampa, FL, USA
Kaixun Hua
Assistant Professor, University of South Florida
Trustworthy AI · Clustering · Global Optimization