🤖 AI Summary
This work addresses the optimization difficulty of oblique decision trees, whose split search is discrete and non-convex, by recasting each oblique split as a nonlinear least-squares problem over two linear predictors. The max/min envelope of the two predictors gives each split ReLU-like representational capacity, and the node-level optimization can be interpreted as a damped Newton method. Building on this, the authors develop the Hinge Regression Tree (HRT), show that it is a universal approximator with an explicit \(O(\delta^2)\) approximation rate, and introduce HRT-Boost, an ensemble extension that couples node-level Newton updates with stage-wise functional gradient descent. Experiments on synthetic and real-world benchmarks show that a single HRT is highly competitive with established single-tree baselines, while HRT-Boost compares favorably with strong ensemble baselines and often yields substantially more compact models.
📝 Abstract
Learning high-quality oblique decision trees remains a significant challenge due to the discrete and non-convex nature of split optimization. We present the Hinge Regression Tree (HRT) framework, which reframes each oblique split as a nonlinear least-squares problem over two linear predictors whose max/min envelope induces ReLU-like representational capacity. We show that the resulting node-level optimization can be interpreted as a damped Newton method, and we establish the monotonic decrease of the node objective for its backtracking line-search variant. We also prove that HRT is a universal approximator with an explicit $O(\delta^2)$ approximation rate. Building upon this base learner, we propose HRT-Boost, an ensemble extension that couples node-level Newton updates with stage-wise functional gradient descent. We show that this ensemble construction admits a stage-wise empirical risk reduction guarantee under the squared loss. Empirical evaluations on synthetic and real-world benchmarks show that HRT is highly competitive with established single-tree baselines, and that HRT-Boost compares favorably with strong ensemble baselines and often yields substantially more compact models. The code is publicly available at https://github.com/Hongyi-Li-sz/HRT-Boost.
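To make the node-level idea concrete, the following is a minimal illustrative sketch of fitting a single hinge split, y ≈ max(a·x, b·x), with a damped update and backtracking line search. It is not taken from the authors' repository: the function names (`hinge_predict`, `hinge_loss`, `fit_hinge_node`), the random initialization, the small ridge term, and the stopping rule are all assumptions made for illustration. The sketch relies on the observation that, once each sample is assigned to its currently active predictor, the objective is quadratic in the parameters, so the per-side least-squares solution plays the role of a Newton target; a step is accepted only if it lowers the squared error.

```python
import numpy as np

def hinge_predict(Xb, theta):
    # theta has shape (2, d+1): one row per linear predictor (bias last).
    return np.max(Xb @ theta.T, axis=1)

def hinge_loss(Xb, y, theta):
    # Sum of squared residuals of the max-envelope prediction.
    r = y - hinge_predict(Xb, theta)
    return float(r @ r)

def fit_hinge_node(X, y, n_iter=100, ridge=1e-8, seed=0):
    """Hypothetical helper: fit y ~ max(a.x, b.x) at one node."""
    rng = np.random.default_rng(seed)
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    d = Xb.shape[1]
    theta = rng.normal(size=(2, d))            # assumed random initialization
    history = [hinge_loss(Xb, y, theta)]
    for _ in range(n_iter):
        # Which predictor is active (attains the max) for each sample.
        active = np.argmax(Xb @ theta.T, axis=1)
        target = theta.copy()
        for k in (0, 1):
            mask = active == k
            if mask.any():
                A = Xb[mask]
                # Ridge-stabilized least squares on the samples of side k.
                target[k] = np.linalg.solve(
                    A.T @ A + ridge * np.eye(d), A.T @ y[mask]
                )
        direction = target - theta
        # Backtracking: halve the step until the true objective decreases.
        step, accepted = 1.0, False
        while step > 1e-8:
            candidate = theta + step * direction
            loss = hinge_loss(Xb, y, candidate)
            if loss < history[-1]:
                theta, accepted = candidate, True
                history.append(loss)
                break
            step *= 0.5
        if not accepted:
            break  # no improving step found; treat as converged
    return theta, history
```

Because a step is accepted only when it strictly lowers the loss, the recorded `history` is decreasing by construction, which mirrors (but does not prove) the monotonic-decrease property stated in the abstract. A min envelope can be handled the same way by negating the targets and the fitted predictors.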