A Generic Complete Anytime Beam Search for Optimal Decision Tree

📅 2025-08-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Optimal decision tree learning is NP-hard; while existing exact algorithms guarantee global optimality, they exhibit poor anytime performance—i.e., they struggle to rapidly produce high-quality solutions within time limits. This paper proposes CA-DL8.5, a novel anytime algorithm that unifies and generalizes existing anytime strategies (e.g., Limited Discrepancy Search and Top-k search) within a modular framework, enabling flexible integration of heuristics and relaxation mechanisms. Built upon the DL8.5 paradigm, CA-DL8.5 incorporates branch-and-bound pruning, trie-based caching, and restart-based beam search, and progressively relaxes pruning conditions to accelerate solution quality improvement over time. Experiments on standard classification benchmarks demonstrate that CA-DL8.5 with LDS heuristics significantly outperforms Blossom and other variants: it achieves state-of-the-art anytime performance while retaining convergence guarantees to the globally optimal decision tree.
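The core anytime mechanism described above (restart-based beam search that progressively relaxes pruning) can be sketched generically. This is an illustrative sketch only, not CA-DL8.5's actual implementation: the node representation, the `expand`/`score` callbacks, and the doubling schedule for the beam width are all assumptions made for the example.

```python
def beam_round(expand, score, root, width):
    """One beam-search pass: keep only the `width` best nodes per level.

    Hypothetical interface: expand(node) returns a list of children
    (empty for a leaf/complete tree); score(node) is lower-is-better.
    """
    frontier, best = [root], None
    while frontier:
        next_frontier = []
        for node in frontier:
            children = expand(node)
            if not children:  # leaf: a candidate solution
                if best is None or score(node) < score(best):
                    best = node
            else:
                next_frontier.extend(children)
        next_frontier.sort(key=score)
        frontier = next_frontier[:width]  # prune everything beyond the beam
    return best

def anytime_search(expand, score, root, max_width=64):
    """Restart with a doubled beam width each round (relaxed pruning),
    yielding the incumbent solution after every restart."""
    best, width = None, 1
    while width <= max_width:
        cand = beam_round(expand, score, root, width)
        if cand is not None and (best is None or score(cand) < score(best)):
            best = cand
        yield best  # anytime behavior: a usable solution after each round
        width *= 2
```

Early rounds with a narrow beam return quickly (good anytime behavior); later rounds prune less and can recover solutions the heuristic ordering initially discarded.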

📝 Abstract
Finding an optimal decision tree that minimizes classification error is known to be NP-hard. While exact algorithms based on MILP, CP, SAT, or dynamic programming guarantee optimality, they often suffer from poor anytime behavior -- meaning they struggle to find high-quality decision trees quickly when the search is stopped before completion -- due to unbalanced search space exploration. To address this, several anytime extensions of exact methods have been proposed, such as LDS-DL8.5, Top-k-DL8.5, and Blossom, but they have not been systematically compared, making it difficult to assess their relative effectiveness. In this paper, we propose CA-DL8.5, a generic, complete, and anytime beam search algorithm that extends the DL8.5 framework and unifies some existing anytime strategies. In particular, CA-DL8.5 generalizes the previous approaches LDS-DL8.5 and Top-k-DL8.5 by allowing the integration of various heuristics and relaxation mechanisms through a modular design. The algorithm reuses DL8.5's efficient branch-and-bound pruning and trie-based caching, combined with a restart-based beam search that gradually relaxes pruning criteria to improve solution quality over time. Our contributions are twofold: (1) We introduce this new generic framework for exact and anytime decision tree learning, enabling the incorporation of diverse heuristics and search strategies; (2) We conduct a rigorous empirical comparison of several instantiations of CA-DL8.5 -- based on Purity, Gain, Discrepancy, and Top-k heuristics -- using an anytime evaluation metric called the primal gap integral. Experimental results on standard classification benchmarks show that CA-DL8.5 using LDS (limited discrepancy) consistently provides the best anytime performance, outperforming both other CA-DL8.5 variants and the Blossom algorithm while maintaining completeness and optimality guarantees.
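The abstract singles out the LDS (limited discrepancy search) instantiation as the strongest variant. The idea behind LDS can be sketched generically: explore only paths that deviate from the heuristic's preferred branch at most k times, then restart with a larger budget k. This is a generic textbook-style sketch under assumed interfaces (a `children(node)` callback returning children in heuristic, best-first order), not the paper's LDS-DL8.5 code.

```python
def lds(children, node, k):
    """Yield every leaf reachable with at most k discrepancies,
    where taking any child other than the heuristic's first choice
    costs one discrepancy."""
    kids = children(node)
    if not kids:
        yield node
        return
    for i, child in enumerate(kids):
        cost = 0 if i == 0 else 1  # non-preferred branch = 1 discrepancy
        if k - cost >= 0:
            yield from lds(children, child, k - cost)

def lds_restarts(children, root, max_k):
    """Anytime driver: re-run LDS with a growing discrepancy budget,
    yielding the set of leaves visited at each budget."""
    for k in range(max_k + 1):
        yield list(lds(children, root, k))
```

With budget 0 the search follows the heuristic greedily; each increment lets it second-guess the heuristic at one more decision point, which is why LDS concentrates early effort where the heuristic is most likely to be wrong.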
Problem

Research questions and friction points this paper is trying to address.

Addresses poor anytime behavior in exact optimal decision tree algorithms
Proposes a generic complete anytime beam search framework CA-DL8.5
Systematically compares anytime strategies using the primal gap integral metric
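The primal gap integral used for this comparison measures how quickly an algorithm closes the gap to the known optimum over its run: the incumbent's gap is integrated over time, so a smaller value means better anytime behavior. A minimal sketch of one common formulation (a Berthold-style normalized gap; the exact normalization in the paper may differ) could look like this:

```python
def primal_gap(incumbent, optimum):
    """Normalized gap in [0, 1] between an incumbent objective value
    and the optimal value (1.0 when no incumbent exists yet)."""
    if incumbent is None:
        return 1.0
    if incumbent == optimum:
        return 0.0
    if incumbent * optimum < 0:  # opposite signs: maximal gap
        return 1.0
    return abs(incumbent - optimum) / max(abs(incumbent), abs(optimum))

def primal_gap_integral(events, horizon, optimum):
    """Integrate the (piecewise-constant) primal gap over [0, horizon].

    `events` is a time-sorted list of (time, incumbent_value) pairs,
    one per improvement found during the run.
    """
    total, t_prev, current = 0.0, 0.0, None
    for t, value in events:
        if t > horizon:
            break
        total += primal_gap(current, optimum) * (t - t_prev)
        t_prev, current = t, value
    total += primal_gap(current, optimum) * (horizon - t_prev)
    return total
```

For example, a run that finds value 10 at t=1 and the optimum 5 at t=3, on a 4-second horizon, accumulates gap 1.0 for the first second and 0.5 for the next two.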
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generic complete anytime beam search algorithm
Extends DL8.5 framework with modular heuristics integration
Combines branch-and-bound pruning with restart-based beam search
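DL8.5's trie-based caching, which CA-DL8.5 reuses, memoizes the best subtree found for each branch so that identical sub-problems reached via different feature orderings are solved once. A minimal sketch of the idea, with an assumed representation (branches keyed by their sorted set of test outcomes; DL8.5's real cache stores richer bookkeeping):

```python
class TrieCache:
    """Illustrative DL8.5-style cache: a trie over sorted item ids,
    storing one cached entry (e.g., an optimal subtree) per branch."""

    def __init__(self):
        self.root = {}

    def _node(self, itemset):
        """Walk/create the trie path for a branch's sorted itemset."""
        node = self.root
        for item in sorted(itemset):
            node = node.setdefault(item, {})
        return node

    def get(self, itemset):
        """Return the cached entry for this branch, or None."""
        return self._node(itemset).get('__entry__')

    def put(self, itemset, entry):
        """Cache an entry for this branch."""
        self._node(itemset)['__entry__'] = entry
```

Because the key is sorted, the branches {feature 1 = yes, feature 2 = no} and {feature 2 = no, feature 1 = yes} map to the same trie node, which is what lets the cache deduplicate order-equivalent sub-problems.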