Optimal Decision Tree Pruning Revisited: Algorithms and Complexity

📅 2025-03-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper studies the computational complexity of optimal decision tree pruning under two fundamental operations: subtree replacement and subtree raising. We show that optimal pruning with subtree replacement can be done in polynomial time, whereas optimal pruning with subtree raising is NP-complete. We then identify parameters, and combinations of parameters, that lead to fixed-parameter tractability or hardness, establishing a precise borderline between the two. For example, subtree raising remains hard when only the domain size $D$ or only the number of features $d$ is small, yet it can be solved in $D^{2d} \cdot |I|^{O(1)}$ time, where $|I|$ is the input size, so it is fixed-parameter tractable in $D$ and $d$ combined. The analysis combines parameterized complexity theory, classical reductions, and constructive algorithm design, and is complemented by preliminary experiments.

📝 Abstract

We present a comprehensive classical and parameterized complexity analysis of decision tree pruning operations, extending recent research on the complexity of learning small decision trees. Thereby, we offer new insights into the computational challenges of decision tree simplification, a crucial aspect of developing interpretable and efficient machine learning models. We focus on fundamental pruning operations of subtree replacement and raising, which are used in heuristics. Surprisingly, while optimal pruning can be performed in polynomial time for subtree replacement, the problem is NP-complete for subtree raising. Therefore, we identify parameters and combinations thereof that lead to fixed-parameter tractability or hardness, establishing a precise borderline between these complexity classes. For example, while subtree raising is hard for small domain size $D$ or number $d$ of features, it can be solved in $D^{2d} \cdot |I|^{O(1)}$ time, where $|I|$ is the input size. We complement our theoretical findings with preliminary experimental results, demonstrating the practical implications of our analysis.
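To make the polynomial-time claim for subtree replacement concrete, here is a minimal Python sketch of a bottom-up dynamic program. It is an illustration under stated assumptions, not the paper's algorithm: the `Node` class, the threshold-style tests, the majority-class rule for a replacing leaf, and the choice of "number of inner nodes kept" as the size measure are all hypothetical. For each node, the table entry `table[s]` holds the fewest training errors achievable by a pruned version of that subtree with exactly `s` inner nodes.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical tree encoding: a leaf has feature=None and a class label.
    feature: Optional[int] = None
    threshold: float = 0.0          # go left when x[feature] <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None


def leaf_errors(examples):
    """Errors if all examples are classified by one majority-class leaf."""
    if not examples:
        return 0
    counts = Counter(y for _, y in examples)
    return len(examples) - max(counts.values())


def prune_table(node, examples):
    """table[s] = min errors over replacement prunings with exactly s inner nodes."""
    if node.feature is None:
        # An original leaf keeps its label (assumption).
        return [sum(1 for _, y in examples if y != node.label)]
    left_ex = [(x, y) for x, y in examples if x[node.feature] <= node.threshold]
    right_ex = [(x, y) for x, y in examples if x[node.feature] > node.threshold]
    lt = prune_table(node.left, left_ex)
    rt = prune_table(node.right, right_ex)
    # s = 0 means this whole subtree is replaced by a single leaf.
    table = [leaf_errors(examples)] + [float("inf")] * (len(lt) + len(rt) - 1)
    for i, err_left in enumerate(lt):
        for j, err_right in enumerate(rt):
            s = i + j + 1  # keep this node plus i and j inner nodes below
            table[s] = min(table[s], err_left + err_right)
    return table


tree = Node(feature=0, threshold=0.5,
            left=Node(label=0),
            right=Node(feature=1, threshold=0.5,
                       left=Node(label=1), right=Node(label=0)))
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 1), ((1, 1), 0), ((1, 0), 1)]
print(prune_table(tree, data))
```

The sketch works because, with replacement only, the set of examples reaching a node never depends on pruning decisions made elsewhere, so the two child tables can be combined independently.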

Problem

Research questions and friction points this paper is trying to address.

Analyzes complexity of decision tree pruning operations

Identifies computational challenges in decision tree simplification

Determines tractability and hardness for subtree replacement and raising

Innovation

Methods, ideas, or system contributions that make the work stand out.

Polynomial-time optimal pruning for subtree replacement

NP-completeness of optimal pruning with subtree raising

Parameterized analysis separating tractable from hard cases, including a $D^{2d} \cdot |I|^{O(1)}$-time algorithm for subtree raising

🔎 Similar Papers

Learning accurate and interpretable tree-based models