ODTlearn: A Package for Learning Optimal Decision Trees for Prediction and Prescription

📅 2023-07-28
🏛️ arXiv.org
📈 Citations: 2
Influential: 1
🤖 AI Summary
To address the limited interpretability, fairness, and robustness of traditional decision trees in high-stakes predictive and prescriptive decision-making, this paper builds on the mixed-integer optimization (MIO) framework of Aghaei et al. (2019) and releases ODTlearn, an open-source Python package. The package covers four classes of optimal decision trees (classification trees, fair classification trees, classification trees robust to distribution shifts, and prescriptive trees learned from observational data), enabling multi-objective trade-offs and constraint-based modeling. It follows object-oriented design principles and supports both commercial (e.g., Gurobi) and open-source (e.g., COIN-OR CBC) solvers, balancing computational efficiency and scalability. Comprehensive documentation, tutorials, and fully reproducible code are provided. Empirical evaluations demonstrate that the approach preserves strong interpretability while significantly improving fairness, out-of-distribution generalization, and individualized prescription quality in high-stakes settings.
📝 Abstract
ODTlearn is an open-source Python package that provides methods for learning optimal decision trees for high-stakes predictive and prescriptive tasks, based on the mixed-integer optimization (MIO) framework proposed in Aghaei et al. (2019) and several of its extensions. The current version of the package provides implementations for learning optimal classification trees, optimal fair classification trees, optimal classification trees robust to distribution shifts, and optimal prescriptive trees from observational data. We have designed the package to be easy to maintain and extend as new optimal decision tree problem classes, reformulation strategies, and solution algorithms are introduced. To this end, the package follows object-oriented design principles and supports both commercial (Gurobi) and open-source (COIN-OR branch and cut) solvers. The package documentation and an extensive user guide can be found at https://d3m-research-group.github.io/odtlearn/. Additionally, users can view the package source code and submit feature requests and bug reports by visiting https://github.com/D3M-Research-Group/odtlearn.
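To make "optimal" concrete, the sketch below finds a depth-2 tree with the fewest training errors by exhaustive enumeration over binary features. It is only an illustration of the objective: it does not use ODTlearn or its MIO formulation, and the function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import product


def leaf_errors(labels):
    """Misclassifications if the leaf predicts its majority label."""
    return len(labels) - max(Counter(labels).values()) if labels else 0


def best_depth2_tree(X, y):
    """Exhaustively search all depth-2 trees over binary features.

    Returns (errors, (root, left, right)), where each entry is the index of
    the feature tested at that node. Value 0 goes left, value 1 goes right.
    """
    n_features = len(X[0])
    best = None
    for root, left, right in product(range(n_features), repeat=3):
        # One bucket per leaf, keyed by (root branch, child branch).
        leaves = {(0, 0): [], (0, 1): [], (1, 0): [], (1, 1): []}
        for row, label in zip(X, y):
            branch = row[root]
            child = left if branch == 0 else right
            leaves[(branch, row[child])].append(label)
        errors = sum(leaf_errors(v) for v in leaves.values())
        if best is None or errors < best[0]:
            best = (errors, (root, left, right))
    return best


# Toy data (hypothetical): the label is the XOR of features 0 and 1,
# and feature 2 is noise.
X = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0), (0, 0, 1), (1, 1, 1)]
y = [0, 1, 1, 0, 0, 0]

errors, tree = best_depth2_tree(X, y)
print(errors, tree)
```

On this toy data no single split lowers the error below that of always predicting the majority label, yet a depth-2 tree that fits the data exactly exists. Enumeration grows quickly with the number of features and the depth, which is one reason to hand the search to an optimization solver instead.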
Problem

Research questions and friction points this paper is trying to address.

Learning optimal decision trees for prediction tasks
Developing optimal classification trees that are fair and robust to distribution shifts
Creating optimal prescriptive trees from observational data
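As a rough illustration of what a fairness requirement on a classifier can look like, the sketch below computes a statistical parity gap: the difference in positive-prediction rates between groups. The choice of this metric, the function names, and the data are assumptions made for illustration; this summary does not state which fairness notions the package supports or how it enforces them.

```python
def positive_rate(predictions, groups, group):
    """Fraction of positive (1) predictions within one group."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    if not selected:
        raise ValueError(f"no samples in group {group!r}")
    return sum(selected) / len(selected)


def statistical_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = [positive_rate(predictions, groups, g) for g in set(groups)]
    return max(rates) - min(rates)


# Hypothetical binary predictions and a protected attribute with two groups.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap = statistical_parity_gap(preds, groups)
print(gap)
```

A fairness-constrained tree would be trained so that a quantity like this gap stays below a chosen threshold, trading some accuracy for parity.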
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses mixed-integer optimization framework
Implements optimal classification and prescriptive trees
Supports both commercial and open-source solvers
Patrick Vossler
University of California, San Francisco
Statistics, Causal Inference, Machine Learning, High-dimensional statistics, feature selection
S. Aghaei
University of Southern California, Center for AI in Society, Los Angeles, CA 90089
Nathan Justin
PhD Candidate, University of Southern California
Optimization, Machine Learning, Operations Research
Nathanael Jo
Massachusetts Institute of Technology
Andrés Gómez
University of Southern California, Center for AI in Society, Los Angeles, CA 90089
P. Vayanos
University of Southern California, Center for AI in Society, Los Angeles, CA 90089