🤖 AI Summary
Existing AutoML benchmarks (e.g., AMLB) support only coarse-grained time budgets (e.g., 1h/4h), limiting their applicability to resource-constrained, high-frequency retraining scenarios. This work extends AMLB with sub-hour time constraints (5–45 minutes) and configurable early stopping, and conducts a large-scale empirical evaluation of 11 AutoML frameworks across 104 standardized tabular tasks. The findings are threefold: (1) the relative rankings of frameworks remain largely stable under short time budgets, suggesting robustness in latency-critical settings; (2) early stopping substantially increases the variance of model performance, exposing an accuracy–stability trade-off in lightweight AutoML; (3) the reduced budgets make the benchmark cheaper to run and more accessible, providing standardized evaluation infrastructure for edge computing, real-time modeling, and other latency-sensitive applications.
📝 Abstract
Automated Machine Learning (AutoML) automatically builds machine learning (ML) models from data. The de facto standard for evaluating new AutoML frameworks for tabular data is the AutoML Benchmark (AMLB), which proposed evaluating AutoML frameworks under 1- and 4-hour time budgets across 104 tasks. We argue that shorter time constraints should also be considered, both for their practical value, such as when models must be retrained frequently, and to make AMLB more accessible. This work considers two ways to reduce the overall computation used in the benchmark: smaller time budgets and the use of early stopping. We evaluate 11 AutoML frameworks on 104 tasks under different time constraints and find that the relative ranking of AutoML frameworks is fairly consistent across time constraints, but that early stopping leads to greater variety in model performance.
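The early-stopping mechanism discussed in the abstract can be sketched as a simple patience rule applied to an AutoML-style search loop: stop once several consecutive candidate models fail to improve on the best score seen so far. This is a minimal illustrative sketch, not AMLB's or any framework's actual API; the function and parameter names are hypothetical.

```python
def search_with_early_stopping(evaluate, candidates, patience=3):
    """Toy model-search loop with patience-based early stopping.

    Evaluates candidate configurations in order and stops once
    `patience` consecutive candidates fail to improve on the best
    score seen so far (higher scores are better).
    """
    best_score, best_candidate = float("-inf"), None
    stale = 0  # consecutive candidates without improvement
    for candidate in candidates:
        score = evaluate(candidate)
        if score > best_score:
            best_score, best_candidate = score, candidate
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break  # early stop: no recent improvement
    return best_candidate, best_score
```

Because the loop may terminate before exhausting the candidate pool, which candidates get evaluated depends on the order they happen to be proposed in — one intuition for why early stopping can increase the variety of final model performance across runs.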