In-Context Decision Making for Optimizing Complex AutoML Pipelines

📅 2025-08-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Modern heterogeneous ML pipelines, which span adaptive multi-stage operations such as model selection, fine-tuning, and ensemble construction, pose new challenges for AutoML, particularly high-dimensional, non-stationary, and computationally heterogeneous search spaces. To address this, the authors extend the classical Combined Algorithm Selection and Hyperparameter optimization (CASH) framework and propose PS-PFN, a method that brings posterior sampling to the max k-armed bandit setting. PS-PFN leverages prior-data fitted networks (PFNs) to model reward distributions individually per arm via in-context learning, and extends the selection rule to account for varying arm-pull costs. The approach unifies optimization across diverse pipeline configurations while explicitly accounting for heterogeneous computational costs. Evaluated on three benchmark tasks, PS-PFN outperforms state-of-the-art AutoML and bandit-based methods in both final model performance and search efficiency. Code and datasets are publicly available.

📝 Abstract
Combined Algorithm Selection and Hyperparameter Optimization (CASH) has been fundamental to traditional AutoML systems. However, with the advancements of pre-trained models, modern ML workflows go beyond hyperparameter optimization and often require fine-tuning, ensembling, and other adaptation techniques. While the core challenge of identifying the best-performing model for a downstream task remains, the increasing heterogeneity of ML pipelines demands novel AutoML approaches. This work extends the CASH framework to select and adapt modern ML pipelines. We propose PS-PFN to efficiently explore and exploit adapting ML pipelines by extending Posterior Sampling (PS) to the max k-armed bandit problem setup. PS-PFN leverages prior-data fitted networks (PFNs) to efficiently estimate the posterior distribution of the maximal value via in-context learning. We show how to extend this method to consider varying costs of pulling arms and to use different PFNs to model reward distributions individually per arm. Experimental results on one novel and two existing standard benchmark tasks demonstrate the superior performance of PS-PFN compared to other bandit and AutoML strategies. We make our code and data available at https://github.com/amirbalef/CASHPlus.
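The abstract's core idea, posterior sampling (Thompson sampling) extended to the max k-armed bandit, can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: a Gaussian posterior stands in for the pre-trained PFN that PS-PFN uses to estimate the posterior of the maximal value in-context, and the names `Arm` and `ps_max_bandit` are hypothetical.

```python
import math
import random

class Arm:
    """Tracks observations for one candidate pipeline (arm).
    In PS-PFN the posterior over the maximal achievable value comes
    from a prior-data fitted network via in-context learning; the
    Gaussian posterior below is only a simple stand-in."""

    def __init__(self):
        self.rewards = []

    def observe(self, r):
        self.rewards.append(r)

    def sample_max_posterior(self, horizon_left):
        # Draw from an approximate posterior of the best value this
        # arm could still produce: the max of `horizon_left` i.i.d.
        # draws from a Gaussian posterior over the arm's mean reward.
        if not self.rewards:
            return float("inf")  # force each arm to be tried once
        n = len(self.rewards)
        mu = sum(self.rewards) / n
        var = sum((r - mu) ** 2 for r in self.rewards) / n + 1e-6
        sigma = math.sqrt(var / n)
        return max(random.gauss(mu, sigma) for _ in range(max(horizon_left, 1)))

def ps_max_bandit(arm_pullers, budget, rng_seed=0):
    """Posterior sampling for the max k-armed bandit: each round,
    sample every arm's posterior of the maximal value and pull the
    argmax. Returns the best reward observed (the max-bandit objective,
    i.e. the best model found during the search)."""
    random.seed(rng_seed)
    arms = [Arm() for _ in arm_pullers]
    best = -float("inf")
    for t in range(budget):
        samples = [a.sample_max_posterior(budget - t) for a in arms]
        i = max(range(len(arms)), key=lambda j: samples[j])
        r = arm_pullers[i]()  # e.g. run one HPO trial of pipeline i
        arms[i].observe(r)
        best = max(best, r)
    return best
```

In the CASH setting each arm corresponds to one pipeline family (e.g. a model class plus its adaptation strategy), and pulling it runs one trial of that pipeline's inner optimizer; the objective is the best score found within the budget rather than the cumulative reward.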
Problem

Research questions and friction points this paper is trying to address.

Classical CASH does not cover modern pipelines that require fine-tuning, ensembling, and other adaptation steps
Complex AutoML search spaces are high-dimensional, non-stationary, and heterogeneous in computational cost
Selecting among adapting pipelines is a max k-armed bandit problem with varying arm-pull costs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends CASH framework for modern ML pipelines
Uses PS-PFN with posterior sampling for bandits
Leverages PFNs for in-context learning efficiency
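The cost-aware extension mentioned in the abstract (varying costs of pulling arms) can be illustrated with a simple selection rule. This is a hypothetical criterion for illustration only, not the paper's exact rule: it ranks arms by sampled improvement over the incumbent per unit cost and skips arms that no longer fit the remaining budget.

```python
def cost_aware_select(sampled_values, best_so_far, costs, budget_left):
    """Hypothetical cost-aware arm selection: given one posterior
    sample of the maximal value per arm, pick the affordable arm with
    the highest sampled improvement per unit cost. Returns the arm
    index, or None if no arm fits the remaining budget."""
    best_i, best_score = None, -float("inf")
    for i, (v, c) in enumerate(zip(sampled_values, costs)):
        if c > budget_left:
            continue  # this arm can no longer be afforded
        score = (v - best_so_far) / c  # improvement-per-cost heuristic
        if score > best_score:
            best_i, best_score = i, score
    return best_i
```

Under this rule a cheap arm with a modest sampled value can beat an expensive arm with a slightly higher one, which is the behavior a cost-aware bandit needs when pipeline trials have very different runtimes.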