🤖 AI Summary
Hyperparameter tuning for the Adam optimizer is computationally expensive, and tuned settings generalize poorly across tasks. Method: We propose a framework, Adam-PFN with CDF-augment, that jointly addresses these limitations. First, we introduce Adam-PFN, a pre-trained surrogate model designed specifically for Adam hyperparameter optimization, which transfers knowledge from TaskSet learning curves. Second, we design CDF-augment, a data augmentation strategy that uses cumulative distribution functions to generate additional learning curves, improving sample efficiency in freeze-thaw Bayesian optimization. Contribution/Results: Experiments show that our method improves learning curve extrapolation accuracy and speeds up hyperparameter convergence under low evaluation budgets, with strong robustness both in-distribution (on TaskSet) and out-of-distribution (on unseen tasks), enabling efficient, transferable optimizer hyperparameter tuning.
📝 Abstract
The Adam optimizer remains one of the most widely used optimizers in deep learning, and effectively tuning its hyperparameters is key to optimizing performance. However, tuning can be tedious and costly. Freeze-thaw Bayesian optimization (BO) is a promising recent approach to low-budget hyperparameter tuning, but it is limited by generic surrogates that lack prior knowledge of how hyperparameters affect learning. We propose Adam-PFN, a new surrogate model for freeze-thaw BO of Adam's hyperparameters, pre-trained on learning curves from TaskSet, together with a new learning curve augmentation method, CDF-augment, which artificially increases the number of available training examples. Our approach improves learning curve extrapolation and accelerates hyperparameter optimization on TaskSet evaluation tasks, with strong performance on out-of-distribution (OOD) tasks.
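The abstract does not spell out how CDF-augment works. As a purely illustrative sketch, assuming the augmentation warps a normalized learning curve through a random monotone CDF (a Kumaraswamy CDF is used here as a stand-in; the function name and parameters are hypothetical, not from the paper), the idea of generating new plausible curves from one observed curve might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def cdf_augment(curve, shape_range=(0.5, 2.0)):
    """Hypothetical CDF-style augmentation: normalize the curve to [0, 1],
    pass it through a random monotone CDF, and map it back. A monotone
    warp preserves the ordering of points, so an increasing learning
    curve stays increasing while its shape changes."""
    lo, hi = curve.min(), curve.max()
    norm = (curve - lo) / (hi - lo + 1e-12)      # normalize to [0, 1]
    a = rng.uniform(*shape_range)                # random Kumaraswamy shape params
    b = rng.uniform(*shape_range)
    warped = 1.0 - (1.0 - norm**a) ** b          # Kumaraswamy(a, b) CDF on [0, 1]
    return lo + warped * (hi - lo)               # map back to the original range

# Toy "accuracy" learning curve: saturating exponential over 50 epochs.
curve = 1.0 - np.exp(-0.1 * np.arange(50))
augmented = cdf_augment(curve)
```

Each call draws fresh shape parameters, so one observed curve yields many distinct but qualitatively similar training curves, which is the sample-efficiency benefit the abstract describes.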