Hyperparameter Optimization in Machine Learning

📅 2024-10-30
🏛️ arXiv.org
📈 Citations: 1
Influential: 1
🤖 AI Summary
This paper addresses the inefficiency and poor scalability of manual hyperparameter tuning in large-scale machine learning. It systematically surveys hyperparameter optimization (HPO), unifying and classifying five mainstream paradigms: random/low-discrepancy search, bandit-based methods, Bayesian optimization, population-based (evolutionary) algorithms, and gradient-based differentiable optimization. The survey further covers extended settings, including online, constrained, and multi-objective HPO. It also draws connections between HPO and neighboring fields such as meta-learning and neural architecture search, yielding a knowledge framework that articulates methodological principles, applicability boundaries, and inherent limitations. By tracing the technical evolution of the field and identifying key open challenges, the study provides a theoretically grounded yet practically actionable foundation for automated machine learning.
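Of the five paradigms the summary lists, random search is the simplest baseline. The sketch below is illustrative only (it is not code from the paper): `validation_loss` is a hypothetical stand-in for an expensive train-and-evaluate step, and the hyperparameter names and ranges are assumptions chosen for the example.

```python
import random

def validation_loss(lr, reg):
    # Hypothetical stand-in for an expensive train-and-evaluate step;
    # in practice this would fit a model and return its validation error.
    return (lr - 0.1) ** 2 + (reg - 0.01) ** 2

def random_search(n_trials, seed=0):
    rng = random.Random(seed)
    best_cfg, best_loss = None, float("inf")
    for _ in range(n_trials):
        # Sample learning rate and regularization strength on log-uniform
        # scales, a common choice for scale-sensitive hyperparameters.
        lr = 10 ** rng.uniform(-4, 0)
        reg = 10 ** rng.uniform(-5, -1)
        loss = validation_loss(lr, reg)
        if loss < best_loss:
            best_cfg, best_loss = (lr, reg), loss
    return best_cfg, best_loss

best_cfg, best_loss = random_search(200)
```

Despite its simplicity, log-uniform random search is a strong baseline because it does not waste trials on a fixed grid and explores each hyperparameter axis independently.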

📝 Abstract
Hyperparameters are configuration variables controlling the behavior of machine learning algorithms. They are ubiquitous in machine learning and artificial intelligence and the choice of their values determines the effectiveness of systems based on these technologies. Manual hyperparameter search is often unsatisfactory and becomes infeasible when the number of hyperparameters is large. Automating the search is an important step towards advancing, streamlining, and systematizing machine learning, freeing researchers and practitioners alike from the burden of finding a good set of hyperparameters by trial and error. In this survey, we present a unified treatment of hyperparameter optimization, providing the reader with examples, insights into the state-of-the-art, and numerous links to further reading. We cover the main families of techniques to automate hyperparameter search, often referred to as hyperparameter optimization or tuning, including random and quasi-random search, bandit-, model-, population-, and gradient-based approaches. We further discuss extensions, including online, constrained, and multi-objective formulations, touch upon connections with other fields such as meta-learning and neural architecture search, and conclude with open questions and future research directions.
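Among the bandit-based approaches the abstract mentions, successive halving is a representative scheme: evaluate many configurations cheaply, then repeatedly keep the best fraction and grow their budget. The sketch below is a minimal illustration under assumed names; `eval_config` is a hypothetical proxy for training a configuration for a given budget, not an API from the paper.

```python
import random

def eval_config(cfg, budget):
    # Hypothetical proxy for "train config cfg for `budget` epochs and
    # return validation loss": more budget means a less noisy estimate.
    rng = random.Random(hash((cfg, budget)) & 0xFFFF)
    return (cfg - 0.1) ** 2 + rng.uniform(0, 1.0 / budget)

def successive_halving(n_configs=16, min_budget=1, eta=2, seed=0):
    rng = random.Random(seed)
    # Start from randomly sampled configurations (here: a learning rate).
    configs = [10 ** rng.uniform(-4, 0) for _ in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        # Evaluate all survivors at the current budget, keep the best
        # 1/eta fraction, and multiply the per-config budget by eta.
        scored = sorted(configs, key=lambda c: eval_config(c, budget))
        configs = scored[: max(1, len(configs) // eta)]
        budget *= eta
    return configs[0]

best = successive_halving()
```

The design trade-off is that early rounds use cheap, noisy evaluations to prune aggressively, while later rounds spend the larger budget only on promising configurations.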
Problem

Research questions and friction points this paper is trying to address.

Automating hyperparameter search to improve machine learning efficiency
Comparing state-of-the-art hyperparameter optimization techniques and methods
Addressing challenges in online, constrained, and multi-objective hyperparameter tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified treatment of the main HPO technique families: random/quasi-random, bandit-, model-, population-, and gradient-based search
Coverage of extended formulations: online, constrained, and multi-objective HPO
Connections to meta-learning and neural architecture search, plus open questions and future research directions