🤖 AI Summary
This study investigates whether default hyperparameters in machine learning libraries serve as effective initial points for Bayesian optimization to accelerate convergence. The authors conduct the first large-scale empirical evaluation by initializing optimization with samples drawn from a truncated Gaussian distribution centered around default values and comparing this strategy against uniform random initialization. Experiments span three optimization frameworks—BoTorch, Optuna, and Scikit-Optimize—combined with Random Forest, SVM, and MLP models across five standard datasets. Results show that default hyperparameters do not yield statistically significant performance improvements (p = 0.141–0.908), and any early advantage they confer dissipates as optimization progresses. These findings suggest that default values lack informative prior knowledge, challenging the common heuristic of using them as starting points in hyperparameter optimization.
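The initialization strategy described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the parameter bounds, the default value, and the prior width `sigma` are hypothetical, and scipy's `truncnorm` standardized-bounds parameterization is used.

```python
import numpy as np
from scipy.stats import truncnorm

def sample_default_informed(default, low, high, sigma, n, seed=None):
    """Draw n initial points from a Gaussian centered at `default`,
    truncated to the search interval [low, high]."""
    # truncnorm expects bounds standardized by loc/scale
    a, b = (low - default) / sigma, (high - default) / sigma
    return truncnorm.rvs(a, b, loc=default, scale=sigma, size=n,
                         random_state=np.random.default_rng(seed))

def sample_uniform(low, high, n, seed=None):
    """Baseline: uniform random initialization over [low, high]."""
    return np.random.default_rng(seed).uniform(low, high, size=n)

# Hypothetical example: n_estimators for a Random Forest, default 100,
# searched over [10, 500] (bounds chosen for illustration only).
informed = sample_default_informed(default=100, low=10, high=500, sigma=50, n=5, seed=0)
baseline = sample_uniform(low=10, high=500, n=5, seed=0)
```

Both samplers respect the search bounds; the informed one simply concentrates the initial design near the library default before BO takes over.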
📝 Abstract
Bayesian Optimization (BO) is a standard tool for hyperparameter tuning thanks to its sample efficiency on expensive black-box functions. While most BO pipelines begin with uniform random initialization, default hyperparameter values shipped with popular ML libraries such as scikit-learn encode implicit expert knowledge and could serve as informative starting points that accelerate convergence. This hypothesis, despite its intuitive appeal, has remained largely unexamined. We formalize the idea by initializing BO with points drawn from truncated Gaussian distributions centered at library defaults and compare the resulting trajectories against a uniform-random baseline. We conduct an extensive empirical evaluation spanning three BO back-ends (BoTorch, Optuna, Scikit-Optimize), three model families (Random Forests, Support Vector Machines, Multilayer Perceptrons), and five benchmark datasets covering classification and regression tasks. Performance is assessed through convergence speed and final predictive quality, and statistical significance is determined via one-sided binomial tests. Across all conditions, default-informed initialization yields no statistically significant advantage over purely random sampling, with p-values ranging from 0.141 to 0.908. A sensitivity analysis on the prior variance confirms that, while tighter concentration around the defaults improves early evaluations, this transient benefit vanishes as optimization progresses, leaving final performance unchanged. Our results provide no evidence that default hyperparameters encode directional information useful for guiding optimization. We therefore recommend that practitioners treat hyperparameter tuning as an integral part of model development and favor principled, data-driven search strategies over heuristic reliance on library defaults.
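The one-sided binomial test mentioned above can be illustrated as follows. This is a hedged sketch: the win counts are invented for demonstration and are not the paper's data; only the test construction mirrors the described methodology (pairing the two strategies and counting how often the informed one wins).

```python
from scipy.stats import binomtest

# Hypothetical paired comparison: in 15 matched runs, the default-informed
# initialization achieved the better final score 9 times.
wins, trials = 9, 15

# Under H0 (no advantage), wins ~ Binomial(trials, 0.5).
# alternative="greater" asks whether the informed strategy wins
# more often than chance.
result = binomtest(wins, trials, p=0.5, alternative="greater")
```

With these illustrative counts the p-value is well above 0.05, i.e. 9 wins out of 15 is entirely consistent with a coin flip, which is the same qualitative conclusion the study reports across its conditions.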