🤖 AI Summary
This work addresses the sample complexity of stochastic convex optimization when key parameters—such as the distance from initialization to the optimal solution—are unknown. We propose a robust model selection and adaptive regularization framework that achieves *perfect adaptivity* to the unknown optimal distance, automatically tuning both the learning rate and regularization strength without any prior knowledge. Theoretically, our method attains the optimal sample complexity bound (up to a double-logarithmic factor) known for settings with oracle access to the parameter, and reveals a fundamental separation between sample and computational complexity in the parameter-free regime. Moreover, it supports joint adaptivity across multiple structural assumptions (e.g., smoothness, strong convexity). Empirical evaluation demonstrates significant mitigation of overfitting on small validation sets: notably in few-shot fine-tuning of CLIP on CIFAR-10 and in prompt engineering for shape counting with Gemini.
📝 Abstract
We study the sample complexity of stochastic convex optimization when problem parameters, e.g., the distance to optimality, are unknown. We pursue two strategies. First, we develop a reliable model selection method that avoids overfitting the validation set. This method allows us to generically tune the learning rate of stochastic optimization methods to match the optimal known-parameter sample complexity up to $\log\log$ factors. Second, we develop a regularization-based method that is specialized to the case where only the distance to optimality is unknown. This method provides perfect adaptivity to the unknown distance to optimality, demonstrating a separation between the sample and computational complexity of parameter-free stochastic convex optimization. Combining these two methods allows us to simultaneously adapt to multiple problem structures. Experiments on few-shot learning on CIFAR-10 (fine-tuning CLIP models) and on prompt engineering Gemini to count shapes indicate that our reliable model selection method can help mitigate overfitting to small validation sets.
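To make the learning-rate tuning strategy concrete, here is a minimal illustrative sketch (not the paper's method): run SGD on a toy stochastic convex problem over a hypothetical log-spaced grid of learning rates, standing in for the unknown distance to optimality, and select by validation loss. The paper's reliable model selection replaces this naive validation argmin, which can overfit small validation sets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stochastic convex problem: minimize E[(w - x)^2] with x ~ N(w_star, 1).
w_star = 3.0
train = w_star + rng.normal(size=200)
val = w_star + rng.normal(size=20)

def sgd_average(lr, data):
    """Fixed-step SGD from w = 0; returns the running average of iterates."""
    w, avg = 0.0, 0.0
    for t, x in enumerate(data, start=1):
        w -= lr * 2.0 * (w - x)   # stochastic gradient of (w - x)^2
        avg += (w - avg) / t      # online average of iterates
    return avg

# Log-spaced grid of candidate learning rates (one per guess of the
# unknown problem scale); only O(log) candidates are needed.
grid = [2.0 ** k for k in range(-7, 0)]
candidates = {lr: sgd_average(lr, train) for lr in grid}

# Naive selection: pick the candidate with the lowest validation loss.
val_loss = {lr: float(np.mean((w - val) ** 2)) for lr, w in candidates.items()}
best_lr = min(val_loss, key=val_loss.get)
```

The grid, problem instance, and selection rule here are all illustrative assumptions; the paper's contribution is a selection procedure whose guarantees hold even when the validation set is small.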