🤖 AI Summary
This work investigates the convexity of the objective-value sequence (i.e., the optimization curve) produced when minimizing smooth convex functions via gradient descent. Contrary to common intuition, we establish that convexity of this curve is not automatic: it depends critically on the step size, with a sharp transition occurring inside the standard range of convergent step sizes. Using gradient-flow arguments, convex analysis, and Lyapunov-function techniques, we characterize the step-size interval that guarantees a convex optimization curve, and we further analyze when the gradient norm decreases monotonically. The practical consequence concerns early stopping: overly large step sizes can induce non-convex behavior, such as a plateau followed by a steep descent, that misleads convergence assessment, whereas keeping the step size within the derived threshold guarantees a convex curve and thereby makes stopping decisions more reliable and interpretable.
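To make the central object concrete, the following sketch (our own illustration, not code from the paper) runs gradient descent on a simple smooth convex function and numerically tests whether the resulting objective-value sequence is convex, i.e. has nonnegative second differences. The test function, initial point, iteration count, and step sizes are arbitrary choices; the paper's analysis is what determines the precise step-size threshold below which such convexity is guaranteed.

```python
import numpy as np

# Illustrative sketch (not from the paper): gradient descent on a smooth convex
# function, followed by a numerical check of whether the optimization curve
# f(x_0), f(x_1), ... is a convex sequence (nonnegative second differences).

def f(x):
    # Smooth convex test function with smoothness constant L = 1 (our choice).
    return float(np.sum(np.sqrt(1.0 + x ** 2) - 1.0))

def grad_f(x):
    return x / np.sqrt(1.0 + x ** 2)

def objective_curve(step, x0, iters=60):
    x, vals = x0.copy(), []
    for _ in range(iters):
        vals.append(f(x))
        x = x - step * grad_f(x)
    return np.array(vals)

def is_convex_sequence(vals, tol=1e-12):
    # A sequence is convex iff its second differences are nonnegative.
    second_diffs = vals[2:] - 2.0 * vals[1:-1] + vals[:-2]
    return bool(np.all(second_diffs >= -tol))

x0 = np.array([5.0, -3.0])
for step in (0.5, 1.9):  # both lie in the classical convergent range (0, 2/L) for L = 1
    curve = objective_curve(step, x0)
    print(f"step size {step}: optimization curve convex? {is_convex_sequence(curve)}")
```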
📝 Abstract
In this paper, we study when we might expect the optimization curve induced by gradient descent to be *convex* -- precluding, for example, an initial plateau followed by a sharp decrease, which makes it difficult to decide when optimization should stop. Although such undesirable behavior can certainly occur when optimizing general functions, might it also occur in the benign and well-studied case of smooth convex functions? As far as we know, this question has not been tackled in previous work. We show, perhaps surprisingly, that the answer crucially depends on the choice of the step size. In particular, for the range of step sizes which are known to result in monotonic convergence to an optimal value, there is a regime where the optimization curve will be provably convex, and there is a regime where the curve can be non-convex. We also extend our results to gradient flow, and to the closely-related but different question of whether the gradient norm decreases monotonically.
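The abstract's final question, whether the gradient norm decreases monotonically along the iterates, can be probed in the same numerical spirit. Below is a small, self-contained sketch (again our own illustration, with an arbitrary test function and step sizes, not an example from the paper) that records the gradient norm at each iterate and checks whether the sequence is non-increasing.

```python
import numpy as np

# Illustrative sketch (not from the paper): track the gradient norm along
# gradient descent iterates and check whether it is monotonically non-increasing.

def grad_f(x):
    # Gradient of the smooth convex function f(x) = sum(sqrt(1 + x^2) - 1), L = 1.
    return x / np.sqrt(1.0 + x ** 2)

def gradient_norms(step, x0, iters=60):
    x, norms = x0.copy(), []
    for _ in range(iters):
        norms.append(np.linalg.norm(grad_f(x)))
        x = x - step * grad_f(x)
    return np.array(norms)

x0 = np.array([5.0, -3.0])
for step in (0.5, 1.9):  # step sizes chosen inside (0, 2/L) for L = 1
    norms = gradient_norms(step, x0)
    monotone = bool(np.all(np.diff(norms) <= 1e-12))
    print(f"step size {step}: gradient norm non-increasing? {monotone}")
```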