🤖 AI Summary
Optimized first-order methods are usually tailored to a single problem setting, and there has been no general way to carry them over to composite optimization. This paper proposes a systematic framework that converts optimized first-order methods for unconstrained smooth optimization into methods for composite optimization. The framework extends convergence analyses through a set of algebraic identities, so each method no longer needs to be redesigned or reproved for the composite setting, and the derived methods generalize the originals in the same way that proximal gradient descent generalizes gradient descent. This gives a structural connection between optimized methods for smooth and composite optimization. The paper applies the framework to three cases: it establishes stepsize acceleration for proximal gradient descent, proves a convergence rate for the proximal optimized gradient method that is faster than that of FISTA, and derives a new method that improves the state-of-the-art rate for gradient norm minimization in the composite setting. The resulting guarantees improve on those of classical methods in both objective-value convergence and gradient-norm decay, which illustrates the framework's generality.
📝 Abstract
Recent advances in convex optimization have leveraged computer-assisted proofs to develop optimized first-order methods that improve over classical algorithms. However, each optimized method is specially tailored for a particular problem setting, and it is a well-documented challenge to extend optimized methods to other settings due to their highly bespoke design and analysis. We provide a general framework that derives optimized methods for composite optimization directly from those for unconstrained smooth optimization. The derived methods naturally extend the original methods, generalizing how proximal gradient descent extends gradient descent. The key to our result is certain algebraic identities that provide a unified and straightforward way of extending convergence analyses from unconstrained to composite settings. As concrete examples, we apply our framework to establish (1) the phenomenon of stepsize acceleration for proximal gradient descent; (2) a convergence rate for the proximal optimized gradient method which is faster than FISTA; (3) a new method that improves the state-of-the-art rate for minimizing gradient norm in the composite setting.
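The relationship the abstract uses as its template, proximal gradient descent extending gradient descent, can be sketched in a few lines. This is a minimal illustration of classical proximal gradient descent only, not the paper's framework or any of its optimized methods. The test problem (a diagonal quadratic plus an l1 penalty), the constant stepsize 1/L, and all names such as `proximal_gradient`, `prox_l1`, and `prox_zero` are assumptions chosen for the example.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def proximal_gradient(grad_f, prox_g, x0, step, iters):
    """Iterate x <- prox_{step * g}(x - step * grad_f(x))."""
    x = list(x0)
    for _ in range(iters):
        g = grad_f(x)
        y = [xi - step * gi for xi, gi in zip(x, g)]  # plain gradient step on f
        x = prox_g(y, step)                           # proximal step on g
    return x


# Hypothetical test problem: f(x) = 0.5 * sum_i a_i * (x_i - b_i)^2, g(x) = lam * ||x||_1.
a = [1.0, 2.0, 4.0]
b = [3.0, -0.2, 1.0]
lam = 0.5
L = max(a)  # smoothness constant of f


def grad_f(x):
    return [ai * (xi - bi) for ai, xi, bi in zip(a, x, b)]


def prox_l1(y, step):
    # prox of step * lam * ||.||_1 is coordinate-wise soft-thresholding
    return [soft_threshold(yi, step * lam) for yi in y]


def prox_zero(y, step):
    # g = 0: the proximal step is the identity, so the method is gradient descent
    return list(y)


x_composite = proximal_gradient(grad_f, prox_l1, [0.0, 0.0, 0.0], 1.0 / L, 500)
x_smooth = proximal_gradient(grad_f, prox_zero, [0.0, 0.0, 0.0], 1.0 / L, 500)
```

With `prox_zero` the loop is exactly gradient descent and converges to `b`. With `prox_l1` it converges to the coordinate-wise soft-thresholding of `b_i` at level `lam / a_i`, which is the closed-form minimizer of this separable problem.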