🤖 AI Summary
This work proposes the Constrained Dual Unrolling (CDU) framework to address key challenges in constrained optimization, including the slow convergence of classical dual ascent algorithms and the unknown distribution of dual multipliers. CDU jointly approximates the saddle point of the Lagrangian via two coupled neural networks: a primal network that emulates optimization iterations for a given multiplier, and a dual network that generates a trajectory towards the optimal multipliers layer by layer, querying the primal network at each layer. By embedding dual ascent dynamics into an unrolled architecture and using constrained learning to enforce descent in the primal variables and ascent in the dual variables, CDU mitigates uncertainty in the multiplier distribution through an alternating training strategy. Experiments on mixed-integer quadratic programming and wireless power allocation demonstrate that CDU yields near-optimal, near-feasible solutions with strong out-of-distribution generalization.
📝 Abstract
In this paper, we develop unrolled neural networks to solve constrained optimization problems, offering accelerated, learnable counterparts to dual ascent (DA) algorithms. Our framework, termed constrained dual unrolling (CDU), comprises two coupled neural networks that jointly approximate the saddle point of the Lagrangian. The primal network emulates an iterative optimizer that finds a stationary point of the Lagrangian for a given dual multiplier, sampled from an unknown distribution. The dual network generates trajectories towards the optimal multipliers across its layers while querying the primal network at each layer. Departing from standard unrolling, we induce DA dynamics by imposing primal-descent and dual-ascent constraints through constrained learning. We formulate training the two networks as a nested optimization problem and propose an alternating procedure that updates the primal and dual networks in turn, mitigating uncertainty in the multiplier distribution required for primal network training. We numerically evaluate the framework on mixed-integer quadratic programs (MIQPs) and power allocation in wireless networks. In both cases, our approach yields near-optimal, near-feasible solutions and exhibits strong out-of-distribution (OOD) generalization.
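For intuition, the classical dual ascent dynamics that CDU unrolls alternate a primal minimization of the Lagrangian with a projected gradient ascent step on the multiplier. The sketch below illustrates this on a hypothetical toy problem (minimize x² subject to x ≥ 1) chosen for illustration; it is not an example from the paper, and in CDU the closed-form primal step and the fixed-step dual update are replaced by the learned primal and dual networks, respectively.

```python
def dual_ascent(alpha=0.5, iters=200):
    """Classical dual ascent on a toy problem (illustration only):
         minimize x^2  subject to  x >= 1,  i.e.  g(x) = 1 - x <= 0.
       Lagrangian: L(x, lam) = x^2 + lam * (1 - x)."""
    lam = 0.0
    x = 0.0
    for _ in range(iters):
        # Primal step: x = argmin_x L(x, lam), here available in closed form.
        x = lam / 2.0
        # Dual ascent step with projection onto lam >= 0.
        lam = max(0.0, lam + alpha * (1.0 - x))
    return x, lam
```

For this problem the saddle point is x* = 1, λ* = 2, and the iteration converges geometrically; the primal network in CDU plays the role of the `argmin` step, while the dual network's layers play the role of the projected ascent updates.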