On the Gradient Domination of the LQG Problem

📅 2025-07-11

🤖 AI Summary

Problem: Under the classical dynamic-controller parameterization, the linear-quadratic-Gaussian (LQG) cost lacks gradient dominance, so policy gradient (PG) methods have no provable global convergence guarantee. Method: We adopt a history-based parameterization in which the control input is a function of the input-output data from the previous p time steps. Combined with a lifting argument, this parameterization yields gradient dominance and approximate smoothness of the LQG cost. Contribution/Results: We prove global convergence and per-iteration stability guarantees for PG on the LQG problem in both model-based and model-free settings. Numerical experiments on an open-loop unstable system support the convergence guarantees and illustrate convergence under different history lengths.

📝 Abstract

We consider solutions to the linear quadratic Gaussian (LQG) regulator problem via policy gradient (PG) methods. Although PG methods have demonstrated strong theoretical guarantees in solving the linear quadratic regulator (LQR) problem, despite its nonconvex landscape, their theoretical understanding in the LQG setting remains limited. Notably, the LQG problem lacks gradient dominance in the classical parameterization, i.e., with a dynamic controller, which hinders global convergence guarantees. In this work, we study PG for the LQG problem by adopting an alternative parameterization of the set of stabilizing controllers and employing a lifting argument. We refer to this parameterization as a history representation of the control input as it is parameterized by past input and output data from the previous p time-steps. This representation enables us to establish gradient dominance and approximate smoothness for the LQG cost. We prove global convergence and per-iteration stability guarantees for policy gradient LQG in model-based and model-free settings. Numerical experiments on an open-loop unstable system are provided to support the global convergence guarantees and to illustrate convergence under different history lengths of the history representation.
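The history representation described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm; it only assumes a linear policy of the form u_t = K [y_{t-1}, ..., y_{t-p}, u_{t-1}, ..., u_{t-p}] and estimates an average quadratic cost by rollout. The function name `history_policy_cost`, the identity cost weights, the noise model, and the scalar test system are all hypothetical choices made for illustration.

```python
import numpy as np

def history_policy_cost(K, A, B, C, p, T=200, noise_std=0.1, seed=0):
    """Average cost x'x + u'u under u_t = K @ [past p outputs, past p inputs].

    Assumptions (not from the paper): identity cost weights, i.i.d. Gaussian
    process and measurement noise, zero initial state and zero initial history.
    """
    rng = np.random.default_rng(seed)
    n, m = B.shape                      # state and input dimensions
    d = C.shape[0]                      # output dimension
    x = np.zeros(n)
    ys = np.zeros((p, d))               # row 0 holds y_{t-1}, row p-1 holds y_{t-p}
    us = np.zeros((p, m))               # row 0 holds u_{t-1}, row p-1 holds u_{t-p}
    total = 0.0
    for _ in range(T):
        h = np.concatenate([ys.ravel(), us.ravel()])   # stacked history vector
        u = K @ h                                      # history-based control input
        y = C @ x + noise_std * rng.standard_normal(d) # noisy measurement of x_t
        total += x @ x + u @ u
        x = A @ x + B @ u + noise_std * rng.standard_normal(n)
        ys = np.vstack([y, ys[:-1]])                   # shift the newest output in
        us = np.vstack([u, us[:-1]])                   # shift the newest input in
    return total / T

# Hypothetical open-loop unstable scalar system with history length p = 1.
A = np.array([[1.2]]); B = np.array([[1.0]]); C = np.array([[1.0]])
K_stab = np.array([[-1.44, -1.2]])   # u_t = -1.2 * (1.2 * y_{t-1} + u_{t-1})
K_zero = np.zeros((1, 2))            # no feedback: the state diverges
cost_stab = history_policy_cost(K_stab, A, B, C, p=1)
cost_zero = history_policy_cost(K_zero, A, B, C, p=1)
```

Here the gain matrix K has shape (m, p(d + m)), so the history length p directly sets the number of policy parameters. A policy gradient method would treat the rollout cost as the objective in K; how the gradient is computed or estimated is left out of this sketch.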

Problem

Research questions and friction points this paper is trying to address.

The LQG problem lacks gradient dominance under the classical (dynamic controller) parameterization

Study PG for LQG using an alternative parameterization of the stabilizing controllers

Establish gradient dominance and approximate smoothness for the LQG cost
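For reference, gradient dominance is commonly stated as a Polyak-Łojasiewicz-type inequality. The block below is a generic textbook form, not the paper's statement: the constants \mu and L, the symbol J for the cost, and K^\star for a minimizer are assumed notation, and the paper works with approximate rather than exact smoothness, so its bounds may differ.

```latex
% Gradient dominance (PL-type inequality) with an assumed constant \mu > 0:
J(K) - J(K^\star) \;\le\; \frac{1}{2\mu}\,\bigl\lVert \nabla J(K) \bigr\rVert^{2}

% If J were exactly L-smooth, one gradient step K^{+} = K - \tfrac{1}{L}\nabla J(K) gives
J(K^{+}) \;\le\; J(K) - \frac{1}{2L}\,\bigl\lVert \nabla J(K) \bigr\rVert^{2}

% Combining the two inequalities yields a linear rate:
J(K^{+}) - J(K^\star) \;\le\; \Bigl(1 - \frac{\mu}{L}\Bigr)\bigl(J(K) - J(K^\star)\bigr)
```

This is why the absence of gradient dominance matters: without it, a small gradient does not imply a small optimality gap, and a nonconvex cost can stall at a suboptimal stationary point.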

Innovation

Methods, ideas, or system contributions that make the work stand out.

Alternative parameterization of stabilizing controllers

History representation using past input-output data

Proof of gradient dominance and approximate smoothness for the LQG cost
