Cost-Driven Representation Learning for Linear Quadratic Gaussian Control: Part II

📅 2026-03-08

🤖 AI Summary

This work addresses learning low-dimensional state representations for control from partial, potentially high-dimensional observations, focusing on infinite-horizon time-invariant linear quadratic Gaussian (LQG) control. The authors take a cost-driven approach: a latent state space and its dynamics are learned by predicting cumulative costs, and a near-optimal controller is then designed from the learned latent model. Two variants are studied, one that learns the latent transition function explicitly and one, resembling MuZero, that learns it implicitly. The paper gives finite-sample guarantees for both the representation function and the controller. A key technical step is proving persistency of excitation for a new stochastic process that arises in the analysis of quadratic regression, a result that may be of independent interest.
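
For orientation, the setting can be written in generic textbook notation. The symbols below ($A$, $B$, $C$, $Q$, $R$, $w_t$, $v_t$, $\Sigma_w$, $\Sigma_v$) are assumptions chosen for illustration and are not taken from the paper:

```latex
\begin{aligned}
x_{t+1} &= A x_t + B u_t + w_t, & w_t &\sim \mathcal{N}(0, \Sigma_w), \\
y_t &= C x_t + v_t, & v_t &\sim \mathcal{N}(0, \Sigma_v), \\
c_t &= x_t^\top Q x_t + u_t^\top R u_t, \\
J &= \limsup_{T \to \infty} \frac{1}{T}\, \mathbb{E}\!\left[ \sum_{t=0}^{T-1} c_t \right].
\end{aligned}
```

Here $x_t$ is the hidden state, $u_t$ the control input, $y_t$ the partial and possibly high-dimensional observation, $c_t$ the stage cost, and $J$ the infinite-horizon average cost to be minimized. Time-invariance means the matrices do not depend on $t$.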

📝 Abstract

We study the problem of state representation learning for control from partial and potentially high-dimensional observations. We approach this problem via cost-driven state representation learning, in which we learn a dynamical model in a latent state space by predicting cumulative costs. In particular, we establish finite-sample guarantees on finding a near-optimal representation function and a near-optimal controller using the learned latent model for infinite-horizon time-invariant Linear Quadratic Gaussian (LQG) control. We study two approaches to cost-driven representation learning, which differ in whether the transition function of the latent state is learned explicitly or implicitly. The first approach has also been investigated in Part I of this work, for finite-horizon time-varying LQG control. The second approach closely resembles MuZero, a recent breakthrough in empirical reinforcement learning, in that it learns latent dynamics implicitly by predicting cumulative costs. A key technical contribution of this Part II is to prove persistency of excitation for a new stochastic process that arises from the analysis of quadratic regression in our approach, and may be of independent interest.
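
The idea of recovering a representation by quadratic regression on costs can be illustrated with a minimal synthetic sketch. This is not the paper's algorithm: the feature vector `h` stands in for a history of observations and controls, the target is a noisy quadratic cost rather than a true cumulative cost from an LQG rollout, and all names (`quad_features`, `fit_quadratic`, `factor_representation`, `M_true`) are hypothetical. It assumes the cost has the form `||M h||^2 + b`, fits the symmetric matrix `P = M^T M` by least squares, and factors it to obtain a low-dimensional linear representation.

```python
import numpy as np

def quad_features(h):
    """Entries of h h^T on and above the diagonal, plus a constant column."""
    d = h.shape[1]
    iu, ju = np.triu_indices(d)
    # Off-diagonal terms appear twice in h^T P h, so scale them by 2.
    scale = np.where(iu == ju, 1.0, 2.0)
    feats = h[:, iu] * h[:, ju] * scale
    return np.hstack([feats, np.ones((h.shape[0], 1))])

def fit_quadratic(h, cost):
    """Least-squares fit of cost ~ h^T P h + b with P symmetric."""
    d = h.shape[1]
    theta, *_ = np.linalg.lstsq(quad_features(h), cost, rcond=None)
    iu, ju = np.triu_indices(d)
    P = np.zeros((d, d))
    P[iu, ju] = theta[:-1]
    P = P + P.T - np.diag(np.diag(P))  # mirror the upper triangle
    return P, theta[-1]

def factor_representation(P, n):
    """Rank-n factor M (n x d) with M^T M close to P, via eigendecomposition."""
    evals, evecs = np.linalg.eigh(P)
    top = np.argsort(evals)[-n:]
    lam = np.clip(evals[top], 0.0, None)  # guard against small negative values
    return (evecs[:, top] * np.sqrt(lam)).T

# Synthetic data (assumed for illustration only).
rng = np.random.default_rng(0)
d, n, N = 6, 2, 2000
M_true = rng.standard_normal((n, d))
h = rng.standard_normal((N, d))
cost = np.sum((h @ M_true.T) ** 2, axis=1) + 0.5 + 0.01 * rng.standard_normal(N)

P_hat, b_hat = fit_quadratic(h, cost)
M_hat = factor_representation(P_hat, n)
z = h @ M_hat.T  # latent states, shape (N, n)
```

Because only `M^T M` is identified from the costs, `M_hat` matches `M_true` at best up to an orthogonal transformation of the latent space. The least-squares step is well posed only when the quadratic features are sufficiently exciting, which is the role a persistency-of-excitation condition plays in this kind of analysis.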

Problem

Research questions and friction points this paper is trying to address.

state representation learning

Linear Quadratic Gaussian control

partial observations

high-dimensional observations

infinite-horizon control

Innovation

Methods, ideas, or system contributions that make the work stand out.

cost-driven representation learning

Linear Quadratic Gaussian control

latent dynamics