🤖 AI Summary
This paper studies online adaptive control of the risk-sensitive linear quadratic regulator (RS-LQR) in a finite-horizon episodic setting. Because the system dynamics are unknown and must be learned online, we propose two algorithms: a greedy controller based on least-squares estimation, and a variant of it that injects exploration noise. We establish the first regret bounds for episodic RS-LQR, distinguishing two regimes: under an identifiability assumption, the greedy algorithm achieves $\tilde{O}(\log N)$ regret; without that assumption, the exploration-based variant attains a sublinear $\tilde{O}(\sqrt{N})$ bound, where $N$ is the total number of episodes. Key technical contributions include a perturbation analysis of the risk-sensitive Riccati equation, a characterization of the performance loss incurred by applying suboptimal controllers during learning, and an exploration–exploitation trade-off in the design of the second algorithm. To our knowledge, this is the first work on online adaptive RS-LQR control with provable regret guarantees.
📝 Abstract
The risk-sensitive linear quadratic regulator is one of the most fundamental problems in risk-sensitive optimal control. In this paper, we study online adaptive control of the risk-sensitive linear quadratic regulator in the finite-horizon episodic setting. We propose a simple least-squares greedy algorithm and show that it achieves $\widetilde{\mathcal{O}}(\log N)$ regret under a specific identifiability assumption, where $N$ is the total number of episodes. If the identifiability assumption is not satisfied, we propose incorporating exploration noise into the least-squares-based algorithm, resulting in an algorithm with $\widetilde{\mathcal{O}}(\sqrt{N})$ regret. To the best of our knowledge, this is the first set of regret bounds for the episodic risk-sensitive linear quadratic regulator. Our proof relies on a perturbation analysis of the less-standard Riccati equations that arise in risk-sensitive linear quadratic control, and on a delicate analysis of the loss in the risk-sensitive performance criterion caused by applying a suboptimal controller during online learning.
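As a rough illustration of the least-squares estimation step that both algorithms build on, the sketch below fits $[A\ B]$ from episodic rollouts with optional exploration noise added to the input. Everything specific here is an assumption made for illustration, not taken from the paper: the dynamics form $x_{t+1} = A x_t + B u_t + w_t$ with Gaussian noise, the zero initial state, the fixed placeholder gain `K`, the small ridge term, and all function and parameter names. The controller synthesis step (solving the risk-sensitive Riccati equation with the current estimate) is deliberately omitted.

```python
import numpy as np


def simulate_episode(A, B, K, horizon, noise_std, explore_std, rng):
    """Roll out x_{t+1} = A x_t + B u_t + w_t with u_t = K x_t + exploration noise.

    Assumed model for illustration; the state starts at zero in each episode.
    """
    n, m = B.shape
    x = np.zeros(n)
    regressors, targets = [], []
    for _ in range(horizon):
        # Linear feedback plus injected Gaussian exploration noise.
        u = K @ x + explore_std * rng.standard_normal(m)
        x_next = A @ x + B @ u + noise_std * rng.standard_normal(n)
        regressors.append(np.concatenate([x, u]))
        targets.append(x_next)
        x = x_next
    return np.array(regressors), np.array(targets)


def least_squares_estimate(Z, Y, reg=1e-6):
    """Ridge-regularized least squares for Theta = [A B] from Y ~ Z Theta^T."""
    d = Z.shape[1]
    theta_T = np.linalg.solve(Z.T @ Z + reg * np.eye(d), Z.T @ Y)
    return theta_T.T


def estimate_from_episodes(A, B, num_episodes, horizon, seed=0,
                           noise_std=0.1, explore_std=1.0):
    """Collect data over several episodes and return the estimate of [A B]."""
    rng = np.random.default_rng(seed)
    n, m = B.shape
    K = np.zeros((m, n))  # hypothetical placeholder gain, not a greedy controller
    Zs, Ys = [], []
    for _ in range(num_episodes):
        Z, Y = simulate_episode(A, B, K, horizon, noise_std, explore_std, rng)
        Zs.append(Z)
        Ys.append(Y)
    return least_squares_estimate(np.vstack(Zs), np.vstack(Ys))


if __name__ == "__main__":
    A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
    B_true = np.array([[0.5], [1.0]])
    theta_hat = estimate_from_episodes(A_true, B_true, num_episodes=200, horizon=10)
    err = np.linalg.norm(theta_hat - np.hstack([A_true, B_true]))
    print("Frobenius estimation error:", err)
```

In a greedy scheme of the kind the abstract describes, `K` would be recomputed from the current estimate after each episode. With `explore_std=0` and such a gain the regressors may fail to excite all directions, which is one informal way to see why the analysis separates the identifiable case from the general one.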