Fast Strategy Solving for the Informed Player in Two-Player Zero-Sum Linear-Quadratic Differential Games with One-Sided Information

📅 2026-05-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes an efficient method for real-time sub-game solving in two-player zero-sum linear-quadratic differential games with one-sided payoff information. The approach formulates the informed player's Nash equilibrium computation as a bilevel optimization problem: the outer level optimizes the signaling strategy (when and how to reveal information through control), while the inner level computes the optimal closed-loop control via a game-tree linear-quadratic regulator (LQR). The bilevel problem is solved with an adjoint-enabled backpropagation scheme in which a backward LQR pass is followed by a forward gradient descent pass that improves the signaling. On variants of a homing problem with an 8-dimensional state space, 2-dimensional action spaces, and a discrete horizon of 10 stages, the algorithm approximates Nash equilibria at approximately 10 Hz, enabling robust game-theoretic planning under information asymmetry and random disturbances.
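To make the "backward LQR pass" concrete, the following is a minimal sketch of a standard finite-horizon Riccati sweep for a scalar, single-player system. It is not the paper's game-tree LQR: the scalar dynamics, the function names `lqr_backward` and `rollout_cost`, and all numeric values are assumptions chosen for illustration only.

```python
def lqr_backward(a, b, q, r, q_f, horizon):
    """Backward Riccati sweep for x[k+1] = a*x[k] + b*u[k] (scalar case).

    Stage cost is q*x**2 + r*u**2 and terminal cost is q_f*x**2.
    Returns feedback gains (u[k] = -gains[k] * x[k]) and cost-to-go
    coefficients p, where p[k] * x**2 is the optimal cost from stage k.
    """
    p = [0.0] * (horizon + 1)
    gains = [0.0] * horizon
    p[horizon] = q_f
    for k in range(horizon - 1, -1, -1):  # sweep backward in time
        denom = r + b * b * p[k + 1]
        gains[k] = a * b * p[k + 1] / denom
        p[k] = q + a * a * p[k + 1] - (a * b * p[k + 1]) ** 2 / denom
    return gains, p


def rollout_cost(a, b, q, r, q_f, gains, x0):
    """Simulate the closed loop forward and accumulate the quadratic cost."""
    x, cost = x0, 0.0
    for gain in gains:
        u = -gain * x
        cost += q * x * x + r * u * u
        x = a * x + b * u
    return cost + q_f * x * x


# Hypothetical toy problem with the same horizon length (K = 10) as the summary.
gains, p = lqr_backward(a=1.0, b=1.0, q=1.0, r=1.0, q_f=1.0, horizon=10)
cost = rollout_cost(1.0, 1.0, 1.0, 1.0, 1.0, gains, x0=2.0)
```

Up to floating-point error, the simulated closed-loop `cost` should agree with `p[0] * x0**2`, which is a quick sanity check on the sweep. In a bilevel scheme of the kind summarized above, a value computed this way would be the quantity that an outer gradient step differentiates.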

📝 Abstract

We study finite-horizon two-player zero-sum differential games with one-sided payoff information ($G$), where the informed player (P1) knows the game payoff, while P2 only has a public belief over a finite set of possible payoffs. In this case, P1's Nash equilibrium (NE) behavioral strategy may control the release of the type information or even resort to manipulating P2's belief. Previous studies revealed an atomic structure of the NE of $G$ with general nonlinear dynamics and payoffs, leading to tractable NE approximation. Implementing such approximation schemes for real-time sub-game solving, however, has not been achieved, yet is desired for applications where sim-to-real gaps exist and robust control is required. This paper improves the computational efficiency of sub-game solving for P1 during $G$ with linear dynamics and quadratic losses. Specifically, we show that P1's NE computation can be formulated as a bi-level optimization problem where the outer level optimizes the "signaling" strategy, i.e., when and how to reveal information through control, and the inner level is a game-tree LQR that solves for the optimal closed-loop control. This bi-level problem is solved via an adjoint-enabled backpropagation scheme: a "backward" LQR pass is followed by a "forward" gradient descent pass for improving the signaling. We apply the proposed algorithm to approximate NEs for variants of a homing problem with an 8D state space, 2D action spaces, and a discrete time horizon of $K=10$. The algorithm achieves $\approx$10 Hz sub-game solving, enabling robust game-theoretic planning under information asymmetry and random disturbances.
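As an illustration of how a signaling strategy can move P2's public belief, the sketch below applies Bayes' rule to a finite set of payoff types. It assumes P1 randomizes over finitely many candidate actions per type; the function name `posterior_belief` and the probabilities are hypothetical and are not taken from the paper.

```python
def posterior_belief(prior, action_probs, action):
    """Bayes update of the public belief after observing P1's action.

    prior[i] is the belief that the payoff type is i.
    action_probs[i][a] is the (assumed) probability that a type-i P1
    plays candidate action a under its behavioral strategy.
    """
    joint = [p * probs[action] for p, probs in zip(prior, action_probs)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("observed action has zero probability under the prior")
    return [j / total for j in joint]


prior = [0.5, 0.5]

# A partially revealing strategy: each type favors a different action.
revealing = [[0.8, 0.2], [0.2, 0.8]]
belief_after_action_0 = posterior_belief(prior, revealing, action=0)

# A non-revealing strategy: both types act identically, so the belief stays put.
concealing = [[0.5, 0.5], [0.5, 0.5]]
belief_unchanged = posterior_belief(prior, concealing, action=0)
```

With these numbers, observing action 0 under the revealing strategy shifts the belief toward the first type, while the concealing strategy leaves the prior unchanged. Choosing where between these extremes to operate at each stage is the kind of decision the outer "signaling" level is described as optimizing.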

Problem

Research questions and friction points this paper is trying to address.

zero-sum differential games

one-sided information

sub-game solving

Nash equilibrium

real-time planning

Innovation

Methods, ideas, or system contributions that make the work stand out.

bilevel optimization

signaling strategy

LQR