🤖 AI Summary
This paper studies online non-stochastic control of population dynamics under partial observability, where only a low-dimensional representation of the state is observable and the system evolves on the probability simplex. Existing approaches face three challenges in this setting: constructing oblivious signals when zero control is infeasible, designing a sufficiently expressive convex controller parameterization, and preserving cost convexity when the control must be kept on the simplex. To address them, we propose three components: (1) oblivious signals built from hypothetical observations under a constant control adapted to the simplex; (2) a controller parameterization that approximates general control policies linear in non-oblivious observations; and (3) a convex extension surrogate loss that avoids projection-induced non-convexity. We show that the resulting controller achieves the optimal $\tilde{O}(\sqrt{T})$ regret with respect to a natural class of mixing linear dynamic controllers.
📝 Abstract
We study the problem of controlling population dynamics, a class of linear dynamical systems evolving on the probability simplex, from the perspective of online non-stochastic control. While Golowich et al. (2024) analyzed the fully observable setting, we focus on the more realistic, partially observable case, where only a low-dimensional representation of the state is accessible. In classical non-stochastic control, inputs are set as linear combinations of past disturbances. However, under partial observations, disturbances cannot be directly computed. To address this, Simchowitz et al. (2020) proposed constructing oblivious signals, which are counterfactual observations under zero control, as a substitute. This raises several challenges in our setting: (1) how to construct oblivious signals under simplex constraints, where zero control is infeasible; (2) how to design a sufficiently expressive convex controller parameterization tailored to these signals; and (3) how to enforce the simplex constraint on the control when projections may break the convexity of the cost functions. Our main contribution is a new controller that achieves the optimal $\tilde{O}(\sqrt{T})$ regret with respect to a natural class of mixing linear dynamic controllers. To tackle these challenges, we construct signals based on hypothetical observations under a constant control adapted to the simplex domain, and introduce a new controller parameterization that approximates general control policies linear in non-oblivious observations. Furthermore, we employ a novel convex extension surrogate loss, inspired by Lattimore (2024), to bypass the projection-induced convexity issue.
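
To make the idea of a constant-control signal concrete, the following is a minimal sketch in plain Python. It is not the paper's model or algorithm. The update rule, the mixing weights `alpha` and `gamma`, the matrices `A`, `B`, `C`, and all function names are assumptions chosen for illustration. The sketch uses a toy update in which the control must itself lie on the simplex, so a zero control is not a valid input, and it compares the observations produced by a played control sequence with those produced by a fixed constant control under the same disturbances.

```python
# Illustrative sketch only: the update rule, matrices, and names below are
# assumptions for exposition, not the model or algorithm from the paper.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def step(A, B, x, u, w, alpha=0.3, gamma=0.1):
    """One step of an assumed simplex-preserving update.

    With column-stochastic A and B and with x, u, w on the simplex, the result
    is a convex combination of points on the simplex, so it stays on the simplex.
    """
    Ax, Bu = matvec(A, x), matvec(B, u)
    return [(1 - gamma) * ((1 - alpha) * a + alpha * b) + gamma * d
            for a, b, d in zip(Ax, Bu, w)]

def rollout(A, B, C, x0, controls, disturbances):
    """Return the low-dimensional observations y_t = C x_t along a trajectory."""
    x, observations = x0, []
    for u, w in zip(controls, disturbances):
        observations.append(matvec(C, x))
        x = step(A, B, x, u, w)
    return observations

# Column-stochastic dynamics on a 3-state population (hypothetical values).
A = [[0.90, 0.20, 0.10],
     [0.05, 0.70, 0.10],
     [0.05, 0.10, 0.80]]
B = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
# Observation map that merges states 2 and 3 into one coordinate.
C = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 1.0]]

T = 5
x0 = [1.0, 0.0, 0.0]
disturbances = [[0.0, 0.0, 1.0]] * T
u_const = [1 / 3, 1 / 3, 1 / 3]   # constant reference control on the simplex
played = [[1.0, 0.0, 0.0]] * T    # some control sequence actually played

y_actual = rollout(A, B, C, x0, played, disturbances)
y_signal = rollout(A, B, C, x0, [u_const] * T, disturbances)

for t, (ya, ys) in enumerate(zip(y_actual, y_signal)):
    print(t, [round(v, 4) for v in ya], [round(v, 4) for v in ys])
```

Here `y_signal` plays the role of the counterfactual observation sequence under a constant control, and it does not depend on the controls actually played. In the setting the abstract describes, the learner would not see the disturbances, so such a signal would have to be recovered from the observations rather than simulated; the simulation only shows which quantity is meant.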