🤖 AI Summary
This paper studies how a principal can maximize utility in repeated interactions with agents that employ mean-based learning. To overcome the exponential computational complexity of computing exact Stackelberg equilibria, we introduce the concept of a *local Stackelberg equilibrium* and develop a polynomial-time approximation scheme (PTAS) within a smoothed analysis framework. Our method makes the complexity of equilibrium computation polynomial in the size of the agent's action space and guarantees convergence to an ε-approximate local equilibrium. We further establish that the runtime's exponential dependence on 1/ε is fundamentally unavoidable. This work delivers the first strategy optimization paradigm for principal–agent interactions that simultaneously offers strong theoretical guarantees—namely, approximation accuracy and convergence—and practical computational tractability.
📝 Abstract
Motivated by the question of how a principal can maximize its utility in repeated interactions with a learning agent, we study repeated games between a principal and an agent employing a mean-based learning algorithm. Prior work has shown that computing or even approximating the global Stackelberg value in similar settings can require a number of rounds exponential in the size of the agent's action space, making it computationally intractable. In contrast, we shift focus to the computation of local Stackelberg equilibria and introduce an algorithm that, within the smoothed analysis framework, constitutes a Polynomial Time Approximation Scheme (PTAS) for finding an ε-approximate local Stackelberg equilibrium. Notably, the algorithm's runtime is polynomial in the size of the agent's action space yet exponential in 1/ε, a dependency we prove to be unavoidable.
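To make the agent model concrete: a mean-based learner is, roughly, one that at each round plays (with high probability) an action whose cumulative historical payoff is near-maximal. The sketch below is a minimal illustration of this behavior in a toy repeated game; the payoff distributions and the `mean_based_agent` helper are hypothetical and not taken from the paper, which treats the general class of mean-based algorithms.

```python
import random

def mean_based_agent(payoff_history, n_actions):
    """Play an action with maximal cumulative historical payoff
    (ties broken by lowest index) -- the hallmark of mean-based learning."""
    totals = [sum(round_payoffs[a] for round_payoffs in payoff_history)
              for a in range(n_actions)]
    return totals.index(max(totals))

# Toy 2-action repeated game (hypothetical payoffs, not from the paper):
# imagine the principal has committed to a fixed strategy, inducing
# stochastic per-round payoffs for the agent's actions.
random.seed(0)
n_rounds, n_actions = 200, 2
history = [[0.0] * n_actions]  # bootstrap round with zero payoffs
plays = []
for t in range(n_rounds):
    a = mean_based_agent(history, n_actions)
    plays.append(a)
    # Action 1 is better in expectation (mean 0.5 vs. 0.25).
    history.append([random.random() * 0.5, random.random()])

# Over time the agent's play concentrates on the better action.
print(plays.count(0), plays.count(1))
```

This concentration on historically best actions is exactly what the principal exploits when optimizing its own strategy against the agent's learning dynamics.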