🤖 AI Summary
This paper studies contextual sequential contract design in principal–agent Stackelberg games: the principal dynamically designs incentive contracts without knowing the agents' true costs and utilities, relying solely on contextual features and historical feedback. The paper identifies a structural phenomenon called contextual action degeneracy, in which adversarially chosen contexts render certain actions strictly dominated in high-dimensional settings, and shows that it induces a double-exponential separation in learning hardness from classical contextual pricing. Leveraging a contextual-search framework and a pessimistic Stackelberg regret analysis, it establishes an asymptotically tight $T^{1-\Theta(1/d)}$ regret bound over $T$ rounds with $d$-dimensional contexts, and further proves an $\Omega(T^{\frac{1}{2}-\frac{1}{2d}})$ lower bound on the classic Stackelberg regret, revealing a fundamental structural learning bottleneck in contextual mechanism design.
📝 Abstract
In this work, we introduce and study contextual search in general principal-agent games, where a principal repeatedly interacts with agents by offering contracts based on contextual information and historical feedback, without knowing the agents' true costs or rewards. Our model generalizes classical contextual pricing by accommodating richer agent action spaces. Over $T$ rounds with $d$-dimensional contexts, we establish an asymptotically tight exponential $T^{1-\Theta(1/d)}$ bound in terms of the pessimistic Stackelberg regret, benchmarked against the best utility for the principal that is consistent with the observed feedback.
We also establish a lower bound of $\Omega(T^{\frac{1}{2}-\frac{1}{2d}})$ on the classic Stackelberg regret for principal-agent games, demonstrating a surprising double-exponential hardness separation from the contextual pricing problem (a.k.a. the principal-agent game with two actions), which is known to admit a near-optimal $O(d\log\log T)$ regret bound [Kleinberg and Leighton, 2003, Leme and Schneider, 2018, Liu et al., 2021]. In particular, this double-exponential hardness separation arises even in the special case of three actions and two-dimensional contexts. We attribute this significant increase in learning difficulty to a structural phenomenon that we call contextual action degeneracy: adversarially chosen contexts can make some actions strictly dominated (and hence unincentivizable), blocking the principal's ability to explore or learn about them and fundamentally limiting learning progress.
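As a rough numeric illustration of the separation between these rates (not from the paper; the choices of $T$ and $d$ below are arbitrary), one can compare how $T^{1-1/d}$, $T^{1/2-1/(2d)}$, and $d\log\log T$ grow with the horizon:

```python
import math

# Illustrative comparison of the three regret growth rates discussed above,
# for d = 2 and a few horizons T (values chosen for illustration only):
#   T**(1 - 1/d)        -- tight pessimistic Stackelberg regret bound
#   T**(1/2 - 1/(2*d))  -- classic Stackelberg regret lower bound
#   d * log(log(T))     -- contextual pricing (two-action) regret
d = 2
for T in (10**3, 10**6, 10**9):
    pessimistic = T ** (1 - 1 / d)
    stackelberg_lb = T ** (0.5 - 1 / (2 * d))
    pricing = d * math.log(math.log(T))
    print(f"T={T:>10}: T^(1-1/d)={pessimistic:12.1f}  "
          f"T^(1/2-1/(2d))={stackelberg_lb:10.1f}  d*loglogT={pricing:6.2f}")
```

Even at moderate horizons, the polynomial rates dwarf the doubly logarithmic $d\log\log T$ rate of contextual pricing, which is the sense in which the separation is dramatic.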