AI Summary
This paper studies online contract design, in which a principal learns a utility-maximizing contract through repeated interactions with agents whose types (i.e., cost and production functions) are unknown. It connects the problem to three established learning settings. For linear contracts with binary outcomes, learning is equivalent to dynamic pricing with an unknown demand curve. For identical (homogeneous) agents, an approximately optimal contract can be learned with polynomial sample complexity. For heterogeneous agents, learning the optimal contract reduces to Lipschitz bandits under mild regularity conditions. Taken together, the results suggest that the one-dimensional effort model is particularly well suited to learning-based principal-agent contract design.
Abstract
This work studies the repeated principal-agent problem from an online learning perspective. The principal's goal is to learn the optimal contract that maximizes her utility through repeated interactions, without prior knowledge of the agent's type (i.e., the agent's cost and production functions). This work contains three technical results. First, learning linear contracts with binary outcomes is equivalent to dynamic pricing with an unknown demand curve. Second, learning an approximately optimal contract with identical agents can be accomplished with a polynomial sample complexity scheme. Third, learning the optimal contract with heterogeneous agents can be reduced to Lipschitz bandits under mild regularity conditions. Together, these results suggest that the one-dimensional effort model, which is the default model for principal-agent problems in economics but seems largely ignored in recent work from the computer science community, may be the more suitable choice for studying contract design from a learning perspective.
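To make the bandit view of the first and third results concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a hypothetical agent with quadratic effort cost whose success probability equals the share `alpha` offered by a linear contract, a reward of 1 on success, and a principal who runs standard UCB1 on a uniform grid of shares. The function names, the agent model, and the grid size are all illustrative assumptions. Under these assumptions the principal's expected utility is `(1 - alpha) * success_probability(alpha)`, which has the same "margin times probability of acceptance" shape as revenue in dynamic pricing.

```python
import math
import random


def success_probability(alpha: float) -> float:
    """Hypothetical agent response to a linear contract with share alpha.

    Assumption for illustration only: effort e costs e^2 / 2 and succeeds with
    probability e, so the agent's best response to share alpha is e = alpha.
    """
    return min(max(alpha, 0.0), 1.0)


def run_discretized_ucb(num_rounds: int, num_arms: int, seed: int = 0):
    """Run UCB1 over a uniform grid of linear-contract shares in [0, 1].

    Returns the most-played share and the per-arm play counts.
    """
    rng = random.Random(seed)
    grid = [k / (num_arms - 1) for k in range(num_arms)]
    pulls = [0] * num_arms
    totals = [0.0] * num_arms

    for t in range(1, num_rounds + 1):
        if t <= num_arms:
            # Play every arm once before using confidence bounds.
            arm = t - 1
        else:
            # UCB1 index: empirical mean plus exploration bonus.
            arm = max(
                range(num_arms),
                key=lambda k: totals[k] / pulls[k]
                + math.sqrt(2.0 * math.log(t) / pulls[k]),
            )
        alpha = grid[arm]
        success = rng.random() < success_probability(alpha)
        # The principal keeps the (1 - alpha) share of a unit reward on success.
        payoff = (1.0 - alpha) if success else 0.0
        pulls[arm] += 1
        totals[arm] += payoff

    best_arm = max(range(num_arms), key=lambda k: pulls[k])
    return grid[best_arm], pulls
```

In this toy model the expected utility `alpha * (1 - alpha)` peaks at `alpha = 0.5`, so a call such as `run_discretized_ucb(20000, 5)` should concentrate its plays near the middle of the grid. A uniform grid is the simplest way to exploit Lipschitz continuity of the utility in the contract parameter. Adaptive discretization schemes exist but are beyond this sketch.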