🤖 AI Summary
This paper investigates online learning in repeated karma auctions, where karma, an artificial credit with no intrinsic monetary value, is dynamically spent (via bidding) and earned (via yielding), so that users must jointly learn the currency's latent value and an optimal bidding strategy. Unlike conventional monetary auctions, which assume an exogenous, fixed valuation of the currency, the authors propose an adaptive karma pacing strategy that endogenizes the currency's value and accounts for the dynamic redistribution of karma payments. Methodologically, the paper combines stochastic approximation theory with a game-theoretic analysis of learning dynamics to establish the asymptotic optimality of the individual strategy, the convergence of the multi-user learning process, and an approximate Nash equilibrium in large populations. Experiments demonstrate that the mechanism ensures both individual rationality and system stability under heterogeneous user preferences and behaviors.
📝 Abstract
Recent years have seen a surge of artificial currency-based mechanisms in contexts where monetary instruments are deemed unfair or inappropriate, e.g., in allocating food donations to food banks, course seats to students, and, more recently, even for traffic congestion management. Yet the applicability of these mechanisms remains limited in repeated auction settings, as it is challenging for users to learn how to bid with an artificial currency that has no value outside the auctions. Indeed, users must jointly learn the value of the currency and how to spend it optimally. Moreover, in the prominent class of karma mechanisms, in which artificial karma payments are redistributed to users at each time step, users not only spend karma to obtain public resources but also gain karma for yielding them. For this novel class of karma auctions, we propose an adaptive karma pacing strategy that learns to bid optimally, and show that this strategy a) is asymptotically optimal for a single user bidding against competing bids drawn from a stationary distribution; b) leads to convergent learning dynamics when all users adopt it; and c) constitutes an approximate Nash equilibrium as the number of users grows. Our results require a novel analysis in comparison to adaptive pacing strategies in monetary auctions, since we depart from the classical assumption that the currency has known value outside the auctions, and consider that the currency is both spent and gained through the redistribution of payments.
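To build intuition for the setting the abstract describes, the sketch below simulates a single learner in a repeated karma auction against stationary competing bids. Everything here is a hypothetical illustration, not the paper's algorithm: the shading multiplier `mu`, its update rule, and the toy redistribution scheme (the loser receives the competing bid as a yield payment) are all assumptions made for the example. The only feature carried over from the text is that karma is both spent when winning and gained when yielding, so the learner must balance the two.

```python
import random

def simulate_karma_pacing(rounds=5000, step=0.05, seed=0):
    """Toy repeated karma auction with one adaptive learner.

    Hypothetical pacing sketch: shade the bid by a multiplier mu,
    and adapt mu so that karma spent roughly balances karma earned.
    """
    rng = random.Random(seed)
    karma, mu = 50.0, 1.0
    for _ in range(rounds):
        value = rng.random()       # private value for the resource this round
        competing = rng.random()   # competing bid from a stationary distribution
        # Shaded bid, capped so the karma balance never goes negative.
        bid = min(value / (1.0 + mu), karma)
        if bid > competing:
            karma -= bid           # win: spend karma to obtain the resource
            drift = bid            # net karma outflow this round
        else:
            karma += competing     # yield: gain redistributed karma (toy rule)
            drift = -competing     # net karma inflow this round
        # Spend more than you earn -> raise mu (bid less), and vice versa.
        mu = max(0.0, mu + step * drift)
    return karma, mu

karma, mu = simulate_karma_pacing()
```

Under this toy update, `mu` drifts toward the point where expected spending matches expected earnings, so the karma balance stabilizes rather than draining to zero or growing without bound; the paper's analysis of convergence and equilibrium concerns the much harder multi-user version of this balancing problem.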