Dueling over Multiple Pieces of Dessert

📅 2026-02-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates how a cake-cutting agent (Alice) can minimize regret relative to the Stackelberg value in repeated cake-cutting games when the opponent's (Bob's) preferences are unknown. Combining online learning theory, Stackelberg game dynamics, and the Robertson–Webb query model, the work analyzes the regret achievable under a limited number of cuts and varying levels of strategic sophistication from Bob. The main contributions are: proving that strongly sublinear regret is unattainable under arbitrary measurable partitions, with a lower bound of Ω(T / log²T) even against a myopic Bob; establishing, when Bob's learning rate is public, a hierarchy of polynomial regret bounds governed jointly by the number of cuts k and Bob's regret budget; and revealing a fundamental trade-off when the learning rate is private, where Alice can universally guarantee O(T / log T) regret, but any attempt to secure a polynomial rate O(T^β) with β < 1 risks strictly linear regret against some Bob.

📝 Abstract
We study the dynamics of repeated fair division between two players, Alice and Bob, where Alice partitions a cake into two subsets and Bob chooses his preferred one over $T$ rounds. Alice aims to minimize her regret relative to the Stackelberg value -- the maximum utility she could achieve if she knew Bob's private valuation. We show that if Alice uses arbitrary measurable partitions, achieving strongly sublinear regret is impossible; she suffers a regret of $\Omega\Bigl(\frac{T}{\log^2 T}\Bigr)$ regret even against a myopic Bob. However, when Alice uses at most $k$ cuts, the learning landscape becomes tractable. We analyze Alice's performance based on her knowledge of Bob's strategic sophistication (his regret budget). When Bob's learning rate is public, we establish a hierarchy of polynomial regret bounds determined by $k$ and Bob's regret budget. In contrast, when this learning rate is private, Alice can universally guarantee $O\Bigl(\frac{T}{\log T}\Bigr)$ regret, but any attempt to secure a polynomial rate $O(T^\beta)$ (for $\beta<1$) leaves her vulnerable to incurring strictly linear regret against some Bob. Finally, as a corollary of our online learning dynamics, we characterize the randomized query complexity of finding approximate Stackelberg allocations with a constant number of cuts in the Robertson-Webb model.
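The repeated cut-and-choose dynamic described in the abstract can be illustrated with a toy simulation. This is a minimal sketch for the k = 1 (single-cut) case, not the paper's algorithm: the uniform valuation densities, the unknown parameter `b`, the grid-based explore-then-commit strategy, and the `simulate` helper are all assumptions made here for concreteness.

```python
import random

def simulate(T=1000, seed=0):
    """Toy version of the repeated game: Alice makes a single cut at x,
    a myopic Bob takes the piece he values more, Alice keeps the other."""
    rng = random.Random(seed)
    # Hypothetical valuations (not from the paper): Alice's density is uniform
    # on [0, 1]; Bob's private density is uniform on [0, b] for an unknown b.
    b = rng.uniform(0.3, 0.9)

    def alice_val(lo, hi):
        return hi - lo

    def bob_val(lo, hi):
        return (min(hi, b) - min(lo, b)) / b

    def alice_payoff(x):
        # Myopic Bob takes [0, x] iff he values it at least as much as [x, 1].
        bob_takes_left = bob_val(0, x) >= bob_val(x, 1)
        return alice_val(x, 1) if bob_takes_left else alice_val(0, x)

    # Stackelberg value: the best single-cut payoff Alice could get if she
    # knew b (a grid search stands in for the analytic optimum near x = b/2).
    stackelberg = max(alice_payoff(i / 1000) for i in range(1, 1000))

    # Online play: a naive explore-then-commit sweep over a coarse grid of cut
    # points, accumulating regret against the Stackelberg value.
    regret, best_seen, best_x = 0.0, -1.0, 0.5
    for t in range(T):
        x = (t + 1) / 100 if t < 99 else best_x  # explore 99 rounds, then commit
        u = alice_payoff(x)
        if u > best_seen:
            best_seen, best_x = u, x
        regret += stackelberg - u
    return regret, stackelberg

reg, st = simulate()
print(f"Stackelberg value ~ {st:.3f}, cumulative regret ~ {reg:.1f}")
```

The coarse commit grid leaves a small per-round gap to the Stackelberg value, so cumulative regret keeps growing linearly (slowly) after exploration, which is in the spirit of the paper's point that the achievable rate depends on how finely Alice can learn Bob's private valuation.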

Problem

Research questions and friction points this paper is trying to address.

fair division
regret minimization
Stackelberg equilibrium
online learning
cake cutting
Innovation

Methods, ideas, or system contributions that make the work stand out.

online learning
fair division
Stackelberg equilibrium
regret minimization
query complexity