No-Regret Learning Under Adversarial Resource Constraints: A Spending Plan Is All You Need!

📅 2025-06-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper studies online decision-making under resource constraints where rewards and costs evolve adversarially over time, a setting in which standard regret bounds fail to guarantee sublinear growth. To address this, the authors propose a learning paradigm guided by a *prescribed spending plan* as the benchmark, the first of its kind. Their method leverages a primal-dual framework with three key components: (i) budget-balance awareness, (ii) robust perturbation compensation, and (iii) dynamic step-size adaptation. Under both full-information and bandit-feedback settings, the algorithm achieves $O(\sqrt{T})$ regret relative to the plan-following benchmark. Theoretical analysis establishes robustness under highly imbalanced budget allocations, and empirical evaluation demonstrates substantial improvements over plan-agnostic baselines. The core contribution is the formalization of plan-driven regulation mechanisms and a systematic characterization of regret bounds when benchmarks deviate from the prescribed spending plan.

📝 Abstract
We study online decision making problems under resource constraints, where both reward and cost functions are drawn from distributions that may change adversarially over time. We focus on two canonical settings: $(i)$ online resource allocation where rewards and costs are observed before action selection, and $(ii)$ online learning with resource constraints where they are observed after action selection, under full feedback or bandit feedback. It is well known that achieving sublinear regret in these settings is impossible when reward and cost distributions may change arbitrarily over time. To address this challenge, we analyze a framework in which the learner is guided by a spending plan--a sequence prescribing expected resource usage across rounds. We design general (primal-)dual methods that achieve sublinear regret with respect to baselines that follow the spending plan. Crucially, the performance of our algorithms improves when the spending plan ensures a well-balanced distribution of the budget across rounds. We additionally provide a robust variant of our methods to handle worst-case scenarios where the spending plan is highly imbalanced. To conclude, we study the regret of our algorithms when competing against benchmarks that deviate from the prescribed spending plan.
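The abstract describes primal-dual methods in which the spending plan prescribes expected resource usage per round. A minimal sketch of that idea, in the allocation setting where rewards and costs are observed before acting: the primal step picks the action maximizing the Lagrangian reward minus priced cost, and the dual step adjusts the price based on deviation from the plan. All specifics below (the uniform plan `rho`, the step size `eta`, the synthetic reward/cost draws) are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 1000, 5           # rounds, number of actions
B = 200.0                # total resource budget
rho = np.full(T, B / T)  # spending plan: uniform per-round target (hypothetical choice)
eta = 1.0 / np.sqrt(T)   # dual step size

lam = 0.0                # dual variable pricing the resource
budget = B
total_reward = 0.0

for t in range(T):
    r = rng.random(K)    # rewards, observed before action selection
    c = rng.random(K)    # per-action costs
    # Primal step: best action under the Lagrangian r - lam * c.
    x = int(np.argmax(r - lam * c))
    if budget >= c[x]:   # act only while the budget allows it
        total_reward += r[x]
        budget -= c[x]
        spend = c[x]
    else:
        spend = 0.0
    # Dual step: raise the price when spending exceeds the plan rho[t],
    # lower it (down to zero) when spending falls short.
    lam = max(0.0, lam + eta * (spend - rho[t]))

print(round(total_reward, 2), round(budget, 2))
```

A well-balanced plan keeps the per-round targets `rho[t]` comparable across rounds, which is the regime the abstract says improves performance; a highly imbalanced plan would concentrate `rho` in a few rounds, the worst case the robust variant is built for.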
Problem

Research questions and friction points this paper is trying to address.

Study online decision making under adversarial resource constraints
Design algorithms achieving sublinear regret with spending plans
Handle worst-case scenarios with imbalanced spending plans
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spending plan guides resource allocation
Primal-dual methods ensure sublinear regret
Robust variant handles imbalanced spending plans