Bi-Level Online Provisioning and Scheduling with Switching Costs and Cross-Level Constraints

📅 2026-01-26
🤖 AI Summary
This work addresses the challenge of coupling slow resource provisioning with fast scheduling decisions in network resource allocation, where the former incurs switching costs and the latter must satisfy budget constraints that change with each provisioning decision. To this end, the authors propose a bi-level online learning framework that integrates Online Convex Optimization (OCO) with Constrained Markov Decision Processes (CMDPs). The upper level performs budget allocation via OCO with switching costs, while the lower level executes state-dependent safe scheduling based on a CMDP. A dual feedback mechanism returns the budget multiplier to the upper level as sensitivity information, which enforces the cross-level constraint coupling. A budget-adaptive safe exploration strategy handles the time-varying constraint threshold. Theoretical analysis shows that the proposed method achieves near-optimal regret while satisfying the cross-level constraints with high probability, giving guarantees on both performance and feasibility.

📝 Abstract
We study a bi-level online provisioning and scheduling problem motivated by network resource allocation, where provisioning decisions are made at a slow time scale while queue-/state-dependent scheduling is performed at a fast time scale. We model this two-time-scale interaction using an upper-level online convex optimization (OCO) problem and a lower-level constrained Markov decision process (CMDP). Existing OCO typically assumes stateless decisions and thus cannot capture MDP network dynamics such as queue evolution. Meanwhile, CMDP algorithms typically assume a fixed constraint threshold, whereas in provisioning-and-scheduling systems, the threshold varies with online budget decisions. To address these gaps, we study bi-level OCO-CMDP learning under switching costs (budget reprovisioning/system reconfiguration) and cross-level constraints that couple budgets to scheduling decisions. Our new algorithm solves this learning problem via several non-trivial developments, including a carefully designed dual feedback that returns the budget multiplier as sensitivity information for the upper-level update and a lower level that solves a budget-adaptive safe exploration problem via an extended occupancy-measure linear program. We establish near-optimal regret and high-probability satisfaction of the cross-level constraints.
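The dual-feedback idea in the abstract can be illustrated with a deliberately small toy. This is a sketch under strong assumptions, not the paper's algorithm: the lower level is reduced to a single-state problem with two actions (so its occupancy-measure linear program has a closed-form solution), the upper level is plain projected online gradient descent, and switching costs are kept small only through a small step size. All names (`lower_level`, `run`, `c_idle`, `c_serve`, `use_serve`, `eta`, `beta`) and all numeric values are hypothetical.

```python
def lower_level(budget, c_idle=1.0, c_serve=0.2, use_serve=1.0):
    """Toy single-state stand-in for the lower-level CMDP (assumption).

    Minimise  c_idle * (1 - p) + c_serve * p  over p in [0, 1]
    subject to the budget constraint  p * use_serve <= budget.
    Returns (p, expected_cost, multiplier), where the multiplier is the
    dual variable of the budget constraint.
    """
    p = min(1.0, max(0.0, budget / use_serve))
    expected_cost = c_idle - (c_idle - c_serve) * p
    # While the constraint binds, one extra unit of budget lowers the cost
    # by (c_idle - c_serve) / use_serve; once it is slack the value is 0.
    multiplier = (c_idle - c_serve) / use_serve if budget < use_serve else 0.0
    return p, expected_cost, multiplier


def run(prices, eta=0.05, beta=0.5, b0=0.0, b_max=2.0):
    """Upper level: projected online gradient descent on the budget.

    Per-round cost is price * budget + scheduling cost, plus a switching
    cost beta * |b_next - b| for changing the budget (hypothetical model).
    """
    budget, total, history = b0, 0.0, []
    for price in prices:
        history.append(budget)
        _, sched_cost, lam = lower_level(budget)
        total += price * budget + sched_cost
        # Dual feedback: the scheduling cost has slope -lam in the budget,
        # so the upper-level gradient is price - lam.
        grad = price - lam
        new_budget = min(b_max, max(0.0, budget - eta * grad))
        total += beta * abs(new_budget - budget)
        budget = new_budget
    return history, total


if __name__ == "__main__":
    budgets, cost = run([0.3] * 200)
    print(f"final budget: {budgets[-1]:.3f}, total cost: {cost:.2f}")
```

In this toy, the budget climbs while the multiplier exceeds the provisioning price and drifts back once the constraint goes slack, so it settles near the point where extra budget stops paying for itself. Each move is at most `eta` times the gradient magnitude, which is how the small step size keeps the accumulated switching cost bounded.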
Problem

Research questions and friction points this paper is trying to address.

bi-level online optimization
switching costs
cross-level constraints
resource provisioning
constrained Markov decision process
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bi-level Online Learning
Online Convex Optimization
Constrained MDP
Switching Costs
Cross-level Constraints
Jialei Liu
Dept. of Electrical Engineering, University at Buffalo, Buffalo, NY
C. E. Koksal
Dept. of Electrical and Computer Engineering, The Ohio State University, Columbus, OH
Ming Shi
Assistant Professor, The State University of New York at Buffalo
Learning Theory, Online Optimization, Networking, Security