A Stackelberg Game Framework with Drainability Guardrails for Pricing and Scaling in Multi-Tenant GPU Cloud Platforms

📅 2026-04-17

🤖 AI Summary

This work addresses the trade-off among latency service-level objectives (SLOs), spare-capacity cost, and the endogenous, price-induced workload dynamics of multi-tenant GPU cloud platforms. The authors formulate joint pricing and scaling as a large-population (mean-field) Stackelberg game and derive an explicit equilibrium demand map. The closed-loop model exposes a structural failure mode in which delay-insensitive workloads sustain a residual demand floor, so the backlog cannot be drained under bounded price and service capacity. To rule this out, they propose a computable drainability guardrail that certifies uniformly negative drift in the residual-demand regime. For any fixed price-capacity pair satisfying the guardrail, they prove a unique operating point and global convergence to it under a checkable step-size condition. They also build an optimizer-agnostic action shield for the full dynamic problem and report experiments in which it improves the safety and robustness of model-free reinforcement learning (RL) policies.

📝 Abstract

Modern Graphics Processing Unit (GPU)-backed services must satisfy strict latency service-level objectives (SLOs) while controlling spare-capacity cost. In multi-tenant GPU cloud platforms, this trade-off is inherently dynamic because workload demand is endogenous: pricing shapes the submissions of heterogeneous tenants, which in turn affect congestion and delay. We formulate the joint pricing-and-scaling problem as a large-population Stackelberg game and derive an explicit equilibrium demand map. The resulting closed-loop model reveals a structural failure mode in which delay-insensitive workloads sustain a residual demand floor, making the backlog undrainable under bounded price and service capacity. This observation motivates a computable drainability guardrail that certifies uniformly negative drift in the residual-demand regime. For any fixed price-capacity pair satisfying the drainability guardrail, we establish a unique operating point and global convergence towards it under a checkable step-size condition. Building on this fixed-pair analysis, we further develop an optimizer-agnostic action shield for the full dynamic problem and show empirically that it improves safety and robustness for model-free reinforcement learning (RL) in this setting.
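
To make the guardrail-plus-shield idea concrete, here is a minimal Python sketch. It is not the paper's method: the abstract does not state the guardrail condition, so the linear demand floor in `residual_demand_floor`, the drift test in `is_drainable`, and every parameter (`base_rate`, `price_sensitivity`, `service_rate`, `margin`) are assumptions chosen only for illustration. The sketch shows the general pattern of filtering candidate price-capacity actions before any optimizer or RL policy picks among them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    price: float     # posted price (hypothetical units)
    capacity: float  # provisioned service capacity, e.g. number of GPUs


def residual_demand_floor(price, base_rate=8.0, price_sensitivity=0.5):
    """Assumed arrival rate of delay-insensitive work that persists at this price."""
    return max(0.0, base_rate - price_sensitivity * price)


def is_drainable(action, service_rate=1.0, margin=0.5):
    """Stand-in guardrail: backlog drift must be at most -margin.

    Drift here is (residual arrivals) - (service rate * capacity); the
    paper's actual certificate may differ.
    """
    drift = residual_demand_floor(action.price) - service_rate * action.capacity
    return drift <= -margin


def shield(candidates, fallback):
    """Optimizer-agnostic mask: keep drainable actions, else use the fallback."""
    safe = [a for a in candidates if is_drainable(a)]
    return safe if safe else [fallback]


if __name__ == "__main__":
    proposals = [Action(price=4.0, capacity=6.0), Action(price=4.0, capacity=8.0)]
    # The fallback is assumed to be a known-safe action (high price, high capacity).
    print(shield(proposals, fallback=Action(price=16.0, capacity=8.0)))
```

Because the mask acts only on the candidate set, the same `shield` call can sit in front of a grid search, a model-based planner, or a model-free RL policy, which is the sense in which such a mechanism is optimizer-agnostic.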

Problem

Research questions and friction points this paper is trying to address.

multi-tenant GPU cloud

service-level objectives

drainability

pricing and scaling

backlog undrainability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Stackelberg game

drainability guardrail

multi-tenant GPU cloud