A Stackelberg Game Framework with Drainability Guardrails for Pricing and Scaling in Multi-Tenant GPU Cloud Platforms

📅 2026-04-17

🤖 AI Summary
This work addresses the trade-off between latency service-level objectives (SLOs) and spare-capacity cost in multi-tenant GPU cloud platforms, where workload demand is endogenous because pricing shapes tenant submissions. The authors formulate joint pricing and scaling as a large-population Stackelberg game and derive an explicit equilibrium demand map. The closed-loop model exposes a structural failure mode: delay-insensitive workloads sustain a residual demand floor, so the backlog cannot be drained under bounded price and service capacity. To address this, they propose a computable drainability guardrail that certifies uniformly negative drift in the residual-demand regime, together with an optimizer-agnostic action shield. For any fixed price-capacity pair satisfying the guardrail, they establish a unique operating point and global convergence to it under a checkable step-size condition. Experiments indicate that the shield improves the safety and robustness of model-free reinforcement learning (RL) policies in this setting.


📝 Abstract
Modern Graphics Processing Unit (GPU)-backed services must satisfy strict latency service-level objectives (SLOs) while controlling spare-capacity cost. In multi-tenant GPU cloud platforms, this trade-off is inherently dynamic because workload demand is endogenous: pricing shapes the submissions of heterogeneous tenants, which in turn affect congestion and delay. We formulate the joint pricing-and-scaling problem as a large-population Stackelberg game and derive an explicit equilibrium demand map. The resulting closed-loop model reveals a structural failure mode in which delay-insensitive workloads sustain a residual demand floor, making the backlog undrainable under bounded price and service capacity. This observation motivates a computable drainability guardrail that certifies uniformly negative drift in the residual-demand regime. For any fixed price-capacity pair satisfying the drainability guardrail, we establish a unique operating point and global convergence towards it under a checkable step-size condition. Building on this fixed-pair analysis, we further develop an optimizer-agnostic action shield for the full dynamic problem and show empirically that it improves safety and robustness for model-free reinforcement learning (RL) in this setting.
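
The guardrail-plus-shield idea can be illustrated with a toy sketch. This is not the paper's formulation: the demand model, the `margin` parameter, the action grid, and all function names below are hypothetical assumptions chosen only to show how a drainability check might mask unsafe price-capacity actions before any optimizer or RL policy acts on them.

```python
from itertools import product


def residual_demand_floor(price, base_rate=4.0, price_sensitivity=0.5, floor=1.0):
    """Hypothetical residual demand from delay-insensitive tenants.

    Assumption: demand falls linearly with price but never drops below `floor`.
    """
    return max(floor, base_rate - price_sensitivity * price)


def is_drainable(price, capacity, margin=0.5):
    """Toy guardrail: capacity must exceed residual demand by at least `margin`.

    Under this assumed model, backlog drift is at most -margin for safe pairs.
    """
    return capacity - residual_demand_floor(price) >= margin


def shield(action, safe_actions):
    """Pass a safe action through; otherwise return the nearest safe action."""
    if action in safe_actions:
        return action
    return min(
        safe_actions,
        key=lambda a: (a[0] - action[0]) ** 2 + (a[1] - action[1]) ** 2,
    )


# Hypothetical discrete action grid of (price, capacity) pairs.
prices = [0.0, 2.0, 4.0, 6.0]
capacities = [1.0, 2.0, 3.0, 4.0]
safe = [(p, c) for p, c in product(prices, capacities) if is_drainable(p, c)]

# An unsafe proposal (low price, low capacity) is replaced by a safe pair.
proposed = (0.0, 1.0)
executed = shield(proposed, safe)
print(f"proposed={proposed}, executed={executed}, safe actions={len(safe)}")
```

Because the shield only filters proposed actions, the same wrapper could sit in front of any policy, which is the sense in which such a mechanism is optimizer-agnostic.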
Problem

Research questions and friction points this paper is trying to address.

multi-tenant GPU cloud
service-level objectives
drainability
pricing and scaling
backlog undrainability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Stackelberg game
drainability guardrail
multi-tenant GPU cloud
pricing and scaling
reinforcement learning safety