PRISM: Dynamic Primitive-Based Forecasting for Large-Scale GPU Cluster Workloads

📅 2026-03-26
🤖 AI Summary
GPU workloads are highly volatile, periodic at multiple scales, and heterogeneous, which challenges traditional prediction methods and limits the scheduling efficiency of AI infrastructure. This work proposes PRISM, a framework that combines primitive composition modeling with adaptive spectral refinement. Using dictionary-driven time-series decomposition, PRISM extracts stable, interpretable multi-scale features that capture both the periodic patterns and the bursty behavior of complex GPU workloads. The result is an architecture-aware dynamic forecasting system that improves prediction accuracy on large-scale production traces, particularly by reducing errors during burst phases, and so supports more responsive and efficient resource scheduling in GPU clusters.

📝 Abstract
Accurately forecasting GPU workloads is essential for AI infrastructure, enabling efficient scheduling, resource allocation, and power management. Modern workloads are highly volatile, multi-periodic, and heterogeneous, making them challenging for traditional predictors. We propose PRISM, a primitive-based compositional forecasting framework that combines dictionary-driven temporal decomposition with adaptive spectral refinement. This dual representation extracts stable, interpretable workload signatures across diverse GPU jobs. Evaluated on large-scale production traces, PRISM achieves state-of-the-art results. It significantly reduces burst-phase errors, providing a robust, architecture-aware foundation for dynamic resource management in GPU-powered AI platforms.
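The abstract does not specify how the decomposition or the refinement is implemented, so the following is only a rough sketch of the general idea, not PRISM's algorithm. It assumes greedy matching pursuit as the dictionary-driven decomposition and a single dominant DFT bin as a stand-in for spectral refinement of the residual. The primitive set (sinusoids plus boxcar "burst" atoms), the synthetic trace, and every function name are hypothetical.

```python
import cmath
import math


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def energy(v):
    return dot(v, v)


def build_dictionary(n, periods, burst_width):
    """Hypothetical primitives: sin/cos atoms per period plus shifted boxcar bursts."""
    atoms = {}
    for p in periods:
        atoms[f"sin_{p}"] = normalize([math.sin(2 * math.pi * t / p) for t in range(n)])
        atoms[f"cos_{p}"] = normalize([math.cos(2 * math.pi * t / p) for t in range(n)])
    for start in range(0, n - burst_width + 1, burst_width):
        atoms[f"burst_{start}"] = normalize(
            [1.0 if start <= t < start + burst_width else 0.0 for t in range(n)]
        )
    return atoms


def matching_pursuit(signal, atoms, n_iter):
    """Greedy decomposition: repeatedly subtract the best-correlated atom."""
    residual = list(signal)
    coeffs = {}
    for _ in range(n_iter):
        name, atom = max(atoms.items(), key=lambda kv: abs(dot(residual, kv[1])))
        c = dot(residual, atom)
        coeffs[name] = coeffs.get(name, 0.0) + c
        residual = [r - c * a for r, a in zip(residual, atom)]
    return coeffs, residual


def spectral_refine(residual):
    """Return the strongest non-DC sinusoid in the residual, found via a naive DFT."""
    n = len(residual)
    best_k, best_x = 1, 0j
    for k in range(1, (n + 1) // 2):  # skip the DC and Nyquist bins
        x = sum(residual[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        if abs(x) > abs(best_x):
            best_k, best_x = k, x
    # Bins k and n-k together form one real sinusoid with scale 2/n.
    return [
        (2.0 / n) * (best_x * cmath.exp(2j * math.pi * best_k * t / n)).real
        for t in range(n)
    ]


def make_demo_signal(n):
    """Synthetic utilization-like trace (assumed for illustration, not from the paper)."""
    return [
        3.0 * math.sin(2 * math.pi * t / 24)   # slow cycle, in the dictionary
        + 1.5 * math.cos(2 * math.pi * t / 8)  # fast cycle, in the dictionary
        + 0.8 * math.sin(2 * math.pi * 5 * t / n)  # cycle missing from the dictionary
        + (5.0 if 32 <= t < 40 else 0.0)       # one burst
        for t in range(n)
    ]


signal = make_demo_signal(96)
atoms = build_dictionary(96, periods=[24, 8], burst_width=8)
coeffs, residual = matching_pursuit(signal, atoms, n_iter=6)
correction = spectral_refine(residual)
refined = [r - c for r, c in zip(residual, correction)]

print("largest primitives:", sorted(coeffs, key=lambda k: -abs(coeffs[k]))[:3])
print("energy: signal, after dictionary, after spectral step:")
print(energy(signal), energy(residual), energy(refined))
```

In this toy setup the dictionary accounts for the known cycles and the burst, while the spectral step picks up a periodic component that no primitive covers. A forecasting system would additionally extrapolate the fitted components beyond the observed window, which is omitted here.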
Problem

Research questions and friction points this paper is trying to address.

GPU workload forecasting
large-scale AI infrastructure
workload volatility
heterogeneous workloads
multiple periodicity
Innovation

Methods, ideas, or system contributions that make the work stand out.

primitive-based forecasting
temporal decomposition
spectral refinement
GPU workload prediction
dynamic resource management