Coordinated Power Management on Heterogeneous Systems

📅 2025-08-11
🤖 AI Summary
Traditional performance modeling for heterogeneous CPU-GPU systems faces a very large configuration search space and high profiling overhead, which limits practical deployment. To address this, the paper proposes OPEN, a framework that combines offline coarse-grained modeling with lightweight online sampling to build an initial performance matrix, then applies collaborative filtering to transfer knowledge across configurations and across applications and to support dynamic optimization. Evaluated on A100 and A30 heterogeneous platforms, OPEN achieves up to 98.29% prediction accuracy while reducing profiling overhead by over an order of magnitude compared to exhaustive profiling. It maintains this accuracy while making deployment cheaper and enabling real-time runtime decisions under power constraints. The approach offers a scalable, low-overhead way to predict performance on large-scale heterogeneous systems.

📝 Abstract
Performance prediction is essential for energy-efficient computing in heterogeneous systems that integrate CPUs and GPUs. However, traditional performance modeling methods often rely on exhaustive offline profiling, which becomes impractical because the configuration space is large and profiling large-scale applications is costly. In this paper, we present OPEN, a framework consisting of an offline phase and an online phase. The offline phase builds a performance predictor and constructs an initial dense matrix. In the online phase, OPEN performs lightweight profiling and combines the performance predictor with collaborative filtering to predict performance. We evaluate OPEN on multiple heterogeneous systems, including those equipped with A100 and A30 GPUs. Results show that OPEN achieves prediction accuracy of up to 98.29%, which demonstrates that it reduces profiling cost while maintaining high accuracy. Overall, OPEN provides a lightweight solution for performance prediction under power constraints, making it practical for power-aware performance modeling and runtime decisions in modern HPC environments.
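To make the online phase more concrete, the following is a minimal sketch of how collaborative filtering can fill in a new application's performance from a few samples. It assumes a simple neighborhood-based filter with inverse-distance weights; the abstract does not say which collaborative filtering variant OPEN uses, so this is an illustration rather than the authors' method. The function `predict_row`, the variable names, and all numbers are hypothetical.

```python
import math

def predict_row(dense, samples, k=2, eps=1e-9):
    """Estimate a new application's performance at every configuration.

    dense   -- rows of normalized performance, one row per application
               profiled offline, one column per configuration (assumed layout)
    samples -- {column index: measured value} from lightweight online profiling
    k       -- number of most similar offline applications to average over
    """
    sampled_cols = sorted(samples)
    scored = []
    for row in dense:
        # Compare applications only on the configurations actually sampled.
        dist = math.sqrt(sum((row[c] - samples[c]) ** 2 for c in sampled_cols))
        scored.append((1.0 / (dist + eps), row))  # closer rows weigh more

    # Keep the k nearest offline applications.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    neighbors = scored[:k]
    total_weight = sum(weight for weight, _ in neighbors)

    predicted = []
    for c in range(len(dense[0])):
        if c in samples:
            predicted.append(samples[c])  # trust measured values as-is
        else:
            weighted = sum(weight * row[c] for weight, row in neighbors)
            predicted.append(weighted / total_weight)
    return predicted

# Hypothetical dense matrix: 3 offline applications x 4 configurations.
dense = [
    [0.50, 0.70, 0.90, 1.00],
    [0.80, 0.90, 0.95, 1.00],
    [0.52, 0.72, 0.88, 1.00],
]
# The new application is profiled at only two of the four configurations.
samples = {0: 0.50, 3: 1.00}
predicted = predict_row(dense, samples, k=2)
print([round(value, 3) for value in predicted])
```

The two unsampled configurations are filled in from the offline applications whose sampled behavior is closest, which is why only a handful of online measurements are needed per new application.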
Problem

Research questions and friction points this paper is trying to address.

Predict performance in CPU-GPU heterogeneous systems efficiently
Reduce costly offline profiling for large-scale applications
Enable power-aware performance modeling with high accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines offline and online profiling phases
Uses performance predictor with collaborative filtering
Achieves high accuracy with reduced profiling cost