Exploiting Dependency and Parallelism: Real-Time Scheduling and Analysis for GPU Tasks

📅 2026-02-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the unpredictability of GPU kernel execution times caused by inter-kernel data dependencies and resource contention, which undermines real-time guarantees. Focusing on DAG-structured GPU tasks, the paper proposes a scheduling approach that makes no assumptions about kernel priorities. By explicitly modeling kernel dependencies and co-scheduling kernel-level parallelism, the method derives safe yet tight worst-case execution time (WCET) bounds. The approach is built on the standard CUDA API and combines DAG task modeling, parallel kernel scheduling, and response-time analysis, requiring no additional hardware or software support. Experimental results on both synthetic and real-world benchmarks show that the method reduces the worst-case makespan bound and the measured execution time by up to 32.8% and 21.3%, respectively, compared to existing techniques.
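The co-scheduling idea described above — dispatching each kernel as soon as its DAG predecessors finish, up to the available parallel capacity — can be illustrated with a generic work-conserving list-schedule simulation. This is only a sketch, not the paper's algorithm: the 4-kernel diamond DAG, the WCET values, and the unit count `m` are all hypothetical.

```python
import heapq

def simulate_makespan(wcet, preds, m):
    """Event-driven simulation of a greedy (work-conserving) list
    schedule: a kernel starts as soon as all of its DAG predecessors
    have finished and one of the m parallel units is free."""
    succ = {k: [] for k in wcet}
    indeg = {}
    for k, ps in preds.items():
        indeg[k] = len(ps)
        for p in ps:
            succ[p].append(k)
    ready = [k for k in wcet if indeg[k] == 0]
    running = []                  # min-heap of (finish_time, kernel)
    now, free, makespan = 0.0, m, 0.0
    while ready or running:
        while ready and free:     # dispatch onto idle units
            k = ready.pop()
            heapq.heappush(running, (now + wcet[k], k))
            free -= 1
        t, k = heapq.heappop(running)   # next completion event
        now, free, makespan = t, free + 1, max(makespan, t)
        for s in succ[k]:         # release successors whose preds are done
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    return makespan

# Hypothetical diamond DAG: A precedes B and C, which both precede D.
wcet = {"A": 2, "B": 3, "C": 1, "D": 2}
preds = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
print(simulate_makespan(wcet, preds, 2))  # → 7.0
```

On 2 units the diamond finishes in 7 time units, the critical-path length A→B→D; the short kernel C overlaps with B, which is exactly the kind of kernel-level parallelism the paper exploits.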

📝 Abstract
With the rapid advancement of Artificial Intelligence, the Graphics Processing Unit (GPU) has become increasingly essential across a growing number of safety-critical application domains. GPUs are indispensable for parallel computing; however, the complex data dependencies and resource contention among the kernels within a GPU task can delay its execution time unpredictably. To address these problems, this paper presents a scheduling and analysis method for Directed Acyclic Graph (DAG)-structured GPU tasks. Given a DAG representation, the proposed scheduler scales kernel-level parallelism and establishes inter-kernel dependencies to provide a reduced and predictable DAG response time. The corresponding timing analysis yields a safe yet nonpessimistic makespan bound without any assumption on kernel priorities. The method is implemented using the standard CUDA API, requiring no additional software or hardware support. Experimental results on synthetic and real-world benchmarks demonstrate that the proposed approach reduces the worst-case makespan and the measured task execution time by up to 32.8% and 21.3%, respectively, compared to existing methods.
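For context on what a "safe yet nonpessimistic makespan bound" means: a classical baseline for any work-conserving schedule of a DAG on m parallel units is Graham's bound, makespan ≤ L + (W − L)/m, where W is the total work and L is the critical-path length. The sketch below computes this baseline bound; the paper's analysis is tighter than this, and the example DAG and values are hypothetical.

```python
from functools import lru_cache

def graham_bound(wcet, preds, m):
    """Classical Graham-style bound on the makespan of ANY
    work-conserving schedule of a DAG on m parallel units:
        makespan <= L + (W - L) / m
    where W = total work and L = critical-path length."""
    @lru_cache(maxsize=None)
    def longest_to(k):
        # Length of the longest dependency chain ending at kernel k.
        return wcet[k] + max((longest_to(p) for p in preds[k]), default=0)
    L = max(longest_to(k) for k in wcet)
    W = sum(wcet.values())
    return L + (W - L) / m

# Hypothetical diamond DAG: A precedes B and C, which both precede D.
wcet = {"A": 2, "B": 3, "C": 1, "D": 2}
preds = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
print(graham_bound(wcet, preds, 2))  # → 7.5  (L = 7, W = 8)
```

The bound is safe for any work-conserving scheduler but can be pessimistic; reducing that gap without priority assumptions is the kind of improvement the paper targets.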
Problem

Research questions and friction points this paper is trying to address.

GPU tasks, real-time scheduling, data dependencies, resource contention, DAG
Innovation

Methods, ideas, or system contributions that make the work stand out.

GPU scheduling, DAG tasks, real-time analysis, kernel-level parallelism, makespan bound