🤖 AI Summary
This work surveys the use of GPUs in real-time systems, where unpredictable latency and deadline violations stem from non-preemptive execution, variable task execution times, and contention for shared resources. The survey reviews existing approaches along three axes: real-time-aware GPU scheduling algorithms, resource management and isolation techniques, and CPU–GPU synchronization methods. It also highlights open research directions toward GPU runtimes that combine strong timing guarantees with high computational utilization, serving time-critical applications such as machine learning inference, autonomous driving, and robotics.
📝 Abstract
In this work, we survey the role of GPUs in real-time systems. Originally designed for parallel graphics workloads, GPUs are now widely used in time-critical applications such as machine learning, autonomous vehicles, and robotics due to their high computational throughput. Their parallel architecture is well suited to accelerating complex tasks under strict timing constraints. However, their integration into real-time systems presents several challenges, including non-preemptive execution, execution time variability, and resource contention, all of which can lead to unpredictable delays and deadline violations. We examine existing solutions that address these challenges, including scheduling algorithms, resource management techniques, and synchronization methods, and highlight open research directions for improving GPU predictability and performance in real-time environments.
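To make the non-preemption problem concrete, here is a toy simulation (not from the survey; all task names and numbers are invented for illustration) of a single non-preemptive accelerator: a long low-priority kernel that is already running cannot be interrupted, so a high-priority kernel released shortly afterward is blocked and misses its deadline.

```python
# Toy model of non-preemptive GPU execution causing a deadline miss.
# All names and timing values are hypothetical, chosen only to illustrate
# the priority-inversion effect described in the abstract.
from dataclasses import dataclass

@dataclass
class Kernel:
    name: str
    release: float   # time the kernel becomes ready
    length: float    # execution time on the accelerator
    deadline: float  # absolute deadline
    priority: int    # lower number = higher priority

def simulate_non_preemptive(kernels):
    """Run kernels on one device: always pick the highest-priority ready
    kernel, but never interrupt the one currently executing."""
    time, done = 0.0, []
    pending = sorted(kernels, key=lambda k: k.release)
    while pending:
        ready = [k for k in pending if k.release <= time]
        if not ready:
            time = pending[0].release  # idle until the next release
            continue
        k = min(ready, key=lambda k: k.priority)
        time += k.length               # runs to completion, no preemption
        done.append((k.name, time, time > k.deadline))
        pending.remove(k)
    return done

# The batch job starts at t=0; the control task arrives at t=1 with a tight
# deadline but must wait for the full 10-unit batch kernel to drain.
result = simulate_non_preemptive([
    Kernel("batch_job",    release=0.0, length=10.0, deadline=100.0, priority=5),
    Kernel("control_loop", release=1.0, length=2.0,  deadline=8.0,   priority=1),
])
print(result)
# → [('batch_job', 10.0, False), ('control_loop', 12.0, True)]
```

The high-priority `control_loop` completes at t=12.0, past its deadline of 8.0, even though it needed only 2 time units, which is exactly the kind of blocking that the scheduling techniques surveyed here aim to bound or eliminate.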