🤖 AI Summary
Problem: Memory contention among mixed-criticality tasks on memory-constrained edge devices degrades both real-time performance and throughput. Method: We propose SARA (Stall-Aware Real-Time Memory Allocator), a dynamic memory allocation strategy that balances real-time guarantees against throughput. We first identify system-level long stalls, which occur when page swapping coincides with high memory and I/O pressure, and design a PSI (Pressure Stall Information)-based latency metric and a long-stall detection mechanism. Our approach allocates just enough memory for soft real-time tasks to meet their deadlines, proactively drops jobs affected by long stalls to limit their impact on task execution, and leverages the periodic structure of soft real-time tasks for adaptive, fine-grained memory scheduling, leaving the remaining memory to non-real-time applications. Contribution/Results: With available memory limited to 60% of peak demand, our method achieves an average deadline hit ratio of 97.13% for soft real-time tasks and improves the throughput of non-real-time applications by up to 22.32× over existing approaches.
📝 Abstract
Memory capacity in edge devices is often limited by constraints on cost, size, and power. Consequently, memory contention in memory-constrained mixed-criticality edge devices leads to inevitable page swapping, which incurs slow storage I/O and degrades performance. In such scenarios, inefficient memory allocation upsets the performance balance among applications, causing soft real-time (soft RT) tasks to miss deadlines or preventing non-real-time (non-RT) applications from reaching their best throughput. We also observe unpredictable, long system-level stalls (called long stalls) under high memory and I/O pressure, which degrade performance further. In this work, we propose a Stall-Aware Real-Time Memory Allocator (SARA), which balances performance by allocating just enough memory to soft RT tasks to meet their deadlines while optimizing the remaining memory for non-RT applications. To minimize the memory usage of soft RT tasks while meeting real-time requirements, SARA leverages our insight into how latency, caused by memory insufficiency and measured by our proposed PSI-based metric, affects the execution time of each soft RT job, where a soft RT task spans multiple periods and runs one job per period. Moreover, SARA detects long stalls according to our definition and proactively drops affected jobs, minimizing stalls in task execution. Experiments show that SARA achieves an average deadline hit ratio of 97.13% for soft RT tasks and improves non-RT application throughput by up to 22.32× over existing approaches, even with memory capacity limited to 60% of peak demand.
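To make the idea of a PSI-based stall measurement concrete, the following minimal Python sketch parses the text format that Linux exposes in `/proc/pressure/memory` and computes the fraction of one job period spent stalled on memory. It is an illustration only, not SARA's metric or its long-stall definition, neither of which the abstract specifies. The names `parse_psi`, `stall_fraction`, and `is_long_stall`, and the 0.5 threshold, are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class PsiLine:
    """One line of a Linux PSI file ("some" or "full")."""
    avg10: float    # % of time stalled, 10 s running average
    avg60: float    # % of time stalled, 60 s running average
    avg300: float   # % of time stalled, 300 s running average
    total_us: int   # cumulative stall time in microseconds


def parse_psi(text: str) -> dict:
    """Parse PSI text, e.g. the contents of /proc/pressure/memory.

    Returns a dict mapping "some" / "full" to a PsiLine.
    """
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        values = dict(field.split("=", 1) for field in fields)
        result[kind] = PsiLine(
            avg10=float(values["avg10"]),
            avg60=float(values["avg60"]),
            avg300=float(values["avg300"]),
            total_us=int(values["total"]),
        )
    return result


def stall_fraction(before: PsiLine, after: PsiLine, window_us: int) -> float:
    """Fraction of a window spent stalled, from two cumulative samples."""
    return (after.total_us - before.total_us) / window_us


def is_long_stall(before: PsiLine, after: PsiLine, window_us: int,
                  threshold: float = 0.5) -> bool:
    """Hypothetical rule: flag the window if the stall fraction reaches
    the threshold. The 0.5 default is an assumption, not SARA's value."""
    return stall_fraction(before, after, window_us) >= threshold


# Two samples taken at the start and end of a 100 ms job period.
# On a real system these strings would be read from /proc/pressure/memory.
start = parse_psi(
    "some avg10=0.00 avg60=0.00 avg300=0.00 total=1000000\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=400000\n"
)
end = parse_psi(
    "some avg10=35.10 avg60=8.20 avg300=1.90 total=1080000\n"
    "full avg10=30.00 avg60=7.00 avg300=1.50 total=460000\n"
)
PERIOD_US = 100_000
full_ratio = stall_fraction(start["full"], end["full"], PERIOD_US)
long_stall = is_long_stall(start["full"], end["full"], PERIOD_US)
```

The sketch samples the cumulative `total` counter rather than the `avg10` field because a difference of two `total` readings gives the stall time inside exactly one period, whereas the running averages smooth over windows much longer than a typical job.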