🤖 AI Summary
This work addresses the challenge of real-time scheduling under stringent computational budgets, where traditional methods struggle to balance efficiency and adaptability in the face of dynamically arriving, unordered task sets. The authors propose a novel scheduler that integrates a permutation-invariant Transformer with deep Q-learning. Temporal slack is encoded into learnable embeddings via urgency tokens, and a latency-aware blockwise top-k sparse attention stack combined with locality-sensitive chunking gives the model near-linear scalability and sub-millisecond inference latency. A multi-head mapping layer produces interpretable processor assignments and supports both behavior-cloning pretraining and an actor-critic reinforcement learning framework. Evaluated on industrial-scale mixed-criticality and large-scale multiprocessor workloads, the approach significantly outperforms conventional analytical schedulers and existing neural baselines in deadline satisfaction rate, optimization stability, and sample efficiency.
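To make the urgency-token idea concrete, here is a minimal sketch of how continuous temporal slack could be discretized into bucket indices that select learnable embedding rows. The bucket boundaries, bucket count, and embedding width are illustrative assumptions, not values from the paper; in the real model the embedding table would be trained jointly with the Transformer rather than sampled at random.

```python
import numpy as np

def urgency_tokens(slack, boundaries):
    """Map continuous slack values (deadline minus elapsed time and
    remaining work) to discrete urgency-bucket indices. Each index
    selects one learnable embedding row in the full model."""
    return np.digitize(slack, boundaries)

# Hypothetical log-spaced bucket edges: finer resolution near zero slack,
# where deadline pressure changes fastest.
boundaries = np.concatenate(([0.0], np.logspace(-2, 1, 7)))

# Stand-in embedding table (random here; learned in practice).
rng = np.random.default_rng(0)
embed_table = rng.standard_normal((len(boundaries) + 1, 16))

slack = np.array([-0.5, 0.003, 0.2, 5.0])   # negative slack = already late
tokens = urgency_tokens(slack, boundaries)   # one bucket index per task
embeds = embed_table[tokens]                 # (4, 16) urgency embeddings
```

Quantizing slack this way turns an unbounded continuous feature into a small discrete vocabulary, which is what lets the model learn a distinct representation for "nearly late" versus "comfortably early" tasks and, per the abstract, stabilizes value learning.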
📝 Abstract
Real-time schedulers must reason about tight deadlines under strict compute budgets. We present TempoNet, a reinforcement learning scheduler that pairs a permutation-invariant Transformer with a deep Q-network. An Urgency Tokenizer discretizes temporal slack into learnable embeddings, stabilizing value learning and capturing deadline proximity. A latency-aware sparse attention stack with blockwise top-k selection and locality-sensitive chunking enables global reasoning over unordered task sets with near-linear scaling and sub-millisecond inference. A multicore mapping layer converts contextualized Q-scores into processor assignments through masked-greedy selection or differentiable matching. Extensive evaluations on industrial mixed-criticality traces and large multiprocessor settings show consistent gains in deadline fulfillment over analytic schedulers and neural baselines, together with improved optimization stability. Diagnostics include sensitivity analyses for slack quantization, attention-driven policy interpretation, hardware-in-the-loop and kernel micro-benchmarks, and robustness under stress with simple runtime mitigations; we also report sample-efficiency gains from behavior-cloning pretraining and compatibility with an actor-critic variant without altering the inference pipeline. These results establish a practical framework for Transformer-based decision making in high-throughput real-time scheduling.
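The masked-greedy decoding mode mentioned above can be sketched as follows. This is an illustrative reading, assuming a Q-score matrix of shape (tasks × processors) and a one-task-per-core constraint; the paper's actual mapping layer may enforce different capacity constraints, and the differentiable-matching alternative is not shown.

```python
import numpy as np

def masked_greedy_assign(q):
    """Greedy task-to-processor mapping from a Q-score matrix.
    Repeatedly take the highest remaining score, assign that task to
    that processor, then mask the task's row and the processor's column
    so each task and each core is used at most once."""
    q = q.astype(float).copy()
    n_tasks, n_procs = q.shape
    assignment = {}
    for _ in range(min(n_tasks, n_procs)):
        t, p = np.unravel_index(np.argmax(q), q.shape)
        assignment[int(t)] = int(p)
        q[t, :] = -np.inf   # this task is now assigned
        q[:, p] = -np.inf   # this processor is now occupied
    return assignment

q = np.array([[0.9, 0.1, 0.3],
              [0.8, 0.7, 0.2],
              [0.4, 0.6, 0.5]])
print(masked_greedy_assign(q))  # {0: 0, 1: 1, 2: 2}
```

The masking makes the decoded assignments directly interpretable (each pick is the currently highest Q-score among feasible pairs), which is consistent with the interpretability claim in the summary, at the cost of being non-differentiable; that is presumably why the differentiable-matching path exists for training.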