🤖 AI Summary
This work addresses tail latency optimization in M/G/n multi-server queues with heavy-tailed (regularly varying) job sizes. In single-server systems, strictly prioritizing small jobs is strongly tail optimal, and giving any priority to large jobs is harmful. The paper shows that the multi-server case is intrinsically different: giving a small amount of "sympathy" to large jobs is essential for strong tail optimality. It presents the first strongly tail-optimal scheduling policies for this setting, attaining strong (or arbitrarily close to strong) tail optimality across the entire stability region, both with and without knowledge of job sizes.
📝 Abstract
We study the asymptotic response time tail in the M/G/n multi-server queue with heavy-tailed (regularly varying) job sizes, a setting representative of modern computing workloads. For single-server systems, tail optimization is well understood: under heavy-tailed job sizes, policies such as SRPT that strictly prioritize short jobs are strongly tail optimal, and giving any priority to large jobs is harmful. For multi-server systems, the question has been almost entirely open.
This paper gives the first strongly tail-optimal scheduling policies for the M/G/n queue with heavy-tailed job sizes. Our central finding is that the multi-server case is intrinsically different from the single-server case: giving a small amount of "sympathy" to large jobs is essential for strong tail optimality. We establish strong (or arbitrarily close to strong) tail optimality across the full stability region, both with and without knowledge of job sizes.
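For readers unfamiliar with the terminology, the two notions the abstract relies on can be sketched as follows. The notation here ($S$ for job size, $T_\pi$ for response time under policy $\pi$, $\alpha$ for the tail index, $L$ for the slowly varying factor) is assumed for illustration and does not appear in the abstract. The statement of strong tail optimality follows the form commonly used in the scheduling literature, so the paper's exact definition may differ in detail.

```latex
% Regularly varying job size S with tail index \alpha > 0
% (standard definition; L is a slowly varying function):
\[
  \mathbb{P}[S > x] = x^{-\alpha} L(x),
  \qquad
  \lim_{x \to \infty} \frac{L(cx)}{L(x)} = 1 \quad \text{for every } c > 0 .
\]

% Strong tail optimality of a policy \pi (assumed form):
% no competing policy \pi' has an asymptotically lighter response time tail.
\[
  \limsup_{t \to \infty}
  \frac{\mathbb{P}[T_{\pi} > t]}{\mathbb{P}[T_{\pi'} > t]} \le 1
  \quad \text{for every scheduling policy } \pi' .
\]
```

Under this reading, "arbitrarily close to strong" would mean the bound $1$ can be replaced by $1 + \varepsilon$ for any chosen $\varepsilon > 0$; this interpretation is likewise an assumption.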