SMDP-Based Dynamic Batching for Improving Responsiveness and Energy Efficiency of Batch Services

📅 2025-01-04
🤖 AI Summary
Problem: Online services running on parallel computing resources must jointly optimize response latency and energy consumption. Method: This paper formulates dynamic batching as a semi-Markov decision process (SMDP) that minimizes the weighted sum of average response time and average power consumption. To keep the problem tractable, it introduces an abstract cost that captures the impact of "tail" states and derives an approximately optimal policy. Contribution/Results: The abstract cost reduces the space complexity of the solution procedure by 63.5% and its time complexity by 98%, which makes policy computation more scalable. Numerical experiments show that the resulting policies consistently outperform static and heuristic batching baselines across diverse parameter settings, with up to 41% lower response time and over 35% better energy efficiency. The scheme also allows a flexible, tunable trade-off between latency and energy consumption.

📝 Abstract
For servers incorporating parallel computing resources, batching is a pivotal technique for providing efficient and economical services at scale. Parallel computing resources exhibit heightened computational and energy efficiency when operating with larger batch sizes. However, in the realm of online services, the adoption of a larger batch size may lead to longer response times. This paper aims to provide a dynamic batching scheme that delicately balances latency and efficiency. The system is modeled as a batch service queue with size-dependent service times. Then, the design of dynamic batching is formulated as a semi-Markov decision process (SMDP) problem, with the objective of minimizing the weighted sum of average response time and average power consumption. A method is proposed to derive an approximate optimal SMDP solution, representing the chosen dynamic batching policy. By introducing an abstract cost to reflect the impact of "tail" states, the space complexity and the time complexity of the procedure can decrease by 63.5% and 98%, respectively. Numerical results showcase the superiority of SMDP-based batching policies across various parameter setups. Additionally, the proposed scheme exhibits noteworthy flexibility in balancing power consumption and latency.
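
To make the formulation in the abstract concrete, the following is a minimal sketch of a batch-service SMDP solved by relative value iteration after the standard uniformization (data transformation) step. It is not the paper's algorithm: it simply truncates the queue at `S_max` and does not use the abstract cost for tail states. Everything specific here is an assumption made for illustration: Poisson arrivals with rate `lam`, deterministic service time `tau(b)` and per-batch energy `zeta(b)` that are linear in the batch size `b`, zero idle power, and the weights `w1` and `w2`. Average response time is obtained from the average number of requests in the system via Little's law.

```python
import math
import numpy as np

def poisson_pmf(mean, kmax):
    """P(A = k) for k = 0..kmax-1; the remaining tail mass is lumped into k = kmax."""
    pmf = np.empty(kmax + 1)
    p = math.exp(-mean)
    for k in range(kmax):
        pmf[k] = p
        p *= mean / (k + 1)
    pmf[kmax] = max(0.0, 1.0 - pmf[:kmax].sum())
    return pmf

def solve_batching_smdp(lam=2.0, B=8, S_max=40, w1=1.0, w2=1.0,
                        tau=lambda b: 0.1 + 0.05 * b,   # assumed service time of a batch
                        zeta=lambda b: 1.0 + 0.2 * b,   # assumed energy per batch
                        tol=1e-8, max_iter=100_000):
    """Return (average cost per unit time, policy); policy[s] = batch size, 0 = wait."""
    n = S_max + 1                       # state s = number of requests in the system
    P = np.zeros((B + 1, n, n))         # P[a, s, j]: transition probabilities
    cost = np.full((B + 1, n), np.inf)  # expected cost per sojourn (inf = infeasible)
    y = np.ones((B + 1, n))             # expected sojourn time
    for s in range(n):
        if s < S_max:                   # action 0: wait for the next arrival
            P[0, s, s + 1] = 1.0
            y[0, s] = 1.0 / lam
            cost[0, s] = w1 * s / lam**2            # holding cost s * (1/lam), scaled by 1/lam
        for b in range(1, min(s, B) + 1):           # action b: serve a batch of size b
            t = tau(b)
            P[b, s, s - b:] = poisson_pmf(lam * t, S_max - (s - b))
            y[b, s] = t
            holding = s * t + lam * t**2 / 2        # expected integral of requests in system
            cost[b, s] = w1 * holding / lam + w2 * zeta(b)
    # Uniformization: convert the SMDP into an equivalent discrete-time MDP.
    eta = 0.5 * y.min()
    ct = cost / y
    Pt = eta * P / y[:, :, None]
    idx = np.arange(n)
    Pt[:, idx, idx] += 1.0 - eta / y
    # Relative value iteration with state 0 as the reference state.
    h = np.zeros(n)
    for _ in range(max_iter):
        Q = ct + Pt @ h
        h_new = Q.min(axis=0)
        g = h_new[0]
        h_new = h_new - g
        converged = np.max(np.abs(h_new - h)) < tol
        h = h_new
        if converged:
            break
    return g, Q.argmin(axis=0)
```

Calling `g, policy = solve_batching_smdp()` returns the estimated value of the weighted objective and an array that maps each queue length to a batch size. Re-running with different `w1` and `w2` is one way to explore the latency and power trade-off the abstract refers to; the truncation level `S_max` should be chosen large enough that the lumped tail probability is negligible.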
Problem

Research questions and friction points this paper is trying to address.

Dynamic Batching
Energy Efficiency
Optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

SMDP-Based Dynamic Batching
Energy Efficiency
Computational Optimization
Authors

Yaodan Xu
Tsinghua University

Sheng Zhou
Department of Electronic Engineering, the Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing 100084, China

Zhisheng Niu
Professor of Electronic Engineering, Tsinghua University
Green Communication, Radio Resource Management, Queueing Theory