🤖 AI Summary
This paper addresses dynamic customer routing in skill-based service queueing systems (e.g., cloud data centers), where static policies fail to adapt to workload fluctuations and heterogeneous server skills. The authors study a UCB-based reinforcement learning routing algorithm that learns the environment online and can balance several objectives: payoff maximization, waiting time reduction, and fair distribution of server load. To reduce delays, they add a new heuristic routing rule. Experiments on a real-world data set show that the algorithm learns efficiently, adapts to changing conditions, and outperforms static benchmark policies. A sensitivity analysis of estimation errors and tuning parameters supports its feasibility for live deployment in complex service systems.
📝 Abstract
This paper is about optimally controlling skill-based queueing systems such as data centers, cloud computing networks, and service systems. By means of a case study using a real-world data set, we investigate the practical implementation of a recently developed reinforcement learning algorithm for optimal customer routing. Our experiments show that the algorithm efficiently learns and adapts to changing environments and outperforms static benchmark policies, indicating its potential for live implementation. We also augment the real-world applicability of this algorithm by introducing a new heuristic routing rule to reduce delays. Moreover, we show that the algorithm can optimize for multiple objectives: next to payoff maximization, secondary objectives such as server load fairness and customer waiting time reduction can be incorporated. Tuning parameters are used for balancing inherent performance trade-offs. Lastly, we investigate the sensitivity to estimation errors and parameter tuning, providing valuable insights for implementing adaptive routing algorithms in complex real-world queueing systems.
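
The abstract does not spell out how the tuning parameters enter the routing decision. As a rough illustration only, and not the authors' algorithm, the sketch below scores each candidate server by its estimated payoff minus weighted penalties for above-average load and for expected waiting time. The names `ServerState`, `score`, `choose_server`, `w_fair`, and `w_wait` are hypothetical, and the linear penalty form is an assumption.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ServerState:
    est_payoff: float     # current payoff estimate for serving this customer (assumed given)
    load: float           # utilization of the server, e.g. in [0, 1]
    expected_wait: float  # expected waiting time if the customer is routed here


def score(s: ServerState, mean_load: float, w_fair: float, w_wait: float) -> float:
    """Scalarized objective: payoff minus weighted secondary penalties (assumed form)."""
    overload = max(0.0, s.load - mean_load)  # penalize only above-average load
    return s.est_payoff - w_fair * overload - w_wait * s.expected_wait


def choose_server(servers: Dict[str, ServerState],
                  w_fair: float = 0.0, w_wait: float = 0.0) -> str:
    """Return the name of the server with the highest scalarized score."""
    mean_load = sum(s.load for s in servers.values()) / len(servers)
    return max(servers, key=lambda name: score(servers[name], mean_load, w_fair, w_wait))


if __name__ == "__main__":
    servers = {
        "a": ServerState(est_payoff=1.0, load=0.9, expected_wait=5.0),
        "b": ServerState(est_payoff=0.8, load=0.3, expected_wait=1.0),
    }
    print(choose_server(servers))              # payoff only
    print(choose_server(servers, w_wait=0.1))  # waiting time also matters
    print(choose_server(servers, w_fair=1.0))  # load fairness also matters
```

With both weights at zero the rule reduces to pure payoff maximization; raising `w_wait` or `w_fair` shifts routing toward less congested servers, which is the kind of trade-off the tuning parameters are said to balance.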