🤖 AI Summary
This work addresses the inefficiency and suboptimal schedule quality of existing methods for large-scale job scheduling by proposing a reinforcement learning framework that integrates multi-scale representation learning with an aggregation mechanism. The framework uses self-attention, convolutional, and cross-attention modules to model the internal features of job operations and machines and the interactions between them, and it aggregates this multi-source information across multiple scales to inform scheduling decisions. Experimental results show that the proposed method outperforms state-of-the-art approaches: it reduces the optimality gap of the state-of-the-art baseline by 13.0% on small-to-medium instances and by 78.6% on large-scale instances, reaching average optimality gaps of 7.3% and 2.1%, respectively.
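The reported figures combine two quantities: an optimality gap and a reduction of the baseline's gap. The sketch below shows one common way to compute both. It assumes the gap is measured as the percentage by which a schedule's makespan exceeds a reference (optimal or best-known) makespan, and that the reductions are relative to the baseline's gap; the source does not spell out either definition. The function names and all numbers are hypothetical and used only for illustration.

```python
def optimality_gap(makespan: float, reference: float) -> float:
    """Percentage by which `makespan` exceeds the reference makespan.

    Assumption: `reference` is the optimal or best-known makespan.
    """
    if reference <= 0:
        raise ValueError("reference makespan must be positive")
    return 100.0 * (makespan - reference) / reference


def relative_reduction(baseline_gap: float, new_gap: float) -> float:
    """Percentage by which `new_gap` is smaller than `baseline_gap`."""
    if baseline_gap <= 0:
        raise ValueError("baseline gap must be positive")
    return 100.0 * (baseline_gap - new_gap) / baseline_gap


# Made-up values, not taken from the paper.
gap = optimality_gap(makespan=1050.0, reference=1000.0)
reduction = relative_reduction(baseline_gap=8.0, new_gap=6.0)
print(f"gap = {gap:.1f}%, reduction = {reduction:.1f}%")
```

Under this reading, a relative reduction compares two gaps, so it can be large even when both gaps are already small.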
📝 Abstract
Job scheduling is widely used in real-world manufacturing systems to assign ordered job operations to machines under various constraints. Existing solutions remain limited by long running times or insufficient schedule quality, especially as the problem scale increases. In this paper, we propose ReLA, a reinforcement-learning (RL) scheduler built on structured representation learning and aggregation. ReLA first learns diverse representations of the scheduling entities, namely job operations and machines, using two intra-entity learning modules (one based on self-attention, one on convolution) and one inter-entity learning module based on cross-attention. These modules are applied in a multi-scale architecture, and their outputs are aggregated to support RL decision-making. In experiments on small, medium, and large job instances, ReLA achieves the best makespan in most tested settings compared with the latest solutions. On small and medium instances, ReLA reduces the optimality gap of the SOTA baseline by 13.0%, while on large-scale instances it reduces the gap by 78.6%, lowering the average optimality gaps to 7.3% and 2.1%, respectively. These results indicate that ReLA's learned representations and aggregation provide strong decision support for RL scheduling and enable fast job completion and decision-making in real-world applications.
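To make the module layout concrete, here is a minimal pure-Python sketch of how two intra-entity modules (self-attention and convolution) and one inter-entity module (cross-attention) could be applied at several scales and then aggregated. It is not ReLA's implementation. Everything beyond the module types is an assumption: there are no learned weights, "scale" is modeled as average pooling over the operation sequence, aggregation is mean pooling plus concatenation, both intra-entity modules are applied to operations only, and the names `represent`, `avg_pool`, and `conv1d_same` are hypothetical.

```python
import math


def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def attention(queries, keys, values):
    """Scaled dot-product attention without learned projections (simplification)."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out


def conv1d_same(rows, kernel):
    """Zero-padded 1D convolution along the sequence axis, applied per feature."""
    half = len(kernel) // 2
    n, d = len(rows), len(rows[0])
    out = []
    for i in range(n):
        vec = [0.0] * d
        for t, w in enumerate(kernel):
            j = i + t - half
            if 0 <= j < n:
                for c in range(d):
                    vec[c] += w * rows[j][c]
        out.append(vec)
    return out


def avg_pool(rows, window):
    """Coarsen a sequence by averaging non-overlapping windows (assumed notion of scale)."""
    chunks = [rows[i:i + window] for i in range(0, len(rows), window)]
    return [[sum(col) / len(chunk) for col in zip(*chunk)] for chunk in chunks]


def mean_vec(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def represent(ops, machines, scales=(1, 2), kernel=(0.25, 0.5, 0.25)):
    """Concatenate pooled outputs of the three modules at each scale."""
    state = []
    for s in scales:
        coarse = avg_pool(ops, s)
        state += mean_vec(attention(coarse, coarse, coarse))      # intra-entity: self-attention
        state += mean_vec(conv1d_same(coarse, kernel))            # intra-entity: convolution
        state += mean_vec(attention(coarse, machines, machines))  # inter-entity: cross-attention
    return state


# Toy features: 4 operations and 2 machines, each with 2 features (made-up values).
ops = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
machines = [[1.0, 0.0], [0.0, 1.0]]
state = represent(ops, machines)
print(len(state))  # 2 scales x 3 modules x 2 features
```

In an RL scheduler, a vector like `state` would be fed to a policy network that scores candidate operation-to-machine assignments; that step is omitted here because the abstract does not describe it.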