🤖 AI Summary
Dynamic Flexible Job-Shop Scheduling (DFJSP), an NP-hard problem, faces the dual challenges of responding to real-time disturbances and optimizing complex machine routing. Existing approaches each fall short: heuristic dispatching rules lack adaptability; deep learning methods are poorly interpretable and rely on handcrafted features; and large language models (LLMs) applied directly suffer from myopic decision-making, insufficient long-context utilization, and weak integration of domain expertise. This paper proposes ReflecSched, the first framework to tightly couple LLM-based reasoning with heuristic simulation and multi-step rolling-horizon planning through strategic experience summarization and a hierarchical reflection architecture. It improves decision quality via natural-language state abstraction, simulation evaluation across multiple planning horizons, and expert policy distillation. Experiments demonstrate that ReflecSched achieves a 71.35% win rate and a 2.755% reduction in relative percentage deviation, consistently outperforming all baseline methods.
📝 Abstract
Dynamic Flexible Job-Shop Scheduling (DFJSP) is an NP-hard problem challenged by real-time event adaptation and complex machine routing. Traditional dispatching rules are efficient but rigid, while deep learning approaches are opaque and require intricate feature engineering. Large Language Models (LLMs) promise adaptive reasoning without this engineering overhead, yet we find their direct application suboptimal. Baseline LLMs suffer from three key pitfalls: the long-context paradox, in which crucial data is underutilized; an underutilization of expert heuristics; and myopic decision-making. To address this, we propose ReflecSched, a framework that elevates the LLM beyond a direct scheduler by equipping it with a strategic analysis capability. ReflecSched tasks the LLM with analyzing heuristic-driven simulations across multiple planning horizons and distilling them into a concise, natural-language summary termed "Strategic Experience". This summary is then integrated into the prompt of a final decision-making module, guiding it to produce non-myopic actions. Experiments show that ReflecSched not only outperforms direct LLM baselines with statistical significance, securing a 71.35% win rate and a 2.755% reduction in Relative Percentage Deviation, but also surpasses every individual heuristic evaluated, all while demonstrably mitigating the three identified pitfalls. Additionally, ReflecSched performs on par with the best heuristic tailored to each instance across all problem cases.
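The two-stage pipeline the abstract describes (multi-horizon heuristic rollouts, distilled into a natural-language "Strategic Experience", which then conditions the final decision) can be illustrated with a deliberately simplified sketch. Everything here is our own illustrative assumption, not the paper's implementation: a toy two-machine instance, SPT/LPT as the heuristic pool, and a rule-voting function standing in for the experience-conditioned LLM.

```python
# Illustrative sketch of a ReflecSched-style two-stage flow (hypothetical names;
# not the authors' actual API or experimental setup).

def simulate(jobs, rule, horizon):
    """Greedy list-scheduling rollout: order the first `horizon` jobs by `rule`,
    assign each to the earliest-free of two identical machines, return makespan."""
    loads = [0.0, 0.0]
    for p in sorted(jobs[:horizon], key=rule):
        loads[loads.index(min(loads))] += p
    return max(loads)

# A tiny heuristic pool: shortest- and longest-processing-time-first.
RULES = {"SPT": lambda p: p, "LPT": lambda p: -p}

def strategic_experience(jobs, horizons=(2, 4, 8)):
    """Stage 1: evaluate each rule across several planning horizons and distil
    the outcomes into a short summary (the LLM's reflection step in the paper)."""
    lines = []
    for h in horizons:
        best = min(RULES, key=lambda r: simulate(jobs, RULES[r], h))
        lines.append(f"horizon={h}: {best} gave the lowest makespan")
    return "; ".join(lines)

def decide(jobs, experience):
    """Stage 2: the decision module picks the rule the experience favors most.
    (In the paper, an LLM conditions on this summary inside its prompt.)"""
    votes = {r: experience.count(r) for r in RULES}
    return max(votes, key=votes.get)

jobs = [3, 7, 2, 8, 4, 6, 1, 5]  # toy processing times
exp = strategic_experience(jobs)
print(exp)
print("chosen rule:", decide(jobs, exp))
```

On this toy instance the short-horizon rollout ties while longer horizons favor LPT, so the experience steers the final decision away from the myopic short-horizon view, which is the intuition behind the framework's non-myopic behavior.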