🤖 AI Summary
In automotive paint shops, frequent color changes caused by unstructured vehicle sequencing significantly increase operational costs and waste.
Method: This paper addresses vehicle resequencing optimization with multi-lane first-in-first-out (FIFO) buffers. We propose a deep reinforcement learning framework based on Proximal Policy Optimization (PPO) and, crucially, first prove that the greedy retrieval policy is optimal under fully flexible buffer configurations. Leveraging this result, we integrate an action-masking mechanism into the training process. The approach combines combinatorial optimization modeling with large-scale stochastic experiments (170 instances, 2–8 lanes, 5–15 colors).
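The action-masking idea can be sketched independently of the full PPO setup: actions that are invalid in the current state (for example, storing into a full lane) have their logits set to negative infinity before the softmax, so the policy assigns them exactly zero probability and never samples them. This is a generic illustration under assumed names (`masked_policy`, `logits`, `mask`), not the paper's implementation.

```python
import numpy as np

def masked_policy(logits, mask):
    """Turn raw action logits into a probability distribution,
    assigning exactly zero probability to masked-out actions.

    logits: float array of per-action scores.
    mask:   boolean array, True where the action is currently valid.
    """
    # Invalid actions get -inf, so exp(...) maps them to 0.
    masked = np.where(mask, logits, -np.inf)
    # Standard numerically stable softmax over the masked logits.
    shifted = masked - masked.max()
    probs = np.exp(shifted)
    return probs / probs.sum()
```

During PPO training, sampling and the log-probabilities used in the policy-gradient loss would both be taken from this masked distribution, so gradient signal never flows toward infeasible actions.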
Contribution/Results: The method substantially reduces the number of color changes, with gains growing as problem scale increases. It is robust to variations in buffer capacity and to imbalanced color distributions, overcoming key limitations of conventional heuristics and simplified modeling approaches.
📝 Abstract
In the paint shop problem, an unordered incoming sequence of cars, each assigned a color, has to be reshuffled with the objective of minimizing the number of color changes. To reshuffle the incoming sequence, manufacturers can employ a multi-lane first-in-first-out (FIFO) buffer system that allows store and retrieve operations. Prior studies have primarily focused on simple decision heuristics, such as greedy rules, or on simplified problem variants that do not allow full flexibility in performing store and retrieve operations. In this study, we propose a reinforcement learning approach that minimizes color changes for the flexible problem variant, in which store and retrieve operations can be performed in arbitrary order. After proving that greedy retrieval is optimal, we incorporate this finding into the model via action masking. Our evaluation on 170 problem instances with 2–8 buffer lanes and 5–15 colors shows that our approach reduces color changes compared to existing methods by considerable margins, depending on problem size. Furthermore, we demonstrate the robustness of our approach to different buffer sizes and imbalanced color distributions.
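The buffer mechanics described above can be illustrated with a small simulation. The store rule below is a simple heuristic chosen for illustration only, not the paper's learned policy; only the greedy retrieval rule (take from a lane whose head matches the last retrieved color when possible) reflects the retrieval strategy the abstract refers to.

```python
from collections import deque

def resequence(incoming, n_lanes, lane_cap):
    """Reshuffle a color sequence through multi-lane FIFO buffers.

    Store rule (illustrative heuristic): prefer a lane whose tail already
    holds the same color, then an empty lane, then any non-full lane.
    Retrieve rule (greedy): prefer a lane whose head matches the last
    retrieved color; otherwise take from the first non-empty lane.
    """
    lanes = [deque() for _ in range(n_lanes)]
    output = []

    def retrieve():
        last = output[-1] if output else None
        nonempty = [lane for lane in lanes if lane]
        match = next((lane for lane in nonempty if lane[0] == last), None)
        output.append((match or nonempty[0]).popleft())

    for color in incoming:
        if all(len(lane) == lane_cap for lane in lanes):
            retrieve()  # buffer completely full: must retrieve first
        target = next((lane for lane in lanes
                       if lane and lane[-1] == color and len(lane) < lane_cap),
                      None)
        if target is None:
            target = next((lane for lane in lanes if not lane), None)
        if target is None:
            target = next(lane for lane in lanes if len(lane) < lane_cap)
        target.append(color)

    while any(lanes):  # drain the buffer greedily
        retrieve()
    return output

def color_changes(seq):
    """Number of adjacent pairs with differing colors."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a != b)
```

For example, the alternating input `['R', 'B', 'R', 'B', 'R', 'B']` has five color changes; with two lanes of capacity three, even these simple rules regroup it into a single block per color.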