🤖 AI Summary
This work addresses scan-order optimization in laser additive manufacturing, where thermal accumulation, residual stress, and distortion are coupled and complicate decision-making. Reinforcement learning for this task is limited by the fidelity of its reward functions and environment models: full finite-element analysis is too expensive for dense in-the-loop evaluation, while cheap proxy metrics may capture only part of the true thermo-mechanical objectives. The authors investigate a bilevel Proxy–FEA diagnostic framework. At the lower level, lightweight scan-path and thermo-inspired proxies rapidly generate and screen candidate scan strategies; at the upper level, sparse high-fidelity Abaqus finite-element simulations provide reference labels for diagnosing the reward design and world model. On a simplified LDED32 stripe benchmark with ten representative scan strategies, final-cooling residual Mises stress, U3 vertical distortion, and PEEQ plastic strain reveal a stress–distortion trade-off rather than a single quality objective. Within the evaluated set, center_out emerges as a robust compromise, while the cheap path-based proxies correlate only weakly with the FEA labels. This suggests that proxy-only rewards risk misguiding future RL training and that sparse FEA reference signals are valuable for refining rewards and world models before large-scale policy optimization.
📝 Abstract
Reinforcement learning offers a promising approach for scan-order optimisation in laser additive manufacturing, where sequential scan decisions critically influence thermal accumulation, residual stress, distortion, and final part quality. A central challenge in applying RL to this domain lies in reward and world-model fidelity: full finite-element analysis is computationally prohibitive for dense in-the-loop evaluation, while cheap thermo-inspired proxy metrics, though efficient, may capture only partial aspects of the true thermo-mechanical objectives. This paper investigates a bilevel Proxy–FEA diagnostic framework for reward and world-model diagnosis in reinforcement-learning-guided scan-order optimisation. The lower level employs lightweight scan-path and thermo-inspired proxies for rapid candidate generation and preliminary policy-side screening, while the upper level utilises sparse Abaqus FEA simulations to provide simulation-based reference labels. The framework is examined on a simplified whole-track heating LDED32 stripe benchmark comprising ten representative scan strategies. Final-cooling residual Mises stress, U3 vertical distortion, and PEEQ plasticity metrics reveal an observed stress–distortion trade-off rather than a single monotonic quality objective. Within the evaluated set, the center_out strategy emerges as a robust compromise candidate, while raster_left_to_right and edge_in form opposing endpoints of the trade-off. Proxy–FEA alignment analysis shows that current cheap path-based metrics predominantly capture distortion-related (U3) behaviour and exhibit only weak correlation with the sparse FEA reference labels. These findings highlight that proxy-only reward designs risk misalignment in future RL training and underscore the value of sparse FEA reference signals for diagnostic-guided reward and world-model refinement prior to large-scale policy optimisation.
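The alignment and trade-off diagnostics described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which correlation measure is used, so Spearman rank correlation is an assumption here, as are the function names `spearman` and `pareto_front`. The strategy names and numbers in the example are placeholders constructed for illustration, not values from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation (assumed alignment measure)."""
    return pearson(ranks(x), ranks(y))


def pareto_front(labels: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Strategies not dominated by any other, with all objectives minimised."""
    front = []
    for name, a in labels.items():
        dominated = any(
            all(bi <= ai for ai, bi in zip(a, b))
            and any(bi < ai for ai, bi in zip(a, b))
            for other, b in labels.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


if __name__ == "__main__":
    # Placeholder FEA labels: (residual Mises stress, |U3| distortion).
    # Hypothetical values in arbitrary units, NOT results from the paper.
    fea = {
        "strategy_a": (1.0, 4.0),
        "strategy_b": (2.0, 2.0),
        "strategy_c": (4.0, 1.0),
        "strategy_d": (3.0, 3.0),
    }
    # Placeholder proxy score per strategy (also hypothetical).
    proxy = {"strategy_a": 0.9, "strategy_b": 0.5,
             "strategy_c": 0.1, "strategy_d": 0.7}

    names = list(fea)
    p = [proxy[n] for n in names]
    print("proxy vs stress:", spearman(p, [fea[n][0] for n in names]))
    print("proxy vs U3:    ", spearman(p, [fea[n][1] for n in names]))
    print("non-dominated:  ", pareto_front(fea))
```

In this constructed example the proxy ranks strategies in the same order as the distortion label but in nearly the opposite order to the stress label. That is the kind of partial alignment the diagnostic is meant to expose before a proxy is used as an RL reward.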