🤖 AI Summary
To address structural instability and the lack of safe rearrangement in the Online 3D Bin Packing Problem (OBPP), this paper proposes a framework that integrates deep reinforcement learning (DRL) with geometric stability modeling. Our method introduces two key components: (1) the Load Bearable Convex Polygon (LBCP) model, which enables efficient, verifiable real-time stability assessment; and (2) the Stable Rearrangement Planning (SRP) module, which generates low-disturbance, high-volume-utilization rearrangement plans while maintaining structural stability. By combining geometric analysis, heuristic search, and policy networks, our approach outperforms existing DRL and heuristic methods on standard OBPP benchmarks: LBCP accelerates stability verification by 3.2×, and SRP reduces rearrangement cost by 47% and improves volumetric utilization by 8.6%. The framework is designed to be robust and practical for industrial deployment.
📝 Abstract
The Online Bin Packing Problem (OBPP) is a sequential decision-making task in which each item must be placed immediately upon arrival, with no knowledge of future arrivals. Although recent deep-reinforcement-learning methods achieve higher volume utilization than classical heuristics, the learned policies cannot ensure the structural stability of the bin, and they lack mechanisms for safely reconfiguring the bin when a new item cannot be placed directly. In this work, we propose a novel framework that integrates a packing policy with structural stability validation and heuristic planning to overcome these limitations. Specifically, we introduce the concept of the Load Bearable Convex Polygon (LBCP), which provides a computationally efficient way to identify stable loading positions that guarantee no bin collapse. Additionally, we present Stable Rearrangement Planning (SRP), a module that rearranges existing items to accommodate new ones while maintaining overall stability. Extensive experiments on standard OBPP benchmarks demonstrate the efficiency and generalizability of our LBCP-based stability validation, as well as the superiority of SRP in finding effort-saving rearrangement plans. Our method offers a robust and practical solution for automated packing in real-world industrial and logistics applications.
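To give a concrete sense of what a convex-polygon stability test can look like, here is a minimal sketch of the classical support-polygon check: an item is treated as stable if the horizontal projection of its center of mass falls inside the convex hull of its contact points with the items below. This is a simplified stand-in and not the LBCP construction itself, which the abstract does not specify. All names here (`convex_hull`, `inside_convex`, `is_supported`) are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop points that would create a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def inside_convex(hull: List[Point], q: Point, eps: float = 1e-9) -> bool:
    """True if q lies inside or on the boundary of a CCW convex polygon."""
    n = len(hull)
    if n < 3:
        # A degenerate support (a point or a line) is treated as unstable here.
        return False
    return all(cross(hull[i], hull[(i + 1) % n], q) >= -eps for i in range(n))


def is_supported(contact_points: List[Point], com_xy: Point) -> bool:
    """Assumed simplified test: center-of-mass projection inside the support polygon."""
    return inside_convex(convex_hull(contact_points), com_xy)


if __name__ == "__main__":
    # Item resting on four contact points at the corners of a 2 x 2 square.
    contacts = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
    print(is_supported(contacts, (1.0, 1.0)))  # center of mass over the support
    print(is_supported(contacts, (3.0, 1.0)))  # center of mass overhanging
```

A check of this kind only considers the item being placed. Guaranteeing that no item already in the bin collapses also requires reasoning about how load propagates through the stack, which is the part the LBCP is introduced to handle.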