Preference-Conditioned Reinforcement Learning for Space-Time Efficient Online 3D Bin Packing

📅 2026-03-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the trade-off in online 3D bin packing between space utilization and operational time: selecting alternative items or reorienting them can increase packing density but lengthens execution. The authors propose STEP (Space-Time Efficient Packing), a selection-based reinforcement learning framework whose preference-conditioned, Transformer-based policy weighs the expected packing benefit of each candidate action against its estimated operational time. The policy accepts extra operational time only when it yields a meaningful spatial improvement, generalizes across candidate set sizes, and integrates with standard placement modules. STEP achieves a 44% reduction in operational time without compromising packing density.

📝 Abstract
Robotic bin packing is widely deployed in warehouse automation, with current systems achieving robust performance through heuristic and learning-based strategies. These systems must balance compact placement with rapid execution, where selecting alternative items or reorienting them can improve space utilization but introduce additional time. We propose a selection-based formulation that explicitly reasons over this trade-off: at each step, the robot evaluates multiple candidate actions, weighing expected packing benefit against estimated operational time. This enables time-aware strategies that selectively accept increased operational time when it yields meaningful spatial improvements. Our method, STEP (Space-Time Efficient Packing), uses a preference-conditioned, Transformer-based reinforcement learning policy, and allows generalization across candidate set sizes and integration with standard placement modules. It achieves a 44% reduction in operational time without compromising packing density. Additional material is available at https://step-packing.github.io.
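
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: STEP uses a learned, preference-conditioned Transformer policy, whereas the sketch below swaps in a hand-written linear score so the space-time trade-off is easy to see. The `Candidate` structure, the `time_weight` parameter, the normalization by the slowest candidate, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    """One candidate action (hypothetical structure, not taken from the paper)."""
    name: str
    space_gain: float  # expected packing benefit, higher is better
    time_cost: float   # estimated operational time in seconds, lower is better


def select_candidate(candidates: Sequence[Candidate], time_weight: float) -> Candidate:
    """Return the candidate with the best preference-weighted score.

    time_weight lies in [0, 1]: 0 values only space, 1 values only time.
    The linear score is an assumption standing in for the learned policy.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    if not 0.0 <= time_weight <= 1.0:
        raise ValueError("time_weight must be in [0, 1]")

    # Normalize time by the slowest candidate so both terms share a scale.
    # Fall back to 1.0 if every candidate has zero time cost.
    max_time = max(c.time_cost for c in candidates) or 1.0

    def score(c: Candidate) -> float:
        return (1.0 - time_weight) * c.space_gain - time_weight * (c.time_cost / max_time)

    return max(candidates, key=score)


# Illustrative values only; they are not results from the paper.
candidates = [
    Candidate("place_as_is", space_gain=0.60, time_cost=4.0),
    Candidate("reorient_then_place", space_gain=0.75, time_cost=9.0),
    Candidate("swap_item", space_gain=0.80, time_cost=14.0),
]

space_first = select_candidate(candidates, time_weight=0.0)
balanced = select_candidate(candidates, time_weight=0.2)
time_first = select_candidate(candidates, time_weight=1.0)
print(space_first.name, balanced.name, time_first.name)
```

With these made-up numbers, a space-only preference picks the slow item swap, a time-only preference places the item as is, and an intermediate preference accepts the moderate cost of reorienting. Because the function scores whatever list it receives, it works for any number of candidates, which loosely mirrors the abstract's point about generalizing across candidate set sizes.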

Problem

Research questions and friction points this paper is trying to address.

3D bin packing
space-time trade-off
robotic automation
operational efficiency
online packing

Innovation

Methods, ideas, or system contributions that make the work stand out.

Preference-Conditioned Reinforcement Learning
Space-Time Trade-off
Online 3D Bin Packing
Transformer-based Policy
Warehouse Automation