LLM-Pack: Intuitive Grocery Handling for Logistics Applications

📅 2025-03-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Automated grocery packing for fresh food and perishables in retail logistics remains underexplored, particularly concerning damage mitigation when fragile and heavy items are co-packed. Method: We propose the first zero-shot, multimodal large language model–based framework for grocery packing strategy generation. Leveraging vision-language models (VLMs) for item recognition and semantic understanding, the framework employs hierarchical prompt engineering to emulate human packing logic—requiring no category-specific annotations or model retraining. Contribution/Results: Evaluated on a real-world supermarket product dataset, our approach significantly outperforms rule-based baselines in packing rationality (structural soundness) and safety (damage prevention). The modular design enables plug-and-play integration of upgraded foundation models. To foster reproducibility and community advancement, the source code will be publicly released.

📝 Abstract
Robotics and automation are increasingly influential in logistics but remain largely confined to traditional warehouses. In grocery retail, advancements such as cashier-less supermarkets exist, yet customers still manually pick and pack groceries. While there has been a substantial focus in robotics on the bin picking problem, the task of packing objects and groceries has remained largely untouched. However, packing grocery items in the right order is crucial for preventing product damage, e.g., heavy objects should not be placed on top of fragile ones. At the same time, the exact criteria for the right packing order are hard to define, in particular given the huge variety of objects typically found in stores. In this paper, we introduce LLM-Pack, a novel approach for grocery packing. LLM-Pack leverages language and vision foundation models to identify groceries and generate a packing sequence that mimics human packing strategy. LLM-Pack does not require dedicated training to handle new grocery items, and its modularity allows easy upgrades of the underlying foundation models. We extensively evaluate our approach to demonstrate its performance. We will make the source code of LLM-Pack publicly available upon the publication of this manuscript.
Problem

Research questions and friction points this paper is trying to address.

Automating grocery packing to prevent product damage
Developing a system that mimics human packing strategies
Handling diverse grocery items without dedicated training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses language and vision models for grocery identification
Generates packing sequences mimicking human strategies
Modular design allows easy model upgrades
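The pipeline sketched by these bullets (recognize items, prompt a foundation model with human-like packing rules, obtain an ordered sequence) can be illustrated as follows. This is a minimal sketch, not the paper's implementation: the `Item` fields, the prompt wording, and the `mock_llm_pack` heuristic standing in for the actual foundation-model call are all illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Item:
    name: str
    weight_g: int    # assumed attribute produced by the VLM recognition step
    fragile: bool    # assumed attribute produced by the VLM recognition step

def build_prompt(items):
    # Hierarchical prompt: state the packing rules first, then list the
    # recognized items, then ask for an ordered sequence.
    lines = [
        "You are packing groceries into a bag.",
        "Rule 1: heavy, sturdy items go first (bottom of the bag).",
        "Rule 2: fragile items go last (top of the bag).",
        "Items:",
    ]
    lines += [f"- {i.name} ({i.weight_g} g, fragile={i.fragile})" for i in items]
    lines.append("Return the packing order, sturdiest and heaviest first.")
    return "\n".join(lines)

def mock_llm_pack(items):
    # Stand-in for the foundation-model call: non-fragile items first,
    # heaviest to lightest, fragile items last -- the ordering a human
    # packer would typically produce.
    return sorted(items, key=lambda i: (i.fragile, -i.weight_g))

cart = [
    Item("eggs", 600, True),
    Item("milk carton", 1000, False),
    Item("bread", 400, True),
    Item("canned beans", 800, False),
]
order = [i.name for i in mock_llm_pack(cart)]
print(order)  # ['milk carton', 'canned beans', 'eggs', 'bread']
```

Because the model call is isolated behind a single function, swapping in an upgraded foundation model only changes that one component, which mirrors the modularity the bullet above describes.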
Yannik Blei
Department of Computer Science & Artificial Intelligence, University of Technology Nuremberg, Germany
Michael Krawez
Department of Computer Science & Artificial Intelligence, University of Technology Nuremberg, Germany
Tobias Julg
Department of Computer Science & Artificial Intelligence, University of Technology Nuremberg, Germany
Pierre Krack
Department of Computer Science & Artificial Intelligence, University of Technology Nuremberg, Germany
Florian Walter
University of Technology Nuremberg, Machine Intelligence Lab
Machine Intelligence, Robotics, Machine Learning, AI, Cognitive Robotics
Wolfram Burgard
Professor of Computer Science, University of Technology Nuremberg
Robotics, Artificial Intelligence, AI, Machine Learning, Computer Vision