A Factory-Floor Deployment Case Study of VLA Pipelines for Industrial Packaging Task: Workflow, Failures, and Lessons

📅 2026-05-25
🤖 AI Summary
This work addresses the limited reliability and poor adaptability of vision-language-action (VLA) policies when deployed in industrial packaging tasks. To bridge this gap, we implement a deployment-driven, closed-loop optimization workflow on a real Siemens production line, building on the pretrained Pi0.5 model. Our approach integrates on-site data collection and cleaning, targeted acquisition of failure-recovery episodes, and iterative fine-tuning with rigorous evaluation. This pipeline enables the first systematic documentation and analysis of VLA failure modes in authentic factory settings. Over the course of the project, we collected 2,535 episodes (approximately 10 hours of interaction); fine-tuning on this data substantially improved policy robustness and success rates in critical operations such as grasping transparent objects and performing precise placement.
📝 Abstract
Vision-Language-Action (VLA) policies have shown promising manipulation capabilities, yet their practical impact is often limited by the reliability demands of real-world deployment. We present a deployment study of an industrial packaging task at Siemens Factory (GWE, Erlangen, Germany), where a robot must pick a transparent accessory bag from a cluttered pile, insert it into the remaining cavity of a cardboard package, and ensure that the bag and its contents remain below the closing plane. Our goal is to understand the practical effort required to adapt a pretrained Pi0.5 policy to a single factory-floor task through iterative fine-tuning and deployment-driven refinement. The pipeline consists of repeated loops of data collection, curation, fine-tuning, evaluation, and targeted recovery data collection. We accumulated 2,535 episodes (about 10 hours) on site at the factory. In this paper, we contribute an empirical account of a factory-floor VLA deployment, highlighting recurring failure modes and lessons for improving the deployment workflow.
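The loop named in the abstract (data collection, curation, fine-tuning, evaluation, targeted recovery data collection) can be pictured with a minimal Python sketch. This is only an illustration of the control flow, not the authors' implementation and not the Pi0.5 API: every function name, the `target` success threshold, and the `max_rounds` stopping rule are hypothetical assumptions, and each stage is passed in as a plain callable.

```python
from typing import Any, Callable, List, Tuple


def run_deployment_loop(
    policy: Any,
    collect: Callable[[Any], list],                 # hypothetical: gather on-site episodes
    curate: Callable[[list], list],                 # hypothetical: drop unusable episodes
    fine_tune: Callable[[Any, list], Any],          # hypothetical: returns an updated policy
    evaluate: Callable[[Any], Tuple[float, list]],  # hypothetical: (success rate, failures)
    collect_recovery: Callable[[Any, list], list],  # hypothetical: episodes targeting failures
    target: float = 0.9,                            # assumed stopping threshold
    max_rounds: int = 5,                            # assumed round budget
) -> Tuple[Any, list, List[float]]:
    """Repeat collect -> curate -> fine-tune -> evaluate -> recovery collection."""
    dataset: list = []
    history: List[float] = []
    for _ in range(max_rounds):
        # 1. Collect new episodes with the current policy and keep the curated ones.
        dataset += curate(collect(policy))
        # 2. Fine-tune on everything accumulated so far.
        policy = fine_tune(policy, dataset)
        # 3. Evaluate and record the success rate for this round.
        rate, failures = evaluate(policy)
        history.append(rate)
        if rate >= target:
            break
        # 4. Collect recovery episodes aimed at the observed failures.
        dataset += curate(collect_recovery(policy, failures))
    return policy, dataset, history
```

Passing the stages in as callables keeps the sketch independent of any particular robot stack or training framework; in a real deployment each one would wrap substantial tooling and human effort.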
Problem

Research questions and friction points this paper is trying to address.

Vision-Language-Action
industrial deployment
robotic manipulation
factory-floor automation
reliability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision-Language-Action (VLA)
factory-floor deployment
iterative fine-tuning
failure mode analysis
industrial robotics