LMPVC and Policy Bank: Adaptive voice control for industrial robots with code generating LLMs and reusable Pythonic policies

📅 2025-06-27
🤖 AI Summary
To support natural-language interaction and rapid task programming in human-robot collaboration for complex manufacturing, this paper proposes LMPVC, a large language model (LLM)-based voice control architecture for ROS2-compatible robots. LMPVC combines automatic speech recognition and LLM code generation with a Policy Bank, a repository of reusable Pythonic policies that can be programmed or taught, which lets the system adapt to different downstream tasks without retraining the LLM. The authors find that the Policy Bank can compensate for limitations of the underlying LLM. The key contributions are: (1) a prototype voice control architecture that extends prior code-generation approaches with an integrated policy programming and teaching system; and (2) an open-source release of the architecture and additional results on GitHub.

📝 Abstract
Modern industry is increasingly moving away from mass manufacturing, towards more specialized and personalized products. As manufacturing tasks become more complex, full automation is not always an option, and human involvement may be required. This has increased the need for advanced human-robot collaboration (HRC), and with it, improved methods for interaction, such as voice control. Recent advances in natural language processing, driven by artificial intelligence (AI), have the potential to answer this demand. Large language models (LLMs) have rapidly developed impressive general reasoning capabilities, and many methods of applying these capabilities to robotics have been proposed, including through the use of code generation. This paper presents Language Model Program Voice Control (LMPVC), an LLM-based prototype voice control architecture with integrated policy programming and teaching capabilities, built for use with Robot Operating System 2 (ROS2) compatible robots. The architecture builds on prior work using code generation for voice control by implementing an additional programming and teaching system, the Policy Bank. We find that this system can compensate for the limitations of the underlying LLM and allow LMPVC to adapt to different downstream tasks without a slow and costly training process. The architecture and additional results are released on GitHub (https://github.com/ozzyuni/LMPVC).
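To make the Policy Bank idea concrete, here is a minimal Python sketch of how taught policies might be stored as source code and exposed to LLM-generated code. It is an illustration under stated assumptions, not the LMPVC implementation: the `PolicyBank` class, its `teach`, `hints`, and `run` methods, and the `move_to` and `close_gripper` primitives are all hypothetical names, and the LLM output is replaced by a hard-coded string.

```python
from typing import Callable, Dict, List


class PolicyBank:
    """Hypothetical store of named, reusable policies kept as Python source."""

    def __init__(self) -> None:
        self._sources: Dict[str, str] = {}

    def teach(self, name: str, source: str) -> None:
        """Save the source of a function called `name` for later reuse."""
        compile(source, f"<policy:{name}>", "exec")  # reject syntax errors early
        self._sources[name] = source

    def hints(self) -> str:
        """List learned policy names as comments that could be added to a prompt."""
        return "\n".join(
            f"# available policy: {name}()" for name in sorted(self._sources)
        )

    def run(self, generated_code: str, primitives: Dict[str, Callable]) -> None:
        """Execute generated code with primitives and learned policies in scope."""
        scope = dict(primitives)
        for source in self._sources.values():
            exec(source, scope)  # defines each learned policy inside the scope
        exec(generated_code, scope)


# Assumed robot primitives; here they only record what would have been called.
log: List[str] = []
primitives = {
    "move_to": lambda target: log.append(f"move_to({target})"),
    "close_gripper": lambda: log.append("close_gripper()"),
}

bank = PolicyBank()
bank.teach("pick", "def pick(target):\n    move_to(target)\n    close_gripper()\n")

# Stand-in for code an LLM might generate from the spoken command "pick up the bolt".
bank.run("pick('bolt')", primitives)
print(log)
```

Because a taught policy is ordinary Python, it can be reused by later generated code without changing the model, which is the adaptation-without-training property the abstract describes. Note that `exec` runs arbitrary code, so a real system would need to validate or sandbox generated programs before running them on a robot.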
Problem

Research questions and friction points this paper is trying to address.

Adaptive voice control for industrial robots using LLMs
Enhancing human-robot collaboration with reusable Pythonic policies
Overcoming LLM limitations via Policy Bank for task adaptability
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based voice control for ROS2 robots
Policy Bank for reusable Pythonic policies
Code generation for task adaptation without retraining
Ossi Parikka
Cognitive Robotics group, Unit of Automation Technology and Mechanical Engineering, Tampere University, 33720, Tampere, Finland
Roel Pieters
Professor in Cognitive Robotics, Tampere University
Robotics, human-robot interaction, cognition