Look Before You Leap: Using Serialized State Machine for Language Conditioned Robotic Manipulation

📅 2025-03-07
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing imitation learning approaches for language-guided long-horizon robotic manipulation suffer from insufficient demonstration coverage, leading to action failures and error propagation. To address this, we propose a dynamic demonstration generation framework grounded in a serialized finite-state machine (FSM). Our method integrates language model grounding, FSM-based task modeling, and real-time environmental state feedback to enable proactive action planning and on-the-fly error suppression. The core innovation lies in explicitly encoding task logic as an evolvable and formally verifiable serialized FSM, which jointly orchestrates demonstration generation and policy fine-tuning. Evaluated on long-horizon environment-evolving puzzle tasks, our approach achieves a 98% success rate—substantially outperforming the best baseline (60%) and most alternatives (near 0%). To our knowledge, this is the first method to achieve highly robust, scalable mapping from natural language instructions to extended action sequences in robotic manipulation.

Technology Category

Application Category

📝 Abstract
Imitation learning frameworks for robotic manipulation have drawn attention in the recent development of language model grounded robotics. However, the success of the frameworks largely depends on the coverage of the demonstration cases: When the demonstration set does not include examples of how to act in all possible situations, the action may fail and can result in cascading errors. To solve this problem, we propose a framework that uses serialized Finite State Machine (FSM) to generate demonstrations and improve the success rate in manipulation tasks requiring a long sequence of precise interactions. To validate its effectiveness, we use environmentally evolving and long-horizon puzzles that require long sequential actions. Experimental results show that our approach achieves a success rate of up to 98 in these tasks, compared to the controlled condition using existing approaches, which only had a success rate of up to 60, and, in some tasks, almost failed completely.
Problem

Research questions and friction points this paper is trying to address.

Improves robotic manipulation success rates
Addresses limitations in imitation learning frameworks
Uses serialized Finite State Machine for precise interactions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Serialized Finite State Machine for precise robotic actions
Enhanced success rate in long sequential manipulation tasks
Validated with environmentally evolving, long-horizon puzzles
T
Tong Mu
Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA; The Institute for Integrative and Innovative Research, University of Arkansas, Fayetteville, AR 72701, USA
Y
Yihao Liu
Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA; The Institute for Integrative and Innovative Research, University of Arkansas, Fayetteville, AR 72701, USA
Mehran Armand
Mehran Armand
Professor, Mechanical Engineering, I3R, University of Arkansas
RoboticsMedical RoboticsImage-Guided Interventionsbiomechanics