LATMOS: Latent Automaton Task Model from Observation Sequences

📅 2025-03-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: In service robot task planning, the lack of coordination among task decomposition, state perception, and execution verification remains a critical challenge. Method: This paper proposes an observation-driven framework for learning implicit finite-state machines (FSMs), integrating automata theory with latent-space representation learning. A deep multimodal encoder (combining CNNs and Transformers) jointly models images, videos, language, and robot states to enable symbolic latent-space modeling and automatic FSM induction without requiring labeled state transitions. Contribution/Results: The framework unifies task-structure discovery and execution assurance, yielding interpretable and formally verifiable implicit FSMs. Evaluated on logical tasks, human behavior videos, and real-world robot deployments, it achieves higher planning success rates and verifiability than prior end-to-end and rule-based approaches, and generalizes across these domains.

📝 Abstract
Robot task planning from high-level instructions is an important step towards deploying fully autonomous robot systems in the service sector. Three key aspects of robot task planning present challenges yet to be resolved simultaneously, namely, (i) factorization of complex task specifications into simpler executable subtasks, (ii) understanding of the current task state from raw observations, and (iii) planning and verification of task executions. To address these challenges, we propose LATMOS, an automata-inspired task model that, given observations from correct task executions, is able to factorize the task while supporting verification and planning operations. LATMOS combines an observation encoder, which extracts features from potentially high-dimensional observations, with automata theory to learn a sequential model that encapsulates an automaton with symbols in the latent feature space. We conduct extensive evaluations in three task model learning setups: (i) abstract tasks described by logical formulas, (ii) real-world human tasks described by videos and natural language prompts, and (iii) a robot task described by image and state observations. The results demonstrate the improved plan generation and verification capabilities of LATMOS across observation modalities and tasks.
Problem

Research questions and friction points this paper is trying to address.

Factorize complex tasks into simpler subtasks
Understand task state from raw observations
Plan and verify task executions effectively
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automata-inspired task model for robot planning
Observation encoder extracts high-dimensional features
Supports task factorization, verification, and planning
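
To make the three points above concrete, here is a minimal sketch of an automaton built from correct executions that supports verification and planning. It is not the LATMOS model: LATMOS learns a sequential model with symbols in a latent feature space, whereas this sketch assumes the symbols are already discrete and swaps in a plain prefix-tree acceptor. The names `to_symbol`, `build_automaton`, `verify`, and `plan`, as well as the example task symbols, are hypothetical and chosen only for illustration.

```python
from collections import deque


def to_symbol(feature, codebook):
    """Stand-in for an observation encoder (an assumption, not the paper's):
    map a feature vector to the index of the nearest codebook entry."""
    def sq_dist(i):
        return sum((f - c) ** 2 for f, c in zip(feature, codebook[i]))
    return min(range(len(codebook)), key=sq_dist)


def build_automaton(demos):
    """Build a prefix-tree acceptor from symbol sequences of correct executions."""
    transitions = {}  # (state, symbol) -> next state
    accepting = set()
    next_id = 1       # state 0 is the initial state
    for seq in demos:
        state = 0
        for sym in seq:
            key = (state, sym)
            if key not in transitions:
                transitions[key] = next_id
                next_id += 1
            state = transitions[key]
        accepting.add(state)  # the end of a correct execution is accepting
    return transitions, accepting


def run(transitions, seq, state=0):
    """Follow seq from state; return the reached state or None if it is rejected."""
    for sym in seq:
        state = transitions.get((state, sym))
        if state is None:
            return None
    return state


def verify(transitions, accepting, seq):
    """Verification: does seq drive the automaton to an accepting state?"""
    return run(transitions, seq) in accepting


def plan(transitions, accepting, prefix=()):
    """Planning: shortest symbol sequence that completes prefix (breadth-first search)."""
    start = run(transitions, prefix)
    if start is None:
        return None
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state in accepting:
            return path
        for (src, sym), dst in transitions.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [sym]))
    return None


# Two hypothetical correct executions of the same task.
demos = [["grasp", "lift", "place"], ["grasp", "lift", "pour", "place"]]
transitions, accepting = build_automaton(demos)
```

With these demonstrations, `verify(transitions, accepting, ["grasp", "lift", "place"])` returns `True`, an execution that skips `"lift"` is rejected, and `plan(transitions, accepting, prefix=["grasp"])` returns `["lift", "place"]`, the shortest completion. Each state of the automaton corresponds to a stage of the task, which is one simple reading of task factorization; a learned model would generalize beyond the exact demonstrated sequences, which this acceptor cannot do.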
Qiyue Dong
Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA 92093, USA
Eduardo Sebastián
University of Cambridge
Robotics, Networked Systems, Control, Learning
Nikolay Atanasov
Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA 92093, USA