🤖 AI Summary
This work addresses the misalignment between high-level task logic and low-level action timing in dual-arm robotic manipulation by proposing a joint learning approach that integrates symbolic and subsymbolic temporal constraints. Leveraging human demonstrations, the method constructs executable action plans that unify task structure with precise timing: it introduces a three-dimensional action timing representation, employs the DPLL algorithm to infer consistent Allen interval-based symbolic temporal relations, and models subsymbolic timing distributions using multivariate Gaussian mixture models. These components are integrated within a unified optimization framework to generate feasible execution plans. Experimental results demonstrate that the resulting temporally parametrized plans significantly outperform typical demonstration-based baselines and exhibit closer alignment with human-provided demonstrations.
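The summary names DPLL as the engine for finding contradiction-free symbolic assignments. As a minimal, hedged sketch of the core DPLL idea only (unit propagation plus branching on a CNF formula), not the authors' Allen-specific encoding or ranking, one could write:

```python
def dpll(clauses, assignment=None):
    """Minimal DPLL SAT search. Clauses are lists of nonzero ints,
    where +v means variable v is true and -v means it is false.
    Returns a satisfying {var: bool} assignment, or None if UNSAT."""
    if assignment is None:
        assignment = {}
    # Simplify every clause under the current partial assignment.
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(l)) == (l > 0) for l in clause):
            continue  # clause already satisfied
        rest = [l for l in clause if abs(l) not in assignment]
        if not rest:
            return None  # clause falsified -> conflict
        simplified.append(rest)
    if not simplified:
        return assignment  # all clauses satisfied
    # Unit propagation: a unit clause forces its literal's value.
    for clause in simplified:
        if len(clause) == 1:
            lit = clause[0]
            return dpll(clauses, {**assignment, abs(lit): lit > 0})
    # Branch on the first unassigned variable.
    var = abs(simplified[0][0])
    for value in (True, False):
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None

# Toy encoding (hypothetical): variable 1 = "pair is 'before'",
# variable 2 = "pair is 'overlaps'"; exactly one holds, mode picks 'before'.
cnf = [[1, 2], [-1, -2], [1]]
print(dpll(cnf))  # {1: True, 2: False}
```

In the paper's setting, each variable would stand for assigning a particular Allen relation to an action pair, with clauses ruling out mutually contradictory combinations; the variables and clauses above are illustrative only.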
📝 Abstract
Temporal task structure is fundamental for bimanual manipulation: a robot must not only know that one action precedes or overlaps another, but also when each action should occur and how long it should take. While symbolic temporal relations enable high-level reasoning about task structure and alternative execution sequences, concrete timing parameters are equally essential for coordinating two hands at the execution level. Existing approaches address these two levels in isolation, leaving a gap between high-level task planning and low-level movement synchronization. This work presents an approach for learning both symbolic and subsymbolic temporal task constraints from human demonstrations and deriving executable, temporally parametrized plans for bimanual manipulation. Our contributions are (i) a 3-dimensional representation of timings between two actions with methods based on multivariate Gaussian Mixture Models to represent temporal relationships between actions on a subsymbolic level, (ii) a method based on the Davis-Putnam-Logemann-Loveland (DPLL) algorithm that finds and ranks all contradiction-free assignments of Allen relations to action pairs, representing different modes of a task, and (iii) an optimization-based planning system that combines the identified symbolic and subsymbolic temporal task constraints to derive temporally parametrized plans for robot execution. We evaluate our approach on several datasets, demonstrating that our method generates temporally parametrized plans closer to human demonstrations than the most characteristic demonstration baseline.
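To make the symbolic layer concrete, the sketch below classifies the Allen relation between two timed actions from their start and end times, and builds a 3-dimensional timing vector for the pair. The particular feature choice (start offset, end offset, duration difference) is an assumption for illustration; the abstract does not specify the three dimensions here, and this is not the authors' implementation.

```python
# Hedged sketch: Allen relation of interval a w.r.t. interval b,
# each given as a (start, end) pair with start < end.
def allen_relation(a, b):
    (s1, e1), (s2, e2) = a, b
    if e1 < s2: return "before"
    if e2 < s1: return "after"
    if e1 == s2: return "meets"
    if e2 == s1: return "met-by"
    if s1 == s2 and e1 == e2: return "equals"
    if s1 == s2: return "starts" if e1 < e2 else "started-by"
    if e1 == e2: return "finishes" if s1 > s2 else "finished-by"
    if s1 > s2 and e1 < e2: return "during"
    if s1 < s2 and e1 > e2: return "contains"
    return "overlaps" if s1 < s2 else "overlapped-by"

def timing_vector(a, b):
    """Hypothetical 3-D subsymbolic timing features for an action pair:
    start offset, end offset, duration difference (an assumed choice)."""
    (s1, e1), (s2, e2) = a, b
    return (s2 - s1, e2 - e1, (e2 - s2) - (e1 - s1))

# Example: a left-hand 'hold' overlapping a right-hand 'pour'.
hold, pour = (0.0, 4.0), (1.5, 6.0)
print(allen_relation(hold, pour))  # overlaps
print(timing_vector(hold, pour))   # (1.5, 2.0, 0.5)
```

Collecting such timing vectors across many demonstrations of the same action pair would yield the samples over which a multivariate Gaussian mixture model, as described in the abstract, can be fit; the GMM step itself is omitted here.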