ATLAS: An Annotation Tool for Long-horizon Robotic Action Segmentation

📅 2026-04-29
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing annotation tools struggle to delineate action boundaries efficiently and accurately in long-horizon robotic manipulation tasks, and they lack native support for synchronizing multimodal data such as video and proprioceptive signals. To address this, the work proposes ATLAS, a multimodal, time-synchronized annotation tool tailored for robotic action segmentation. ATLAS introduces a modular dataset abstraction layer that natively supports widely used formats, including ROS bags and RLDS, and provides temporally aligned visualization of multi-view video streams and proprioceptive signals. Its keyboard-centric interface is designed to minimize annotation effort. In experiments on a contact-rich assembly task, ATLAS reduced the average per-action annotation time by at least 6% compared to ELAN. Including time-series data improved temporal alignment with expert annotations by more than 2.8% and decreased boundary error fivefold compared to vision-only annotation tools.
📝 Abstract
Annotating long-horizon robotic demonstrations with precise temporal action boundaries is crucial for training and evaluating action segmentation and manipulation policy learning methods. Existing annotation tools, however, are often limited: they are designed primarily for vision-only data, do not natively support synchronized visualization of robot-specific time-series signals (e.g., gripper state or force/torque), or require substantial effort to adapt to different dataset formats. In this paper, we introduce ATLAS, an annotation tool tailored for long-horizon robotic action segmentation. ATLAS provides time-synchronized visualization of multi-modal robotic data, including multi-view video and proprioceptive signals, and supports annotation of action boundaries, action labels, and task outcomes. The tool natively handles widely used robotics dataset formats such as ROS bags and the Reinforcement Learning Dataset (RLDS) format, and provides direct support for specific datasets such as REASSEMBLE. ATLAS can be easily extended to new formats via a modular dataset abstraction layer. Its keyboard-centric interface minimizes annotation effort and improves efficiency. In experiments on a contact-rich assembly task, ATLAS reduced the average per-action annotation time by at least 6% compared to ELAN, while the inclusion of time-series data improved temporal alignment with expert annotations by more than 2.8% and decreased boundary error fivefold compared to vision-only annotation tools.
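The abstract does not describe how the dataset abstraction layer or the time synchronization is implemented, so the following is only a minimal sketch of one way such a design could look, not the ATLAS code. All names here (`Stream`, `DemonstrationSource`, `InMemorySource`, `nearest_index`, `synchronized_view`) are hypothetical, and nearest-timestamp lookup is an assumed alignment strategy. The idea is that each dataset format gets an adapter exposing timestamped streams, and the viewer asks every stream for the sample closest to the current playback time.

```python
from abc import ABC, abstractmethod
from bisect import bisect_left
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Stream:
    """One modality: sorted, non-empty timestamps (seconds), one sample each."""
    timestamps: List[float]
    samples: List[Any]


class DemonstrationSource(ABC):
    """Hypothetical adapter interface: one subclass per dataset format."""

    @abstractmethod
    def streams(self) -> Dict[str, Stream]:
        """Return all modalities of one demonstration, keyed by name."""


class InMemorySource(DemonstrationSource):
    """Toy adapter standing in for a real format reader (assumption)."""

    def __init__(self, data: Dict[str, Stream]) -> None:
        self._data = data

    def streams(self) -> Dict[str, Stream]:
        return self._data


def nearest_index(timestamps: List[float], t: float) -> int:
    """Index of the timestamp closest to t (ties go to the earlier sample)."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbour of the insertion point is closer to t.
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1


def synchronized_view(source: DemonstrationSource, t: float) -> Dict[str, Any]:
    """For playback time t, return the closest sample from every stream."""
    return {
        name: stream.samples[nearest_index(stream.timestamps, t)]
        for name, stream in source.streams().items()
    }
```

With this shape, supporting a new format would mean writing one more `DemonstrationSource` subclass, while the lookup and display code stays unchanged. A video stream sampled at 10 Hz and a gripper signal sampled at 20 Hz can then be queried at the same playback time even though their timestamps do not coincide.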
Problem

Research questions and friction points this paper is trying to address.

long-horizon robotic action segmentation
temporal action boundaries
multi-modal robotic data
annotation tool
time-synchronized visualization
Innovation

Methods, ideas, or system contributions that make the work stand out.

action segmentation
robotic annotation tool
multi-modal data visualization
time-series signals
dataset interoperability