ILeSiA: Interactive Learning of Situational Awareness from Camera Input

📅 2024-09-30
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Identifying safety risks during robotic skill execution in previously unseen environments remains challenging. Method: This paper proposes a situation-aware online risk-learning framework that integrates imitation learning with sparse human annotations (safe/risky). A CNN- or ViT-based encoder extracts low-dimensional latent representations from camera images, and a lightweight binary classifier is trained on these encodings for real-time risk discrimination. The framework unifies sparse interactive annotation, latent-space modeling, and an incremental label-augmentation mechanism. Results: Experiments demonstrate accurate, real-time risk detection across multiple tasks from only a small number of annotations, improving the safety of robotic skill execution and generalization to diverse, unseen scenarios.
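The sparse interactive annotation and incremental label augmentation described above can be sketched as a minimal loop (illustrative only: the class name, the 2-D latent dimension, and the nearest-labeled-neighbor rule are assumptions for this sketch, not the paper's actual classifier):

```python
import numpy as np

class IncrementalRiskClassifier:
    """Toy stand-in for a lightweight risk classifier over latent features:
    it stores sparsely labeled latent vectors and is updated whenever the
    human supervisor adds a new safe/risky annotation."""

    def __init__(self, latent_dim=2):
        self.Z = np.empty((0, latent_dim))  # annotated latent vectors
        self.y = np.empty((0,))             # labels: 0.0 = safe, 1.0 = risky

    def add_label(self, z, label):
        # Sparse interactive annotation: one labeled frame at a time.
        self.Z = np.vstack([self.Z, z])
        self.y = np.append(self.y, label)

    def predict(self, z):
        # Risk = label of the nearest annotated latent point.
        d = np.linalg.norm(self.Z - z, axis=1)
        return self.y[np.argmin(d)]

clf = IncrementalRiskClassifier()
clf.add_label(np.array([[0.0, 0.0]]), 0.0)  # supervisor marks a frame safe
clf.add_label(np.array([[1.0, 1.0]]), 1.0)  # supervisor marks a frame risky
risk = clf.predict(np.array([0.1, 0.1]))    # query near the safe example
```

Because risk cases are defined purely by labeled data, adding a newly identified risk is just another `add_label` call, with no architectural change.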

📝 Abstract
Learning from demonstration is a promising way of teaching robots new skills. However, a central problem when executing acquired skills is to recognize risks and failures. This is essential since the demonstrations usually cover only a few mostly successful cases. Inevitable errors during execution require specific reactions that were not apparent in the demonstrations. In this paper, we focus on teaching the robot situational awareness from an initial skill demonstration via kinesthetic teaching and sparse labeling of autonomous skill executions as safe or risky. At runtime, our system, called ILeSiA, detects risks based on the perceived camera images by encoding the images into a low-dimensional latent space representation and training a classifier based on the encoding and the provided labels. In this way, ILeSiA boosts the confidence and safety with which robotic skills can be executed. Our experiments demonstrate that classifiers, trained with only a small amount of user-provided data, can successfully detect numerous risks. The system is flexible because the risk cases are defined by labeling data. This also means that labels can be added as soon as risks are identified by a human supervisor. We provide all code and data required to reproduce our experiments at imitrob.ciirc.cvut.cz/publications/ilesia.
Problem

Research questions and friction points this paper is trying to address.

Teaching robots situational awareness from camera input
Recognizing faults and preventing failures during task execution
Enabling rapid deployment of cobots with vision-based risk assessment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian Process regression on latent image features
Continuous risk scoring from zero to one
Interactive learning with human-labeled frame data
Petr Vanc
Czech Institute of Informatics, Robotics, and Cybernetics, Czech Technical University in Prague, Czech Republic
Giovanni Franzese
Researcher on Robot Manipulation at the Technology Innovation Institute
Bimanual manipulation · Interactive Imitation Learning · Policy generalization
J. Behrens
Czech Institute of Informatics, Robotics, and Cybernetics, Czech Technical University in Prague, Czech Republic
C. D. Santina
Cognitive Robotics, Delft University of Technology, The Netherlands
Karla Stepanova
Czech Institute of Informatics, Robotics, and Cybernetics, Czech Technical University in Prague, Czech Republic
Jens Kober
Associate Professor, CoR, TU Delft
Robotics · Machine Learning