ARMADA: Autonomous Online Failure Detection and Human Shared Control Empower Scalable Real-world Deployment and Adaptation

📅 2025-10-02
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Imitation learning scales poorly in real-world deployment because in-domain data is scarce, human demonstrations vary in quality, and existing human-in-the-loop systems usually require full-time human supervision during policy rollout. To address these challenges, this paper introduces ARMADA, a multi-robot deployment and adaptation system that combines human-in-the-loop shared control, parallel policy rollout, and policy post-training with FLOAT, an autonomous online failure detection method (nearly 95% average accuracy, over 20% above prior state-of-the-art detectors). Because FLOAT flags failures automatically, ARMADA requests human intervention only when necessary, so failure cases drive in-domain data collection and policy refinement across multiple rounds. Evaluated on four real-world tasks, ARMADA achieves a more than 4× increase in success rate and a more than 2× reduction in human intervention rate compared with previous human-in-the-loop learning methods.

📝 Abstract
Imitation learning has shown promise in learning from large-scale real-world datasets. However, pretrained policies usually perform poorly without sufficient in-domain data. Besides, human-collected demonstrations entail substantial labour and tend to encompass mixed-quality data and redundant information. As a workaround, human-in-the-loop systems gather domain-specific data for policy post-training, and exploit closed-loop policy feedback to offer informative guidance, but usually require full-time human surveillance during policy rollout. In this work, we devise ARMADA, a multi-robot deployment and adaptation system with human-in-the-loop shared control, featuring an autonomous online failure detection method named FLOAT. Thanks to FLOAT, ARMADA enables paralleled policy rollout and requests human intervention only when necessary, significantly reducing reliance on human supervision. Hence, ARMADA enables efficient acquisition of in-domain data, and leads to more scalable deployment and faster adaptation to new scenarios. We evaluate the performance of ARMADA on four real-world tasks. FLOAT achieves nearly 95% accuracy on average, surpassing prior state-of-the-art failure detection approaches by over 20%. Besides, ARMADA manifests more than 4$\times$ increase in success rate and greater than 2$\times$ reduction in human intervention rate over multiple rounds of policy rollout and post-training, compared to previous human-in-the-loop learning methods.
Problem

Research questions and friction points this paper is trying to address.

Detects policy failures autonomously during robot deployment
Reduces human supervision needs through shared control
Enables scalable adaptation to new environments efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

Autonomous online failure detection method FLOAT
Human shared control for scalable deployment
Parallel policy rollout with selective human intervention
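
The idea of parallel rollout with selective intervention can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `RobotState`, `needs_help`, `rollout_round`, and `run` are hypothetical, and the fixed-threshold rule on a per-step anomaly score is only a stand-in, since this page does not describe how FLOAT actually detects failures.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RobotState:
    """Hypothetical per-robot record; not a structure from the paper."""
    robot_id: int
    anomaly_scores: List[float]  # assumed precomputed score per rollout step
    step: int = 0
    interventions: int = 0


def needs_help(score: float, threshold: float) -> bool:
    # Stand-in detector: flag a failure when the score exceeds a threshold.
    # FLOAT's real decision rule is not given in this summary.
    return score > threshold


def rollout_round(robots: List[RobotState], threshold: float) -> List[int]:
    """Advance every robot by one step; return ids that request a human."""
    flagged = []
    for robot in robots:
        if robot.step >= len(robot.anomaly_scores):
            continue  # this robot's episode has already ended
        score = robot.anomaly_scores[robot.step]
        robot.step += 1
        if needs_help(score, threshold):
            robot.interventions += 1
            flagged.append(robot.robot_id)
    return flagged


def run(robots: List[RobotState], threshold: float) -> List[List[int]]:
    """Run all robots in lockstep and log which ones were flagged each step."""
    horizon = max(len(r.anomaly_scores) for r in robots)
    return [rollout_round(robots, threshold) for _ in range(horizon)]


robots = [
    RobotState(0, [0.1, 0.2, 0.9]),
    RobotState(1, [0.1, 0.8, 0.1]),
]
log = run(robots, threshold=0.5)
```

In this toy run the operator is only called at the steps where a robot's score crosses the threshold, rather than watching both robots for the whole episode, which is the supervision saving the listing describes.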