ARMADA: Autonomous Online Failure Detection and Human Shared Control Empower Scalable Real-world Deployment and Adaptation

📅 2025-10-02

🤖 AI Summary

Imitation learning scales poorly in real-world deployment because in-domain data is scarce, human demonstrations vary in quality, and policy rollout typically requires continuous human supervision. To address these challenges, the paper introduces ARMADA, a multi-robot deployment and adaptation system that combines human-in-the-loop shared control, parallel policy rollouts, and closed-loop post-training with FLOAT, an autonomous online failure detection method. FLOAT reaches nearly 95% average accuracy, surpassing prior state-of-the-art failure detection approaches by over 20%, which lets ARMADA request human intervention only when necessary and use the resulting in-domain data for policy refinement. Evaluated on four real-world tasks over multiple rounds of rollout and post-training, ARMADA achieves a more than 4× increase in success rate and a more than 2× reduction in human intervention rate compared with previous human-in-the-loop learning methods.

📝 Abstract

Imitation learning has shown promise in learning from large-scale real-world datasets. However, pretrained policies usually perform poorly without sufficient in-domain data. Besides, human-collected demonstrations entail substantial labour and tend to encompass mixed-quality data and redundant information. As a workaround, human-in-the-loop systems gather domain-specific data for policy post-training, and exploit closed-loop policy feedback to offer informative guidance, but usually require full-time human surveillance during policy rollout. In this work, we devise ARMADA, a multi-robot deployment and adaptation system with human-in-the-loop shared control, featuring an autonomous online failure detection method named FLOAT. Thanks to FLOAT, ARMADA enables paralleled policy rollout and requests human intervention only when necessary, significantly reducing reliance on human supervision. Hence, ARMADA enables efficient acquisition of in-domain data, and leads to more scalable deployment and faster adaptation to new scenarios. We evaluate the performance of ARMADA on four real-world tasks. FLOAT achieves nearly 95% accuracy on average, surpassing prior state-of-the-art failure detection approaches by over 20%. Besides, ARMADA manifests more than 4$\times$ increase in success rate and greater than 2$\times$ reduction in human intervention rate over multiple rounds of policy rollout and post-training, compared to previous human-in-the-loop learning methods.

Problem

Research questions and friction points this paper is trying to address.

Detects policy failures autonomously during robot deployment

Reduces human supervision needs through shared control

Enables scalable adaptation to new environments efficiently

Innovation

Methods, ideas, or system contributions that make the work stand out.

Autonomous online failure detection method FLOAT

Human shared control for scalable deployment

Parallel policy rollout with selective human intervention
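
The idea of parallel rollout with selective human intervention can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Robot` class, the `failure_score` placeholder (a random number, not the FLOAT method, whose details are not given here), the fixed `threshold`, and the single operator who serves one request per step. It shows only the control flow: robots run unattended, a detector flags suspected failures, and flagged robots wait in a queue for the operator.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class Robot:
    """Hypothetical stand-in for one robot executing a learned policy."""
    robot_id: int
    needs_help: bool = False


def failure_score(robot, rng):
    """Placeholder anomaly score in [0, 1). This is NOT the FLOAT detector."""
    return rng.random()


def run_round(num_robots, num_steps, threshold, seed=0):
    """Simulate one rollout round; return the (step, robot_id) interventions."""
    rng = random.Random(seed)
    robots = [Robot(i) for i in range(num_robots)]
    help_queue = deque()   # robots waiting for the single operator
    interventions = []     # log of when the operator stepped in

    for step in range(num_steps):
        for robot in robots:
            if robot.needs_help:
                continue  # paused until the operator takes over
            if failure_score(robot, rng) > threshold:
                robot.needs_help = True
                help_queue.append(robot.robot_id)

        # Assumption: the operator resolves at most one request per step.
        if help_queue:
            robot_id = help_queue.popleft()
            interventions.append((step, robot_id))
            # A real system would record the operator's corrective actions
            # here as in-domain data for post-training.
            robots[robot_id].needs_help = False

    return interventions


if __name__ == "__main__":
    log = run_round(num_robots=4, num_steps=50, threshold=0.95)
    print(f"{len(log)} interventions over 50 steps with 4 robots")
```

In this toy setup the operator is idle whenever the queue is empty, which is the property that allows one person to supervise several robots at once; how well that works in practice depends on the accuracy of the failure detector.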
