🤖 AI Summary
Embodied agents frequently fail during complex tasks, which motivates effective failure recovery mechanisms. This paper proposes a conditional multistage failure recovery framework (abbreviated CMRF in this summary) built around a four-stage error-handling pipeline: three stages operate during task execution and one serves as a post-execution reflection phase. CMRF uses zero-shot chain prompting so that a large language model (LLM) can analyse each execution failure in its environmental context and devise a recovery strategy. Crucially, it requires no fine-tuning or additional training. Evaluated on the TfD benchmark of the TEACH dataset, CMRF outperforms a baseline without error recovery by 11.5% and the strongest existing model by 19%, setting a new state of the art.
📝 Abstract
Embodied agents performing complex tasks are susceptible to execution failures, motivating the need for effective failure recovery mechanisms. In this work, we introduce a conditional multistage failure recovery framework that employs zero-shot chain prompting. The framework is structured into four error-handling stages, with three operating during task execution and one functioning as a post-execution reflection phase. Our approach utilises the reasoning capabilities of LLMs to analyse execution challenges within their environmental context and devise strategic solutions. We evaluate our method on the TfD benchmark of the TEACH dataset and achieve state-of-the-art performance, outperforming a baseline without error recovery by 11.5% and surpassing the strongest existing model by 19%.
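The control flow described in the abstract (three recovery stages during execution, one reflection stage afterwards) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract names neither the individual stages nor their interfaces, so `run_with_recovery`, the `Stage` signature, and the rule that a later stage runs only while the failure persists are hypothetical, and the LLM prompts are stood in for by plain callables.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical stage interface: given the failed action and the environment
# feedback, return a replacement action, or None if there is nothing to propose.
Stage = Callable[[str, str], Optional[str]]
# Hypothetical executor: runs an action and returns (success, feedback).
Executor = Callable[[str], Tuple[bool, str]]


@dataclass
class RecoveryResult:
    succeeded: List[str] = field(default_factory=list)
    unresolved: List[str] = field(default_factory=list)


def run_with_recovery(
    plan: List[str],
    execute: Executor,
    execution_stages: List[Stage],
    reflect: Stage,
) -> RecoveryResult:
    """Run a plan, trying recovery stages in order whenever an action fails."""
    result = RecoveryResult()
    for action in plan:
        current = action
        ok, feedback = execute(current)
        for stage in execution_stages:
            if ok:
                break  # later stages run only while the failure persists
            proposal = stage(current, feedback)
            if proposal is None:
                continue  # this stage has no fix; fall through to the next one
            current = proposal
            ok, feedback = execute(current)
        if ok:
            result.succeeded.append(current)
        else:
            result.unresolved.append(action)

    # Post-execution reflection: one more attempt at each unresolved action.
    for action in list(result.unresolved):
        proposal = reflect(action, "unresolved after execution")
        if proposal is not None and execute(proposal)[0]:
            result.unresolved.remove(action)
            result.succeeded.append(proposal)
    return result
```

In a real system each stage would presumably wrap a zero-shot LLM prompt that receives the environment feedback; the sketch only shows how conditional escalation and a final reflection pass could fit together.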