Addressing Ambiguity in Imitation Learning through Product of Experts based Negative Feedback

📅 2026-03-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In real-world settings such as household service robotics, human demonstrations are often suboptimal and ambiguous, which limits the performance of imitation learning. To address this, the work proposes a novel imitation learning framework that combines a Product of Experts model with negative feedback, so that a robot can learn ambiguous tasks from both positive and negative demonstration data, including its own failures and suboptimal human examples. Relative to a system that does not use negative feedback, the method improves task success rate by 90% in simulation and by 50% on a real robot. It also outperforms a comparable negative-feedback scheme in memory efficiency, training speed, and scalability.

📝 Abstract

Programming robots to perform complex tasks is often difficult and time-consuming, requiring expert knowledge and skills in robot software and sometimes hardware. Imitation learning is a method for training robots to perform tasks by leveraging human expertise through demonstrations. Typically, the assumption is that those demonstrations are performed by a single, highly competent expert. However, this is unlikely to be the case in many real-world applications, such as home robotics including assistive robots, that rely on user demonstrations or combine user data with pretrained data. This paper presents research towards a system which can leverage suboptimal demonstrations to solve ambiguous tasks and, in particular, learn from its own failures. This negative-feedback system achieves a significant improvement over purely positive imitation learning for ambiguous tasks: a 90% improvement in success rate over a system that does not utilise negative feedback, and a 50% improvement in success rate when utilised on a real robot. It also demonstrates higher efficacy, memory efficiency and time efficiency than a comparable negative-feedback scheme. The novel scheme presented in this paper is validated through simulated and real-robot experiments.

Problem

Research questions and friction points this paper is trying to address.

Imitation Learning

Ambiguity

Negative Feedback

Suboptimal Demonstrations

Robot Learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Imitation Learning

Negative Feedback

Product of Experts