GABRIL: Gaze-Based Regularization for Mitigating Causal Confusion in Imitation Learning

📅 2025-07-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Causal confusion, where models misinterpret spurious correlations as causal relationships, is prevalent in imitation learning and degrades generalization under distribution shift. GABRIL addresses this by using human expert gaze, recorded during demonstration collection, as a guidance signal for causally relevant features. A gaze-based regularization loss is added to representation learning, encouraging the model to attend to gaze-identified regions and mitigating the influence of confounding variables. Evaluated on Atari environments and the Bench2Drive benchmark in CARLA, GABRIL's improvement over behavior cloning exceeds that of other baselines by around 179% in Atari and 76% in CARLA. The method also yields more interpretable agents than standard imitation learning.

📝 Abstract
Imitation Learning (IL) is a widely adopted approach that enables agents to learn from human expert demonstrations by framing the task as a supervised learning problem. However, IL often suffers from causal confusion, where agents misinterpret spurious correlations as causal relationships, leading to poor performance in testing environments with distribution shift. To address this issue, we introduce GAze-Based Regularization in Imitation Learning (GABRIL), a novel method that leverages human gaze data gathered during the data collection phase to guide representation learning in IL. GABRIL utilizes a regularization loss that encourages the model to focus on causally relevant features identified through expert gaze, consequently mitigating the effects of confounding variables. We validate our approach in Atari environments and the Bench2Drive benchmark in CARLA by collecting human gaze datasets and applying our method in both domains. Experimental results show that GABRIL's improvement over behavior cloning is around 179% greater than that of other baselines in the Atari setup and 76% greater in the CARLA setup. Finally, we show that our method provides extra explainability compared to regular IL agents.
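The abstract describes a regularization loss that pushes the model toward features highlighted by expert gaze. A minimal numpy sketch of one plausible form of such a loss is below: a KL divergence between the expert gaze heatmap and a spatial attention map derived from encoder activations, added to the behavior-cloning cross-entropy. The exact loss used by GABRIL is not specified here, so this formulation and all names in it are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax2d(x):
    # Turn a (H, W) activation map into a spatial probability map
    e = np.exp(x - x.max())
    return e / e.sum()

def gaze_regularized_loss(logits, action, features, gaze_map, lam=0.1):
    """Behavior-cloning loss plus a hypothetical gaze-regularization term.

    logits:   (n_actions,) policy logits
    action:   expert action index
    features: (H, W) spatial activation magnitudes from the encoder
    gaze_map: (H, W) non-negative expert gaze heatmap
    lam:      weight of the regularization term (assumed hyperparameter)
    """
    # Standard behavior-cloning cross-entropy (numerically stable log-softmax)
    log_probs = logits - logits.max() - np.log(np.sum(np.exp(logits - logits.max())))
    bc_loss = -log_probs[action]

    # Normalize both maps to spatial probability distributions
    attn = softmax2d(features)
    gaze = gaze_map / (gaze_map.sum() + 1e-8)

    # KL(gaze || attention): penalizes attention mass placed outside gaze regions
    kl = np.sum(gaze * (np.log(gaze + 1e-8) - np.log(attn + 1e-8)))
    return bc_loss + lam * kl
```

When the model's attention already matches the gaze distribution, the KL term vanishes and the loss reduces to plain behavior cloning; attention concentrated away from gazed regions inflates the loss.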
Problem

Research questions and friction points this paper is trying to address.

IL agents suffer from causal confusion, misreading spurious correlations as causal relationships
Policies trained by behavior cloning degrade under distribution shift at test time
Standard IL agents offer little insight into which features drive their decisions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses human gaze collected during demonstrations as a regularization signal
Gaze-based loss steers representation learning toward causally relevant features
Improves both task performance and explainability over behavior cloning baselines
Amin Banayeeanzade
Graduate Research Assistant, University of Southern California
Artificial General Intelligence, Machine Learning, Continual Learning
Fatemeh Bahrani
Thomas Lord Department of Computer Science, University of Southern California, USA
Yutai Zhou
Thomas Lord Department of Computer Science, University of Southern California, USA
Erdem Bıyık
Assistant Professor, University of Southern California
Robotics, Human-Robot Interaction, Machine Learning, Artificial Intelligence