MInCo: Mitigating Information Conflicts in Distracted Visual Model-based Reinforcement Learning

📅 2025-04-05
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
In model-based reinforcement learning (MBRL) with visual inputs, task-irrelevant background dynamics induce an information conflict between representation learning and latent state modeling, resulting in representational redundancy and poor policy robustness. This work—first from an information-theoretic perspective—identifies the root cause of this conflict and proposes a negative-sample-free contrastive learning framework coupled with a temporal adaptive loss reweighting mechanism to jointly optimize representation invariance and dynamics modeling accuracy. Furthermore, it integrates an information bottleneck constraint within a unified training paradigm that jointly optimizes observation reconstruction and latent dynamics prediction. Evaluated on robotic control tasks under diverse dynamic background disturbances, the method substantially outperforms state-of-the-art approaches: it learns compact, background-noise-robust representations, improves policy stability by 23.6%, and enhances sample efficiency by 18.4%.

Technology Category

Application Category

📝 Abstract
Existing visual model-based reinforcement learning (MBRL) algorithms with observation reconstruction often suffer from information conflicts, making it difficult to learn compact representations and hence result in less robust policies, especially in the presence of task-irrelevant visual distractions. In this paper, we first reveal that the information conflicts in current visual MBRL algorithms stem from visual representation learning and latent dynamics modeling with an information-theoretic perspective. Based on this finding, we present a new algorithm to resolve information conflicts for visual MBRL, named MInCo, which mitigates information conflicts by leveraging negative-free contrastive learning, aiding in learning invariant representation and robust policies despite noisy observations. To prevent the dominance of visual representation learning, we introduce time-varying reweighting to bias the learning towards dynamics modeling as training proceeds. We evaluate our method on several robotic control tasks with dynamic background distractions. Our experiments demonstrate that MInCo learns invariant representations against background noise and consistently outperforms current state-of-the-art visual MBRL methods. Code is available at https://github.com/ShiguangSun/minco.
Problem

Research questions and friction points this paper is trying to address.

Resolve information conflicts in visual MBRL algorithms
Learn invariant representations against task-irrelevant distractions
Improve robustness of policies in noisy visual environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Negative-free contrastive learning for invariant representation
Time-varying reweighting to balance learning phases
Mitigates information conflicts in visual MBRL
🔎 Similar Papers
No similar papers found.
S
Shiguang Sun
National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University
Hanbo Zhang
Hanbo Zhang
National University of Singapore
Interactive ManipulationHuman-Robot Interaction
Z
Zeyang Liu
National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University
X
Xinrui Yang
National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University
Lipeng Wan
Lipeng Wan
Georgia State University
Scientific Data ManagementHPCData-Intensive ComputingStorage and I/OSystem Resilience
Bing Yan
Bing Yan
Rochester Institute of Technology
power system optimizationgrid integration of renewables (wind and solar)operation optimization o
X
Xingyu Chen
National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University
X
Xuguang Lan
National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University