🤖 AI Summary
To address state representation degradation in multi-view reinforcement learning (MVRL) caused by view redundancy, interference, and missing views, this paper proposes a robust multi-view representation learning framework. Methodologically, it introduces bisimulation metric learning into MVRL for the first time to align task-relevant semantic states across views; designs a multi-view masked modeling and latent-space joint reconstruction auxiliary task to explicitly enhance generalization under view occlusion or absence; and develops a contrastive representation fusion mechanism to suppress irrelevant feature interference. Evaluated on standard benchmarks as well as challenging variants with adversarial interference or stochastic view dropout, the proposed method consistently outperforms existing MVRL approaches. It yields more stable control policies and learns representations that are significantly more compact and task-discriminative.
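The bisimulation alignment described above can be sketched as a simple loss: latents whose states yield similar rewards and similar next-state latents are pulled to similar distances. The following is an illustrative, DBC-style NumPy sketch; the function name `bisimulation_loss`, the random-pairing scheme, and all tensor shapes are assumptions for illustration, not the paper's exact objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def bisimulation_loss(z, reward, z_next, gamma=0.99):
    """Illustrative bisimulation-style objective (not the paper's exact loss).

    Each latent z_i is paired with a shuffled batch partner z_j, and the
    latent L1 distance is regressed onto the bisimulation target
    |r_i - r_j| + gamma * ||z'_i - z'_j||_1.
    """
    perm = rng.permutation(len(z))
    z_dist = np.abs(z - z[perm]).sum(axis=-1)        # distance between current latents
    r_dist = np.abs(reward - reward[perm])           # reward difference
    next_dist = np.abs(z_next - z_next[perm]).sum(axis=-1)  # treated as a fixed target
    target = r_dist + gamma * next_dist
    return np.mean((z_dist - target) ** 2)
```

In a full training loop the target term would be computed with a stop-gradient (or a target encoder), so that only the online latent distance is pulled toward the bisimulation target.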
📄 Abstract
Multi-View Reinforcement Learning (MVRL) seeks to provide agents with multi-view observations, enabling them to perceive the environment more effectively and precisely. Recent advances in MVRL focus on extracting latent representations from multi-view observations and leveraging them in control tasks. However, learning compact and task-relevant representations is not straightforward, particularly in the presence of redundancy, distracting information, or missing views. In this paper, we propose Multi-view Fusion State for Control (MFSC), the first work to incorporate bisimulation metric learning into MVRL for learning task-relevant representations. Furthermore, we propose a multi-view mask-and-latent-reconstruction auxiliary task that exploits shared information across views and, by introducing a mask token, improves MFSC's robustness to missing views. Extensive experimental results demonstrate that our method outperforms existing approaches on MVRL tasks. Even in more realistic scenarios with interference or missing views, MFSC consistently maintains high performance.
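The mask-token mechanism for missing views can be pictured as follows: absent views are substituted with a shared learned token before the views are fused. This is a minimal NumPy sketch under assumed shapes; `fuse_views`, the boolean `present` mask, and mean-pooling as the fusion operator are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def fuse_views(views, present, mask_token):
    """Replace missing views with a shared mask token, then mean-pool.

    views:      (n_views, d) array of per-view latent features
    present:    (n_views,) boolean array; False marks a missing view
    mask_token: (d,) learned placeholder substituted for absent views
    """
    filled = np.where(present[:, None], views, mask_token[None, :])
    return filled.mean(axis=0)
```

Because every slot is filled (by a real view or the token), the downstream fusion network always sees a fixed-size input, which is what lets the same policy run when views drop out at test time.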