🤖 AI Summary
To address state representation degradation in multi-view reinforcement learning (MVRL) caused by view redundancy, interference, and missing views, this paper proposes a robust multi-view representation learning framework. Methodologically, it introduces bisimulation metric learning into MVRL for the first time to align task-relevant semantic states across views; designs a multi-view masked modeling and latent-space joint reconstruction auxiliary task to explicitly enhance generalization under view occlusion or absence; and develops a contrastive representation fusion mechanism to suppress irrelevant feature interference. Evaluated on standard benchmarks as well as challenging variants with adversarial interference or stochastic view dropout, the proposed method consistently outperforms existing MVRL approaches. It yields more stable control policies and learns representations that are significantly more compact and task-discriminative.
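The bisimulation alignment described above can be sketched as a simple loss: latents whose states yield similar rewards and similar next-state latents are pulled to similar distances. The following is an illustrative, DBC-style NumPy sketch; the function name `bisimulation_loss`, the random-pairing scheme, and all tensor shapes are assumptions for illustration, not the paper's exact objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def bisimulation_loss(z, reward, z_next, gamma=0.99):
    """Illustrative bisimulation-style objective (not the paper's exact loss).

    Each latent z_i is paired with a shuffled batch partner z_j, and the
    latent L1 distance is regressed onto the bisimulation target
    |r_i - r_j| + gamma * ||z'_i - z'_j||_1.
    """
    perm = rng.permutation(len(z))
    z_dist = np.abs(z - z[perm]).sum(axis=-1)        # distance between current latents
    r_dist = np.abs(reward - reward[perm])           # reward difference
    next_dist = np.abs(z_next - z_next[perm]).sum(axis=-1)  # treated as a fixed target
    target = r_dist + gamma * next_dist
    return np.mean((z_dist - target) ** 2)
```

In a full training loop the target term would be computed with a stop-gradient (or a target encoder), so that only the online latent distance is pulled toward the bisimulation target.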
📄 Abstract
Multi-View Reinforcement Learning (MVRL) seeks to provide agents with multi-view observations, enabling them to perceive the environment more effectively and precisely. Recent advances in MVRL focus on extracting latent representations from multi-view observations and leveraging them in control tasks. However, learning compact and task-relevant representations is not straightforward, particularly in the presence of redundancy, distracting information, or missing views. In this paper, we propose Multi-view Fusion State for Control (MFSC), the first work to incorporate bisimulation metric learning into MVRL for learning task-relevant representations. Furthermore, we propose a multi-view mask-and-latent-reconstruction auxiliary task that exploits shared information across views and, by introducing a mask token, improves MFSC's robustness to missing views. Extensive experimental results demonstrate that our method outperforms existing approaches on MVRL tasks. Even in more realistic scenarios with interference or missing views, MFSC consistently maintains high performance.
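The mask-token mechanism for missing views can be pictured as follows: absent views are substituted with a shared learned token before the views are fused. This is a minimal NumPy sketch under assumed shapes; `fuse_views`, the boolean `present` mask, and mean-pooling as the fusion operator are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def fuse_views(views, present, mask_token):
    """Replace missing views with a shared mask token, then mean-pool.

    views:      (n_views, d) array of per-view latent features
    present:    (n_views,) boolean array; False marks a missing view
    mask_token: (d,) learned placeholder substituted for absent views
    """
    filled = np.where(present[:, None], views, mask_token[None, :])
    return filled.mean(axis=0)
```

Because every slot is filled (by a real view or the token), the downstream fusion network always sees a fixed-size input, which is what lets the same policy run when views drop out at test time.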