🤖 AI Summary
To address modality imbalance in multimodal learning caused by uneven single-modality data sampling, this paper proposes a data-aware dynamic unimodal sampling framework. Methodologically, it introduces (1) Cumulative Modality Discrepancy (CMD), a differentiable, monitorable metric for quantifying modality imbalance during training; (2) two adaptive sampling strategies, one heuristic and one based on Proximal Policy Optimization (PPO) reinforcement learning, that determine how much data each modality contributes per iteration; and (3) a plug-and-play module that integrates seamlessly with mainstream multimodal architectures. Evaluated on multiple benchmark datasets, the framework achieves an average accuracy improvement of 2.3% over state-of-the-art methods, demonstrating that regulating modality balance at the data sampling stage is critical to enhancing model performance.
📝 Abstract
To address the modality learning degeneration caused by modality imbalance, existing multimodal learning (MML) approaches primarily attempt to balance the optimization process of each modality from the perspective of model learning. However, almost all existing methods ignore the modality imbalance caused by unimodal data sampling: sampling equal amounts of data from each modality often yields discrepancies in informational content, which in turn leads to modality imbalance. Therefore, in this paper we propose a novel MML approach called **D**ata-aware **U**nimodal **S**ampling (DUS), which dynamically alleviates the modality imbalance caused by sampling. Specifically, we first propose a novel cumulative modality discrepancy to monitor the multimodal learning process. Based on this learning status, we propose heuristic and reinforcement learning (RL)-based data-aware unimodal sampling approaches that adaptively determine the quantity of data sampled from each modality at each iteration, thereby alleviating modality imbalance from the sampling perspective. Moreover, our method can be seamlessly incorporated into almost all existing multimodal learning approaches as a plugin. Experiments demonstrate that DUS achieves the best performance compared with diverse state-of-the-art (SOTA) baselines.
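To make the sampling idea concrete, here is a minimal sketch of the heuristic variant under stated assumptions: the exact form of the cumulative modality discrepancy and the batch-splitting rule below (`update_cmd`, `sample_quantities`, the `decay` and `sensitivity` parameters) are hypothetical simplifications, not the paper's actual formulation. The sketch accumulates the gap between two modalities' losses and allocates a larger share of the next batch to the lagging modality.

```python
import math

def update_cmd(cmd, loss_a, loss_b, decay=0.9):
    """Exponentially accumulate the gap between modality losses.

    A positive CMD means modality A is lagging (higher loss).
    This EMA form is an assumed simplification of the paper's
    cumulative modality discrepancy.
    """
    return decay * cmd + (1 - decay) * (loss_a - loss_b)

def sample_quantities(cmd, batch_size, sensitivity=2.0):
    """Heuristically split a batch between two modalities.

    The lagging modality (per CMD's sign) receives more samples;
    a sigmoid keeps the split strictly inside (0, 1).
    """
    frac_a = 1.0 / (1.0 + math.exp(-sensitivity * cmd))
    n_a = max(1, round(frac_a * batch_size))
    return n_a, max(1, batch_size - n_a)
```

With a balanced history (`cmd == 0`) the batch splits evenly; if modality A's loss has been persistently higher, `cmd` grows positive and A receives the larger share, which is the sense in which the sampler is "data-aware".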