Adaptive Redundancy Regulation for Balanced Multimodal Information Refinement

📅 2025-11-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In multimodal joint training, the dominant modality tends to take over backpropagation, which leaves optimization imbalanced. Two problems remain in existing approaches: (i) prolonged dominance weakens the coupling between representations and outputs late in training, so redundant information accumulates; and (ii) gradient regulation methods adjust the dominant modality's gradients uniformly, neglecting cross-modal semantics and directionality. To address this, we propose Adaptive Redundancy Regulation (RedReg), a gradient regulation framework inspired by the information bottleneck principle. RedReg uses a redundancy phase monitor to trigger intervention only when redundancy is high, and a co-information gating mechanism to estimate the dominant modality's contribution from cross-modal semantics. When the task relies primarily on a single modality, the gate disables suppression so that modality-specific information is preserved. Otherwise, the dominant modality's gradient is projected onto the orthogonal complement of the joint multimodal gradient subspace and suppressed according to redundancy, rather than scaled uniformly. Experiments show that RedReg outperforms current major methods in most scenarios, ablation studies support the effectiveness of the method, and the code is publicly available.

📝 Abstract

Multimodal learning aims to improve performance by leveraging data from multiple sources. During joint multimodal training, modality bias often lets the advantaged modality dominate backpropagation, leading to imbalanced optimization. Existing methods still face two problems. First, the prolonged dominance of the advantaged modality weakens representation-output coupling in the late stages of training, resulting in the accumulation of redundant information. Second, previous methods often adjust the gradients of the advantaged modality directly and uniformly, ignoring the semantics and directionality between modalities. To address these limitations, we propose Adaptive Redundancy Regulation for Balanced Multimodal Information Refinement (RedReg), which is inspired by the information bottleneck principle. Specifically, we construct a redundancy phase monitor that uses a joint criterion of effective gain growth rate and redundancy to trigger intervention only when redundancy is high. Furthermore, we design a co-information gating mechanism to estimate the contribution of the current dominant modality based on cross-modal semantics. When the task primarily relies on a single modality, the suppression term is automatically disabled to preserve modality-specific information. Finally, we project the gradient of the dominant modality onto the orthogonal complement of the joint multimodal gradient subspace and suppress the gradient according to redundancy. Experiments show that our method outperforms current major methods in most scenarios. Ablation experiments verify the effectiveness of our method. The code is available at https://github.com/xia-zhe/RedReg.git
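The abstract describes the trigger and gating logic only in words, so the following is a minimal control-flow sketch under stated assumptions rather than the authors' implementation. The function name `suppression_strength`, both thresholds, the direction of the gain-growth test, and the convention that redundancy and the gate are scaled to [0, 1] are all hypothetical; the abstract gives no formulas for any of them.

```python
def suppression_strength(gain_growth, redundancy, gate,
                         gain_threshold=0.01, redundancy_threshold=0.5):
    """Return a suppression coefficient in [0, 1] for the dominant modality.

    Every input and threshold here is a hypothetical placeholder:
      gain_growth - recent growth rate of the effective gain
      redundancy  - redundancy estimate, assumed scaled to [0, 1]
      gate        - co-information gate in [0, 1]; 0 stands for a task that
                    relies on a single modality, so suppression is disabled
    """
    # Assumed joint criterion: intervene only when the gain has stalled
    # AND redundancy is high.
    in_redundancy_phase = (gain_growth < gain_threshold
                           and redundancy > redundancy_threshold)
    if not in_redundancy_phase:
        return 0.0
    # Assumed combination: the gate scales a redundancy-proportional strength.
    return max(0.0, min(1.0, gate * redundancy))


stalled_and_redundant = suppression_strength(0.001, 0.8, gate=1.0)
still_improving = suppression_strength(0.2, 0.8, gate=1.0)
single_modality_task = suppression_strength(0.001, 0.8, gate=0.0)
```

In this sketch only the first call yields a nonzero coefficient: the second fails the assumed trigger because the gain is still growing, and the third is switched off by the gate.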

Problem

Research questions and friction points this paper is trying to address.

Addresses modality bias causing imbalanced optimization in multimodal learning

Reduces redundant information accumulation from dominant modality during training

Adaptively regulates gradients considering cross-modal semantics and redundancy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Redundancy phase monitor triggers adaptive intervention

Co-information gating preserves modality-specific information

Orthogonal gradient projection suppresses redundant information
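The projection named in the last item can be illustrated with plain linear algebra. The sketch below removes from a gradient every component that lies in the span of a set of vectors, which is the standard projection onto the orthogonal complement of that span. It is not taken from the RedReg code: the helper names are hypothetical, gradients are treated as flat Python lists, and how the paper builds the joint gradient subspace or scales the result by redundancy is not covered.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, eps=1e-12):
    """Gram-Schmidt: orthonormal vectors spanning the same subspace."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > eps:  # skip vectors already inside the current span
            basis.append([wi / norm for wi in w])
    return basis


def project_to_orthogonal_complement(grad, subspace_vectors):
    """Subtract from grad its components along span(subspace_vectors)."""
    residual = list(grad)
    for b in orthonormal_basis(subspace_vectors):
        coeff = dot(residual, b)
        residual = [ri - coeff * bi for ri, bi in zip(residual, b)]
    return residual


# Toy example: the "joint" subspace is the x-y plane, so only the
# z component of the dominant-modality gradient survives.
g_dominant = [3.0, 4.0, 5.0]
g_joint = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
g_perp = project_to_orthogonal_complement(g_dominant, g_joint)
```

The projected vector is orthogonal to every vector that spans the subspace, which is the property the method relies on when it separates the dominant modality's gradient from the joint directions.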

🔎 Similar Papers

Babel: A Scalable Pre-trained Model for Multi-Modal Sensing via Expandable Modality Alignment