DeLo: Dual Decomposed Low-Rank Experts Collaboration for Continual Missing Modality Learning

📅 2026-03-02
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenges of missing modalities, task interference, and catastrophic forgetting that large multimodal models face on continual data streams. To this end, we propose DeLo, a novel architecture of dual-decomposed LoRA experts that dynamically constructs LoRA update matrices from decoupled pools of modality-specific factors. By integrating a task-partitioned framework, cross-modal guided routing, and a task-key memory mechanism, our approach enables efficient and stable continual learning. As the first to introduce dual-decomposed low-rank structures into continual learning under missing modalities, DeLo significantly mitigates interference between modalities and tasks, supports task-agnostic inference, and effectively prevents forgetting. Extensive experiments on mainstream CMML benchmarks demonstrate substantial performance gains over current state-of-the-art methods, validating the efficacy of architecture-aware LoRA design in real-world multimodal scenarios.
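The core mechanism described above — composing a LoRA update matrix from a pool of modality-specific rank-one factors — can be illustrated with a minimal sketch. The pool layout, modality names, and mixing weights below are illustrative assumptions, not the paper's actual parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, pool_size = 8, 6, 4

# Hypothetical per-modality pools of rank-one factors (layout assumed).
pools = {
    m: {
        "U": rng.normal(size=(pool_size, d_out)) * 0.01,  # left factors u_i
        "V": rng.normal(size=(pool_size, d_in)) * 0.01,   # right factors v_i
    }
    for m in ("vision", "text")
}

def compose_delta_w(modality, weights):
    """Compose a LoRA update as a weighted sum of rank-one factors,
    delta_W = sum_i w_i * u_i v_i^T, drawn from one modality's pool."""
    U, V = pools[modality]["U"], pools[modality]["V"]
    return sum(w * np.outer(u, v) for w, u, v in zip(weights, U, V))

# A (hypothetical) router would produce the mixing weights; fixed here.
delta_w = compose_delta_w("vision", weights=[0.5, 0.2, 0.0, 0.3])
assert delta_w.shape == (d_out, d_in)
# The update's rank is at most the number of active (nonzero) factors.
assert np.linalg.matrix_rank(delta_w) <= 3
```

Keeping separate factor pools per modality is what would let updates for one modality avoid overwriting factors used by another — the "disentangled" property the summary refers to.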

📝 Abstract
Adapting Large Multimodal Models (LMMs) to real-world scenarios poses the dual challenges of learning from sequential data streams while handling frequent modality incompleteness, a task known as Continual Missing Modality Learning (CMML). However, existing works on CMML have predominantly relied on prompt tuning, a technique that struggles with this task due to cross-task interference between learnable prompts in a shared embedding space. A naive application of Low-Rank Adaptation (LoRA) with a modality-shared module also suffers from modality interference caused by competing gradients. To this end, we propose DeLo, the first framework to leverage a novel dual-decomposed low-rank expert architecture for CMML. Specifically, this architecture resolves modality interference through decomposed LoRA experts, dynamically composing each LoRA update matrix from rank-one factors drawn from disentangled modality-specific factor pools. Embedded within a task-partitioned framework that structurally prevents catastrophic forgetting, this expert system is supported by two key mechanisms: a Cross-Modal Guided Routing strategy to handle incomplete data and a Task-Key Memory for efficient, task-agnostic inference. Extensive experiments on established CMML benchmarks demonstrate that our method significantly outperforms state-of-the-art approaches. This highlights the value of a principled, architecturally aware LoRA design for real-world multimodal challenges.
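The Task-Key Memory for task-agnostic inference can be sketched as a nearest-key lookup: each task stores a key vector (e.g., a summary of its training features), and at test time the input's feature selects the closest key, hence that task's experts. The key construction and similarity measure below are assumptions for illustration, not the paper's exact design:

```python
import numpy as np

rng = np.random.default_rng(1)
feat_dim, n_tasks = 16, 3

# Hypothetical task keys, e.g. a mean feature per task, unit-normalized.
task_keys = rng.normal(size=(n_tasks, feat_dim))
task_keys /= np.linalg.norm(task_keys, axis=1, keepdims=True)

def select_task(feature):
    """Task-agnostic inference: return the index of the task whose stored
    key has the highest cosine similarity to the input feature."""
    f = feature / np.linalg.norm(feature)
    return int(np.argmax(task_keys @ f))

# A query close to task 2's key routes to task 2's experts.
query = task_keys[2] + 0.05 * rng.normal(size=feat_dim)
assert select_task(query) == 2
```

Because only the small key vectors are stored per task, this lookup adds little memory overhead while removing the need for a task identity at inference time.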
Problem

Research questions and friction points this paper is trying to address.

Continual Missing Modality Learning
Multimodal Learning
Modality Incompleteness
Continual Learning
Large Multimodal Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Low-Rank Adaptation
Continual Missing Modality Learning
Modality Interference
Expert Architecture
Cross-Modal Routing