AOEPT: Breaking the Implicit Modality-Reduction Bottleneck in Modality-Missing Prompt Tuning

📅 2026-05-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses partial modality missingness in real-world deployments of multimodal systems. Existing prompt tuning methods condition prompts only on the observed modalities, so they cannot draw on latent information from the absent ones; the paper calls this the Implicit Modality-Reduction bottleneck. To overcome it, the paper proposes AOEPT, which introduces Modal-Contextualized Prompts (MCPs): lightweight prompts that distill global modality-wise priors from training data and are instantiated into instance-aware prompts conditioned on the available modalities. This explicitly models and injects missing-modality information, expanding the inference space beyond what methods relying solely on observed modalities can reach. AOEPT adapts to diverse Multimodal Transformer backbones and reports strong performance on multiple benchmarks with minimal computational overhead.
📝 Abstract
Deploying multimodal systems in real-world environments often entails handling modality-missing scenarios, where one or more modalities are unavailable. While recent studies address this challenge for the general Multimodal Transformer (MT) architecture via prompt tuning, we identify a fundamental limitation in these methods: the Implicit Modality-Reduction bottleneck. By conditioning prompts solely on the observed modalities, they inadvertently restrict the reasoning scope of MTs to the modality-reduced subspace, cutting off access to the latent information sources of the missing modalities. To overcome this limitation, we propose AOEPT, which introduces a modal-contextualized prompting scheme. Specifically, we introduce lightweight Modal-Contextualized Prompts (MCPs) that distill global modality-wise priors from training data, serving as latent repositories of the information sources for missing modalities. Conditioned on the remaining modalities, these MCPs are instantiated into instance-aware prompts that selectively augment missing-modality information for each sample, thereby restoring the reasoning scope of MTs beyond the observed-modality-only subspace. Experiments across various multimodal benchmarks and backbones confirm the strong performance of AOEPT, with minimal computational overhead.
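As a rough illustration of the idea in the abstract, the sketch below shows one way a pool of learned per-modality prompts could be turned into instance-aware prompts conditioned on the observed modalities. It is not the paper's implementation: the cross-attention with a residual add, the function names `instantiate_prompts` and `build_input`, and the data layout (a dict mapping modality name to a token list, or `None` when missing) are all assumptions made for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def instantiate_prompts(pool, observed_tokens):
    """Turn a global prompt pool into instance-aware prompts (assumed design).

    Each pooled prompt acts as a query that attends over the observed
    tokens; the attended context is added back to the prompt, so the
    result depends on both the global prior and the current sample.
    """
    if not observed_tokens:
        # Nothing to condition on: fall back to the global prompts.
        return [list(p) for p in pool]
    dim = len(pool[0])
    prompts = []
    for p in pool:
        weights = softmax([dot(p, t) / math.sqrt(dim) for t in observed_tokens])
        context = [
            sum(w * t[i] for w, t in zip(weights, observed_tokens))
            for i in range(dim)
        ]
        prompts.append([pi + ci for pi, ci in zip(p, context)])
    return prompts

def build_input(sample, pools):
    """Build the token sequence for a (hypothetical) multimodal Transformer.

    `sample` maps modality name -> token list, or None if the modality
    is missing. Missing modalities are replaced by instantiated prompts.
    """
    observed = [tok for toks in sample.values() if toks is not None for tok in toks]
    sequence = []
    for modality, toks in sample.items():
        if toks is None:
            sequence.extend(instantiate_prompts(pools[modality], observed))
        else:
            sequence.extend(toks)
    return sequence

# Hypothetical usage: the image modality is missing, text is observed.
pools = {"image": [[0.1] * 4, [-0.1] * 4], "text": [[0.0] * 4]}
sample = {"image": None, "text": [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]}
sequence = build_input(sample, pools)
print(len(sequence))
```

In this sketch the pool entries stand in for the trainable MCP parameters, which would be learned by prompt tuning while the backbone stays frozen. Contrast this with conditioning on the observed tokens alone: here the missing modality contributes its own learned prior, which is the gap the abstract attributes to existing methods.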
Problem

Research questions and friction points this paper is trying to address.

modality-missing
prompt tuning
multimodal learning
implicit modality-reduction bottleneck
missing modality
Innovation

Methods, ideas, or system contributions that make the work stand out.

modality-missing
prompt tuning
modal-contextualized prompts
multimodal transformer
implicit modality-reduction bottleneck