🤖 AI Summary
This work addresses the robustness and risk assessment of multimodal large language models (MLLMs) under distribution shift, i.e., out-of-distribution (OOD) evaluation. We propose the first information-theoretic framework for MLLMs, introducing *Effective Mutual Information* (EMI) as a principled metric of the relevance between input queries and model responses. We derive an upper bound on the EMI gap between in-distribution (ID) and OOD settings, explicitly linking cross-modal risk to visual and textual distributional discrepancies. Our method combines information-theoretic analysis, Wasserstein distance-based modeling of distribution shift, and a rigorous derivation of cross-modal risk bounds. Experiments spanning 61 realistic OOD scenarios empirically validate the theoretical analysis, supporting verifiable risk assessment and the reliable deployment of MLLMs in open-world settings.
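The summary names Wasserstein distance as the tool for modeling the visual and textual distributional discrepancies that enter the risk bound. As a rough illustration of that single ingredient only (not the paper's method or bound), the sketch below computes a per-dimension Wasserstein discrepancy between hypothetical ID and OOD embedding sets for each modality and combines them; the function names, the per-dimension averaging, and the equal weighting of modalities are assumptions made here for illustration.

```python
# Illustrative sketch only: the paper's EMI and its bound are defined formally
# in the full text; the helper names and weighting below are hypothetical.
import numpy as np
from scipy.stats import wasserstein_distance


def per_dim_wasserstein(a: np.ndarray, b: np.ndarray) -> float:
    """Average 1-D Wasserstein distance across embedding dimensions,
    a cheap proxy for the distributional discrepancy the bound depends on."""
    return float(np.mean([wasserstein_distance(a[:, d], b[:, d])
                          for d in range(a.shape[1])]))


def modality_discrepancy(vis_id, vis_ood, txt_id, txt_ood) -> float:
    """Combine visual and textual discrepancies; equal weighting is an
    illustrative choice, not the paper's bound function."""
    return (per_dim_wasserstein(vis_id, vis_ood)
            + per_dim_wasserstein(txt_id, txt_ood))


# Toy usage with random arrays standing in for ID/OOD visual and textual features.
rng = np.random.default_rng(0)
vis_id, txt_id = rng.normal(0.0, 1.0, (256, 32)), rng.normal(0.0, 1.0, (256, 32))
vis_ood, txt_ood = rng.normal(0.5, 1.2, (256, 32)), rng.normal(0.2, 1.0, (256, 32))
print(f"estimated discrepancy: {modality_discrepancy(vis_id, vis_ood, txt_id, txt_ood):.3f}")
```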
📝 Abstract
Multimodal large language models (MLLMs) have shown promising capabilities but struggle under distribution shifts, where evaluation data differ from the instruction-tuning distribution. Although previous works have provided empirical evaluations, we argue that a formal framework that can characterize and quantify the risk of MLLMs is necessary to ensure their safe and reliable application in the real world. Taking an information-theoretic perspective, we propose the first theoretical framework that enables quantification of the maximum risk of MLLMs under distribution shifts. Central to our framework is the introduction of Effective Mutual Information (EMI), a principled metric that quantifies the relevance between input queries and model responses. We derive an upper bound on the EMI difference between in-distribution (ID) and out-of-distribution (OOD) data, connecting it to visual and textual distributional discrepancies. Extensive experiments on real benchmark datasets, spanning 61 shift scenarios, empirically validate our theoretical insights.
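To make the abstract's central claim concrete, the schematic below restates it in symbols: the EMI gap between ID and OOD data is controlled by the visual and textual distributional discrepancies. The symbols, the discrepancy measure \(D\), and the bound function \(g\) are placeholders chosen here for illustration; the paper's formal statement (exact distance, constants, and conditions) may differ.

```latex
% Schematic restatement only; see the paper for the formal definition of EMI
% and the exact form of the upper bound.
\[
\mathrm{EMI}_{\mathrm{ID}} - \mathrm{EMI}_{\mathrm{OOD}}
\;\le\;
g\!\Bigl(
  D\bigl(P^{\mathrm{ID}}_{\mathrm{vis}},\, P^{\mathrm{OOD}}_{\mathrm{vis}}\bigr),\;
  D\bigl(P^{\mathrm{ID}}_{\mathrm{txt}},\, P^{\mathrm{OOD}}_{\mathrm{txt}}\bigr)
\Bigr)
\]
```

Here \(D\) denotes a distributional discrepancy between the ID and OOD marginals of each modality (the summary above mentions the Wasserstein distance), and \(g\) captures how the bound depends on the two modality-wise shifts.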