Multimodal Datasets with Controllable Mutual Information

📅 2025-10-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing multimodal benchmarks lack ground-truth mutual information (MI), hindering systematic evaluation of MI estimators and self-supervised multimodal learning methods. Method: We propose a novel paradigm integrating invertible flow-based generative models with structured causal frameworks, explicitly modeling causal dependencies among latent variables to enable precise, analytical MI specification and computation between modalities. Contribution/Results: This is the first approach to achieve explicit MI controllability, theoretical interpretability, and photorealistic data generation simultaneously in multimodal settings. We construct multiple realistic image–text and audio–text datasets with rigorously controlled, known MI levels spanning low-to-high correlation. Empirical evaluation demonstrates high sensitivity and strong reproducibility across state-of-the-art MI estimators and self-supervised models. Our benchmark provides the first theoretically grounded, MI-grounded platform for rigorous evaluation and advancement of multimodal representation learning.

Technology Category

Application Category

📝 Abstract
We introduce a framework for generating highly multimodal datasets with explicitly calculable mutual information between modalities. This enables the construction of benchmark datasets that provide a novel testbed for systematic studies of mutual information estimators and multimodal self-supervised learning techniques. Our framework constructs realistic datasets with known mutual information using a flow-based generative model and a structured causal framework for generating correlated latent variables.
Problem

Research questions and friction points this paper is trying to address.

Generating multimodal datasets with calculable mutual information
Creating benchmark datasets for mutual information estimators
Constructing realistic datasets using flow-based generative models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates multimodal datasets with calculable mutual information
Uses flow-based generative model for realistic data
Applies structured causal framework for correlated variables
🔎 Similar Papers
No similar papers found.
R
R. K. Hashmani
University of Wisconsin–Madison
G
Garrett W. Merz
University of Wisconsin–Madison
H
Helen Qu
Flatiron Institute
Mariel Pettee
Mariel Pettee
University of Wisconsin-Madison
Machine LearningHigh-Energy Particle PhysicsAstrophysics
K
K. Cranmer
University of Wisconsin–Madison