Expressive and Scalable Quantum Fusion for Multimodal Learning

📅 2025-10-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the parameter explosion that arises when modeling high-order feature interactions in multimodal learning, this paper proposes the Quantum Fusion Layer (QFL), a hybrid quantum-classical differentiable fusion mechanism. QFL employs parameterized quantum circuits to encode entangled cross-modal feature interactions, marking the first integration of quantum signal processing into multimodal fusion. The paper gives a separation example between QFL and low-rank tensor methods that points to a potential quantum query advantage, and it shows that high-order polynomial interactions can be represented with linear parameter growth. Because the quantum state encoding and the variational circuit are trained jointly, QFL supports end-to-end optimization. In simulation, QFL consistently outperforms strong classical baselines on small but diverse multimodal tasks, with the most marked improvements when the number of modalities is high. These results suggest that quantum-enhanced multimodal fusion is both scalable and effective, and that it merits further study on larger systems.

📝 Abstract

The aim of this paper is to introduce a quantum fusion mechanism for multimodal learning and to establish its theoretical and empirical potential. The proposed method, called the Quantum Fusion Layer (QFL), replaces classical fusion schemes with a hybrid quantum-classical procedure that uses parameterized quantum circuits to learn entangled feature interactions without requiring exponential parameter growth. Supported by quantum signal processing principles, the quantum component efficiently represents high-order polynomial interactions across modalities with linear parameter scaling, and we provide a separation example between QFL and low-rank tensor-based methods that highlights potential quantum query advantages. In simulation, QFL consistently outperforms strong classical baselines on small but diverse multimodal tasks, with particularly marked improvements in high-modality regimes. These results suggest that QFL offers a fundamentally new and scalable approach to multimodal fusion that merits deeper exploration on larger systems.
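The idea of representing high-order polynomial interactions with a linear number of parameters can be illustrated with a toy, quantum-signal-processing-style circuit. The sketch below is not the paper's QFL: it assumes a single simulated qubit (so it has no entanglement), one scalar feature per modality scaled to [-1, 1], and hypothetical names (`signal`, `rz`, `toy_fusion`). Each modality contributes one data-encoding rotation, and the trainable phases sit between them, so M modalities need M + 1 parameters while the output still contains cross-modal product terms.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rz(phi):
    """Trainable phase rotation exp(i * phi * Z)."""
    return [[cmath.exp(1j * phi), 0], [0, cmath.exp(-1j * phi)]]


def signal(x):
    """Data-encoding rotation exp(i * arccos(x) * X) for a feature x in [-1, 1]."""
    s = math.sqrt(1.0 - x * x)
    return [[x, 1j * s], [1j * s, x]]


def toy_fusion(features, phases):
    """Interleave one signal rotation per modality with trainable phases.

    Assumption: this is an illustrative single-qubit circuit, not the
    architecture from the paper. Returns the probability of measuring |0>.
    """
    if len(phases) != len(features) + 1:
        raise ValueError("need exactly one more phase than features")
    u = rz(phases[0])
    for x, phi in zip(features, phases[1:]):
        u = matmul(u, signal(x))
        u = matmul(u, rz(phi))
    return abs(u[0][0]) ** 2


# Three modalities fused with four trainable phases.
fused = toy_fusion([0.2, -0.7, 0.9], [0.1, 0.4, -0.3, 0.25])
print(fused)
```

With all phases set to zero and two modalities, the amplitude reduces to x1*x2 - sqrt(1 - x1^2)*sqrt(1 - x2^2), which makes the cross-modal product term explicit. In a trainable layer the phases would be optimized by gradient descent together with the classical encoders.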

Problem

Research questions and friction points this paper is trying to address.

Develops a quantum fusion mechanism for multimodal learning

Replaces classical fusion with a hybrid quantum-classical procedure

Efficiently models entangled feature interactions without exponential parameters

Innovation

Methods, ideas, or system contributions that make the work stand out.

Quantum Fusion Layer replaces classical fusion schemes

Parameterized quantum circuits enable entangled feature interactions

Efficient high-order polynomial interactions with linear scaling
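To make the scaling contrast concrete, the following back-of-the-envelope sketch compares parameter counts as the number of modalities grows. All three formulas are simplifying assumptions for illustration and are not taken from the paper: a full order-M interaction tensor over M modalities of feature dimension d has d^M entries, a rank-r factorization of it stores r*M*d numbers, and the hypothetical interleaved circuit assumed here uses one phase per modality per layer plus one. The counts describe different model classes, so they show growth rates and are not a like-for-like comparison.

```python
def full_tensor_params(num_modalities, dim):
    """Entries in a full order-M interaction tensor (exponential in M)."""
    return dim ** num_modalities


def low_rank_params(num_modalities, dim, rank):
    """Entries in a rank-r factorization: one r-by-dim factor per modality."""
    return rank * num_modalities * dim


def linear_circuit_params(num_modalities, layers=1):
    """Phases in a hypothetical interleaved circuit: linear in M."""
    return layers * num_modalities + 1


# Print the three counts side by side for a growing number of modalities.
for m in (2, 4, 8):
    print(m,
          full_tensor_params(m, dim=16),
          low_rank_params(m, dim=16, rank=4),
          linear_circuit_params(m))
```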

🔎 Similar Papers

Multimodal Lego: Model Merging and Fine-Tuning Across Topologies and Modalities in Biomedicine