VI-MMRec: Similarity-Aware Training Cost-free Virtual User-Item Interactions for Multimodal Recommendation

📅 2025-12-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Multimodal recommendation suffers from severe data sparsity: because observed user–item interactions are sparse, numerous unobserved items are incorrectly treated as negative samples, degrading model performance. To address this, we propose a zero-training-overhead virtual user–item interaction augmentation framework, the first approach that is model-agnostic, requires no additional parameter learning, and generates modality-aware virtual interactions. Our core contributions are threefold: (1) constructing high-quality virtual interactions based on multimodal feature similarity; (2) introducing modality-specific similarity weighting to prioritize informative cross-modal signals; and (3) supporting two adaptive fusion strategies, Overlay and Synergistic, to flexibly integrate virtual interactions. Extensive experiments on six real-world datasets, with the framework integrated into seven state-of-the-art recommendation models, consistently show significant performance gains, demonstrating the method's effectiveness, strong generalizability across architectures and domains, and deployment friendliness owing to its parameter-free design, which adds no overhead during training.
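
The construction step in contribution (1) lends itself to a short sketch. Below is a minimal, hypothetical illustration assuming a binary user–item matrix `R` and one modality's item feature matrix; the function name `virtual_interactions`, the cosine-similarity choice, and the top-k cutoff are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def virtual_interactions(R, feats, top_k=10):
    """Sketch: propagate observed interactions in R (n_users x n_items)
    to the top_k items most similar (by cosine) to each interacted item,
    using one modality's item features `feats` (n_items x d)."""
    # Cosine similarity between all item pairs in this modality.
    F = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-12)
    S = F @ F.T                            # (n_items, n_items)
    np.fill_diagonal(S, -np.inf)           # exclude self-similarity
    # Keep only each item's top_k most similar neighbors.
    idx = np.argpartition(-S, top_k, axis=1)[:, :top_k]
    S_k = np.zeros_like(S)
    rows = np.arange(S.shape[0])[:, None]
    S_k[rows, idx] = S[rows, idx]
    # A user who interacted with item j gains weighted virtual edges
    # to j's nearest neighbors in this modality's feature space.
    V = R @ S_k                            # (n_users, n_items)
    V[R > 0] = 0.0                         # keep real interactions untouched
    return V
```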

📝 Abstract
Although existing multimodal recommendation models have shown promising performance, their effectiveness continues to be limited by the pervasive data sparsity problem. This problem arises because users typically interact with only a small subset of available items, leading existing models to arbitrarily treat unobserved items as negative samples. To address this, we propose VI-MMRec, a model-agnostic and training-cost-free framework that enriches sparse user-item interactions with similarity-aware virtual user-item interactions, constructed from the modality-specific feature similarities of user-interacted items. Specifically, VI-MMRec introduces two strategies: (1) Overlay, which independently aggregates modality-specific similarities to preserve modality-specific user preferences, and (2) Synergistic, which holistically fuses cross-modal similarities to capture complementary user preferences. To ensure high-quality augmentation, we design a statistically informed weight allocation mechanism that adaptively assigns weights to virtual user-item interactions based on dataset-specific modality relevance. As a plug-and-play framework, VI-MMRec integrates seamlessly with existing models, enhancing their performance without modifying their core architecture and with minimal implementation effort. Moreover, VI-MMRec introduces no additional overhead during training, which makes it significantly advantageous for practical deployment. Comprehensive experiments on six real-world datasets with seven state-of-the-art multimodal recommendation models validate the effectiveness of VI-MMRec.
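
To make the two strategies concrete, here is a hedged sketch building on the `virtual_interactions` helper sketched above. Overlay aggregates per-modality virtual interactions independently and sums them, while Synergistic fuses modality features first and derives a single set of virtual interactions. The `modality_weight` heuristic (mean feature similarity over co-interacted item pairs) is an illustrative stand-in for the paper's statistically informed weight allocation mechanism, not its actual statistic.

```python
import numpy as np

def modality_weight(R, feats):
    """Illustrative dataset-level modality relevance (assumption): how
    strongly item co-interaction agrees with feature similarity."""
    F = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-12)
    sim = F @ F.T
    co = (R.T @ R) > 0                     # item pairs co-interacted by users
    np.fill_diagonal(co, False)
    return float(sim[co].mean()) if co.any() else 0.0

def overlay(R, modal_feats, top_k=10):
    """Overlay: per-modality virtual interactions, aggregated independently
    to preserve modality-specific preference signals."""
    w = np.array([modality_weight(R, f) for f in modal_feats])
    w = w / (w.sum() + 1e-12)
    return sum(wi * virtual_interactions(R, f, top_k)
               for wi, f in zip(w, modal_feats))

def synergistic(R, modal_feats, top_k=10):
    """Synergistic: fuse weighted modality features first, then derive one
    set of virtual interactions capturing complementary preferences."""
    w = np.array([modality_weight(R, f) for f in modal_feats])
    w = w / (w.sum() + 1e-12)
    fused = np.concatenate([wi * f for wi, f in zip(w, modal_feats)], axis=1)
    return virtual_interactions(R, fused, top_k)
```

Either augmented matrix can then be added to a downstream recommender's training signal without touching its architecture, which is what makes the framework plug-and-play.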
Problem

Research questions and friction points this paper is trying to address.

Addresses data sparsity in multimodal recommendation systems
Enriches sparse user-item interactions via similarity-aware virtual interactions
Enhances existing models without training overhead or architectural changes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates virtual interactions via modality-specific feature similarities
Uses Overlay and Synergistic fusion strategies to capture modality-specific and complementary user preferences
Integrates as plug-and-play framework without training overhead