AI Summary
Existing recommender systems address cold-start (for new users and items), cross-domain transfer, and multimodal fusion in isolation, limiting their effectiveness in complex, real-world consumption scenarios. To overcome this fragmentation, we propose MICRec, the first unified framework to integrate inductive learning, cross-domain collaboration, and multimodal fusion. MICRec builds upon the INMO inductive backbone and leverages overlapping users as cross-domain anchors. It introduces a modality-aware representation aggregation mechanism that couples cross-domain knowledge transfer with multimodal content encoding, enabling fine-grained modeling of user context and latent preferences under sparse, heterogeneous data. Extensive experiments on multiple real-world datasets show that MICRec significantly outperforms 12 state-of-the-art baselines, with particularly strong gains in low-resource domains, validating its robustness, generalizability, and practical applicability.
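To make the two core ideas concrete, here is a minimal, hypothetical sketch of (a) blending the embeddings of an overlapping user who acts as a cross-domain anchor, and (b) fusing an inductive ID representation with modality embeddings via softmax attention. All function names, the residual combination, and the blending coefficient `alpha` are illustrative assumptions, not the paper's actual architecture.

```python
# Illustrative sketch only: the real MICRec design may differ substantially.
import numpy as np

def share_anchor(user_emb_src, user_emb_tgt, alpha=0.5):
    """Overlapping users serve as anchors: blend their source- and
    target-domain embeddings so knowledge transfers across domains
    (linear blend is an assumption for illustration)."""
    return alpha * np.asarray(user_emb_src) + (1 - alpha) * np.asarray(user_emb_tgt)

def modality_aware_aggregate(id_emb, modality_embs, attn_logits):
    """Fuse an inductive ID embedding with modality embeddings
    (e.g. text, image) using softmax attention over modalities."""
    logits = np.asarray(attn_logits, dtype=float)
    weights = np.exp(logits - logits.max())   # numerically stable softmax
    weights /= weights.sum()
    fused = sum(w * np.asarray(e) for w, e in zip(weights, modality_embs))
    return np.asarray(id_emb) + fused         # residual combination (assumption)

# Toy example: one overlapping user, two modalities with uniform attention.
u_src = np.array([1.0, 0.0])
u_tgt = np.array([0.0, 1.0])
anchor = share_anchor(u_src, u_tgt)           # -> [0.5, 0.5]
fused = modality_aware_aggregate(
    anchor,
    [np.array([0.2, 0.2]), np.array([0.4, 0.0])],
    [0.0, 0.0],                               # equal logits -> equal weights
)                                             # -> [0.8, 0.6]
```

The sketch only shows the shape of the computation; in practice the attention logits and blend weights would be learned jointly with the inductive backbone.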
Abstract
Recommender systems have long been built on modeling interactions between users and items, while recent studies have sought to broaden this paradigm by generalizing to new users and items, incorporating diverse information sources, and transferring knowledge across domains. Nevertheless, these efforts have largely addressed individual aspects in isolation, limiting their ability to handle the complex recommendation scenarios that arise in everyday consumption across diverse domains. In this paper, we present MICRec, a unified framework that fuses inductive modeling, multimodal guidance, and cross-domain transfer to capture user contexts and latent preferences in heterogeneous and incomplete real-world data. Building on the inductive backbone of INMO, our model refines expressive representations through modality-based aggregation and alleviates data sparsity by leveraging overlapping users as anchors across domains, thereby enabling robust and generalizable recommendation. Experiments show that MICRec outperforms 12 baselines, with notable gains in domains with limited training data.