🤖 AI Summary
Existing LLM-based recommender systems neglect the inherent multi-view graph structure of recommendation scenarios, limiting their ability to model high-order semantic associations. To address this, we propose a hypergraph-enhanced LLM recommendation framework that unifies user-item hypergraph modeling with sequential behavior modeling. Specifically, we construct dual hypergraphs, a semantic hypergraph and an interaction hypergraph, and employ hypergraph convolution to capture global high-order relational patterns. Concurrently, we design a collaborative contrastive learning mechanism that jointly encodes global structural context from the hypergraphs and local temporal dynamics from behavioral sequences. The framework achieves deep synergy through hypergraph neural networks, direct injection of graph embeddings into the LLM architecture, and multimodal feature fusion. Extensive experiments on multiple benchmark datasets demonstrate significant improvements over state-of-the-art methods, validating the effectiveness and generalizability of jointly leveraging hypergraph structural priors and behavioral sequences in LLM-based recommendation.
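The summary does not give the exact form of the collaborative contrastive objective, but mechanisms of this kind are commonly instantiated as a symmetric InfoNCE loss that pulls together the two views of the same user (a global hypergraph embedding and a local sequence embedding) while pushing apart mismatched pairs. A minimal sketch, assuming matched row ordering across the two views (the function name and temperature value are illustrative, not from the paper):

```python
import numpy as np

def info_nce(graph_emb, seq_emb, temperature=0.2):
    """Symmetric InfoNCE loss aligning two views of the same users:
    hypergraph (global) embeddings vs. sequence (local) embeddings.
    Row i of each matrix is the same user; matching rows are positives,
    all other rows in the batch act as negatives."""
    # L2-normalize both views so dot products become cosine similarities
    g = graph_emb / np.linalg.norm(graph_emb, axis=1, keepdims=True)
    s = seq_emb / np.linalg.norm(seq_emb, axis=1, keepdims=True)
    logits = g @ s.T / temperature  # (N, N) similarity matrix
    # Cross-entropy with the diagonal as the positive class, in both directions
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    loss_g2s = -np.mean(np.diag(log_prob))
    log_prob_t = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    loss_s2g = -np.mean(np.diag(log_prob_t))
    return 0.5 * (loss_g2s + loss_s2g)
```

When the two views already agree (e.g., passing the same matrix twice), the diagonal dominates each row and the loss is small; random, unaligned views give a loss near log N.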
📝 Abstract
The burgeoning presence of Large Language Models (LLMs) is propelling the development of personalized recommender systems. However, most existing LLM-based methods fail to sufficiently explore the multi-view graph structures inherent in recommendation scenarios. To this end, we propose a novel framework, Hypergraph Enhanced LLM Learning for multimodal Recommendation (HeLLM), designed to equip LLMs with the capability to capture intricate higher-order semantic correlations by fusing graph-level contextual signals with sequence-level behavioral patterns. In the recommender pre-training phase, we design a user hypergraph to uncover shared interest preferences among users and an item hypergraph to capture multimodal similarity correlations among items. Hypergraph convolution and a synergistic contrastive learning mechanism are introduced to enhance the distinguishability of the learned representations. In the LLM fine-tuning phase, we inject the learned graph-structured embeddings directly into the LLM's architecture and integrate sequential features that capture each user's chronological behavior. This allows the LLM to treat graph-structured information as global context, strengthening its ability to perceive complex relational patterns and integrate multimodal information, while the sequential features model local temporal dynamics. Extensive experiments demonstrate the superiority of the proposed method over state-of-the-art baselines, confirming the advantages of fusing hypergraph-based context with sequential user behavior in LLMs for recommendation.
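The abstract does not spell out the hypergraph convolution used in the pre-training phase; a widely used formulation is the HGNN-style layer X' = σ(Dv^{-1/2} H W De^{-1} Hᵀ Dv^{-1/2} X Θ), where H is the node-hyperedge incidence matrix. The sketch below assumes that formulation with identity hyperedge weights W; the function and variable names are illustrative, not HeLLM's actual implementation:

```python
import numpy as np

def hypergraph_conv(X, H, Theta):
    """One HGNN-style hypergraph convolution layer:
        X' = ReLU( Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X Theta )
    X:     (n_nodes, d_in)  node features
    H:     (n_nodes, n_edges) incidence matrix (H[v, e] = 1 if node v is in hyperedge e)
    Theta: (d_in, d_out)    learnable weight matrix
    Hyperedge weights are taken as the identity for simplicity."""
    d_v = H.sum(axis=1)  # node degrees (number of hyperedges per node)
    d_e = H.sum(axis=0)  # hyperedge degrees (number of nodes per hyperedge)
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d_v, 1e-12)))
    De_inv = np.diag(1.0 / np.maximum(d_e, 1e-12))
    # Normalized propagation: each hyperedge averages its members' features,
    # then scatters the result back to its member nodes
    A = Dv_inv_sqrt @ H @ De_inv @ H.T @ Dv_inv_sqrt
    return np.maximum(A @ X @ Theta, 0.0)  # ReLU activation
```

In the user hypergraph described above, each hyperedge would group users sharing an interest preference; in the item hypergraph, each hyperedge would connect items with high multimodal similarity. Stacking such layers lets information propagate across these groups, which is how high-order (beyond pairwise) relations are captured.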