LMMRec: LLM-driven Motivation-aware Multimodal Recommendation

📅 2026-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing recommender systems, which often neglect multimodal heterogeneous information—such as user reviews—and struggle to achieve cross-modal alignment and motivational consistency under noisy conditions. To overcome these challenges, the authors propose a model-agnostic framework that leverages large language models with chain-of-thought prompting to extract fine-grained user and item motivations from textual data. A dual-encoder architecture is employed to align interaction behaviors with textual motivations, while semantic priors, motivation coordination strategies, and an interaction-text correspondence mechanism are introduced to mitigate noise-induced interference and semantic drift. Extensive experiments demonstrate that the proposed approach achieves up to a 4.98% performance gain over state-of-the-art baselines across three benchmark datasets, highlighting its effectiveness and robustness.

📝 Abstract
Motivation-based recommendation systems uncover the drivers of user behavior. Motivation modeling, crucial for decision-making and content preference, explains how recommendations are generated. Existing methods often treat motivation as latent variables inferred from interaction data, neglecting heterogeneous information such as review text. In multimodal motivation fusion, two challenges arise: 1) achieving stable cross-modal alignment amid noise, and 2) identifying features that reflect the same underlying motivation across modalities. To address these, we propose LLM-driven Motivation-aware Multimodal Recommendation (LMMRec), a model-agnostic framework leveraging large language models for deep semantic priors and motivation understanding. LMMRec uses chain-of-thought prompting to extract fine-grained user and item motivations from text. A dual-encoder architecture models textual and interaction-based motivations for cross-modal alignment, while a Motivation Coordination Strategy and an Interaction-Text Correspondence Method mitigate noise and semantic drift through contrastive learning and momentum updates. Experiments on three datasets show that LMMRec achieves up to a 4.98% performance improvement.
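The cross-modal alignment via contrastive learning and momentum updates described in the abstract can be sketched roughly as follows. This is a generic stand-in, not the paper's actual design: the symmetric in-batch InfoNCE objective, the EMA-style weight update, and all shapes, names, and the 0.07 temperature are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Hypothetical embeddings: interaction-based and text-based motivation
# vectors for a batch of 4 users, dimension 8 (sizes are illustrative).
z_interact = l2_normalize(rng.normal(size=(4, 8)))
z_text = l2_normalize(rng.normal(size=(4, 8)))

def info_nce(anchor, positive, temperature=0.07):
    """InfoNCE over a batch: the matched (same-user) pair on the
    diagonal is the positive; all other in-batch pairs are negatives."""
    logits = anchor @ positive.T / temperature                      # (4, 4)
    labels = np.arange(len(anchor))                                 # diagonal
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[labels, labels].mean()

loss = info_nce(z_interact, z_text)  # pulls matched pairs together

# Momentum (EMA) update of a target encoder, a common way to stabilize
# the alignment target; a single weight matrix stands in for the encoder.
W_online = rng.normal(size=(8, 8))    # online text-encoder weights (trained)
W_momentum = rng.normal(size=(8, 8))  # momentum copy, updated slowly
m = 0.999
W_momentum = m * W_momentum + (1 - m) * W_online  # EMA step per batch
```

In practice the loss gradient would update the online encoders each batch, while the momentum copy drifts slowly toward them, which is what keeps the contrastive targets from shifting abruptly under noise.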
Problem

Research questions and friction points this paper is trying to address.

motivation-aware recommendation
multimodal fusion
cross-modal alignment
user motivation modeling
heterogeneous information
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models
Motivation-aware Recommendation
Multimodal Fusion
Chain-of-Thought Prompting
Contrastive Learning
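To make the chain-of-thought prompting item above concrete, here is a minimal sketch of how a motivation-extraction prompt might be assembled; the review text and template wording are hypothetical, not taken from the paper.

```python
# Hypothetical chain-of-thought prompt for extracting a user's motivation
# from a review; the template is illustrative, not the paper's own.
review = "Bought these headphones for long flights; the noise cancelling saved me."

prompt = (
    "You are analyzing a product review to infer the user's motivation.\n"
    f"Review: {review}\n"
    "Let's think step by step:\n"
    "1. What need does the user express?\n"
    "2. Which product attribute addresses that need?\n"
    "3. Summarize the underlying motivation in one short phrase."
)
print(prompt)
```

The stepwise structure nudges the model to reason from expressed need to product attribute before committing to a motivation label, which is the general idea behind using chain-of-thought prompting for fine-grained motivation extraction.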
🔎 Similar Papers
No similar papers found.
Yicheng Di
Jiangnan University
Distributed Computing · Recommender System · Federated Learning · Meta Learning
Zhanjie Zhang
Zhejiang University
Computer Vision
Yun Wang
Department of Computer Science, City University of Hong Kong, Hong Kong SAR, The Institute of Artificial Intelligence (TeleAI), China Telecom
Jinren Liu
School of Electrical and Information Engineering, Tianjin University, The Institute of Artificial Intelligence (TeleAI), China Telecom
Jiaqi Yan
Nanjing University
Blockchain · Intelligent Systems · Network-based Big Data Analytics and their business applications, including Finance
Jiyu Wei
PhD in Software Engineering, Zhejiang University
Brain-Computer Interfaces · Transfer Learning
Xiangyu Chen
Institute of Artificial Intelligence, China Telecom (TeleAI)
Low-Level Vision · Multimodal Understanding · Multimodal Generation
Yuan Liu
School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China