🤖 AI Summary
Recommender systems face a fundamental challenge: combining the accuracy of collaborative filtering with the semantic understanding and generalization of large language models (LLMs). Two obstacles stand in the way: collaborative signals are semantically opaque, and text-only LLMs struggle to model implicit user preferences. To address this, we propose IDIOMoE, an architecture that treats item-ID sequences as a "native dialect" within the language space. It replaces the feed-forward network of each block with a token-type-gated Mixture-of-Experts (MoE) layer that explicitly separates a text expert from an item-ID expert, preventing the two modalities from entangling their representations. This design preserves the pre-trained LLM's linguistic competence while integrating collaborative signals. Experiments on multiple public and proprietary datasets show strong recommendation performance. Moreover, IDIOMoE natively supports natural-language querying and enables interpretable recommendations through its modular, semantically grounded architecture.
📝 Abstract
Collaborative filtering delivers predictive accuracy and efficiency, while Large Language Models (LLMs) enable expressive and generalizable reasoning; modern recommendation systems must bring these strengths together. Growing user expectations, such as natural-language queries and transparent explanations, further highlight the need for a unified approach. However, doing so is nontrivial. Collaborative signals are often token-efficient but semantically opaque, while LLMs are semantically rich but struggle to model implicit user preferences when trained only on textual inputs. This paper introduces the Item-ID + Oral-language Mixture-of-Experts Language Model (IDIOMoE), which treats item interaction histories as a native dialect within the language space, enabling collaborative signals to be understood in the same way as natural language. By splitting the feed-forward network (FFN) in each block of a pretrained LLM into a separate text expert and an item expert with token-type gating, our method avoids destructive interference between the text and catalog modalities. IDIOMoE demonstrates strong recommendation performance across both public and proprietary datasets, while preserving the text understanding of the pretrained model.
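To make the token-type gating concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes hard (deterministic) routing, where a boolean flag per token selects either the text expert or the item expert; the abstract does not specify the gating details, so this is one plausible reading. The names `Expert`, `TokenTypeGatedFFN`, and `is_item_token`, the GELU activation, and the random initialization are all illustrative assumptions. In a real model, the text expert would presumably start from the pretrained FFN weights.

```python
import math
import random


def make_linear(rng, n_in, n_out):
    """Return an n_out x n_in weight matrix with small random entries."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def matvec(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def gelu(v):
    """Tanh approximation of GELU (an assumed activation choice)."""
    return 0.5 * v * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (v + 0.044715 * v ** 3)))


class Expert:
    """A plain two-layer feed-forward network: down(gelu(up(x)))."""

    def __init__(self, rng, d_model, d_hidden):
        self.up = make_linear(rng, d_model, d_hidden)
        self.down = make_linear(rng, d_hidden, d_model)

    def __call__(self, x):
        return matvec(self.down, [gelu(h) for h in matvec(self.up, x)])


class TokenTypeGatedFFN:
    """Hypothetical FFN replacement: one expert per token type, no learned router."""

    def __init__(self, d_model, d_hidden, seed=0):
        rng = random.Random(seed)
        self.text_expert = Expert(rng, d_model, d_hidden)  # assumed: pretrained FFN in practice
        self.item_expert = Expert(rng, d_model, d_hidden)  # assumed: trained on item-ID tokens

    def __call__(self, hidden_states, is_item_token):
        # Route each position by its token type; the two experts share no parameters.
        outputs = []
        for x, is_item in zip(hidden_states, is_item_token):
            expert = self.item_expert if is_item else self.text_expert
            outputs.append(expert(x))
        return outputs


if __name__ == "__main__":
    ffn = TokenTypeGatedFFN(d_model=4, d_hidden=8)
    x = [0.5, -1.0, 0.25, 2.0]
    # Same hidden state at three positions; only the middle one is an item-ID token.
    out = ffn([x, x, x], [False, True, False])
    print(out[0] == out[2], out[0] == out[1])
```

Because routing depends only on the token type, text tokens never pass through the item expert's parameters. This illustrates how a split of this kind could leave the pretrained model's text behavior untouched while the item expert absorbs collaborative signals.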