🤖 AI Summary
Collaborative filtering (CF) embeddings lack semantic interpretability, which hinders their effective use by large language models (LLMs). To address this, we propose FACE, a framework that aligns CF embeddings with the LLM's token space without fine-tuning the LLM. FACE first disentangles user- and item-specific structural information with a projection module, then uses a quantized autoencoder to map the embeddings onto discrete, semantically meaningful LLM tokens, and finally enforces semantic consistency through contrastive learning. The resulting descriptors are human-readable and support model-agnostic, plug-and-play deployment. Experiments on three real-world datasets show consistent improvements in recommendation accuracy, and a human evaluation further confirms that FACE-generated descriptors are readable and semantically plausible, validating both their functional efficacy and linguistic coherence.
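The quantization step described above can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the dimensions, the random "LLM token table", the number of concept slots, and the cosine-similarity nearest-neighbor lookup are all illustrative assumptions; the point is only the shape of the pipeline (project a CF embedding into concept vectors, then snap each one to a discrete token id).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): CF embedding dim, number of
# concept slots, token-embedding dim, and a toy frozen LLM vocabulary.
d_cf, n_concepts, d_tok, vocab = 64, 4, 32, 1000

W = rng.standard_normal((n_concepts, d_tok, d_cf)) / np.sqrt(d_cf)  # projection module
token_table = rng.standard_normal((vocab, d_tok))                   # frozen LLM token embeddings

def face_descriptors(cf_emb):
    """Map one CF embedding to n_concepts discrete token ids ("descriptors")."""
    concepts = W @ cf_emb                    # (n_concepts, d_tok): concept-specific vectors
    # Quantization: nearest token embedding per concept, by cosine similarity.
    c = concepts / np.linalg.norm(concepts, axis=1, keepdims=True)
    t = token_table / np.linalg.norm(token_table, axis=1, keepdims=True)
    return np.argmax(c @ t.T, axis=1)        # one token id per concept

ids = face_descriptors(rng.standard_normal(d_cf))
print(ids)  # four token ids, decodable into words via the LLM vocabulary
```

Because the token table is the LLM's own (frozen) embedding matrix, the selected ids decode directly into vocabulary words, which is what makes the descriptors human-readable without touching the LLM's weights.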
📝 Abstract
Recently, large language models (LLMs) have been explored for integration with collaborative filtering (CF)-based recommendation systems, which are crucial for personalizing user experiences. However, a key challenge is that LLMs struggle to interpret the latent, non-semantic embeddings produced by CF approaches, limiting recommendation effectiveness and downstream applications. To address this, we propose FACE, a general interpretable framework that maps CF embeddings into pre-trained LLM tokens. Specifically, we introduce a disentangled projection module to decompose CF embeddings into concept-specific vectors, followed by a quantized autoencoder that converts the continuous embeddings into LLM tokens (descriptors). We then design a contrastive alignment objective to ensure that the tokens align with corresponding textual signals. The model-agnostic FACE framework thus achieves semantic alignment without fine-tuning LLMs and enhances recommendation performance by leveraging their pre-trained capabilities. Empirical results on three real-world recommendation datasets demonstrate performance improvements over benchmark models, with interpretability studies confirming that the descriptors carry meaningful semantics. Code is available at https://github.com/YixinRoll/FACE.
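The contrastive alignment objective mentioned above can be sketched as a standard InfoNCE loss that pulls each descriptor representation toward its paired text embedding and pushes it away from the other pairs in the batch. This is a generic sketch under assumed inputs, not the paper's exact loss; the temperature, batch size, and the way the two embedding batches are produced are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def info_nce(z, t, tau=0.1):
    """InfoNCE: row i of z (descriptor side) should match row i of t (text side)."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    t = t / np.linalg.norm(t, axis=1, keepdims=True)
    logits = z @ t.T / tau                    # (B, B) cosine similarities / temperature
    diag = np.arange(len(z))
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_softmax[diag, diag].mean()    # cross-entropy with matched pairs as targets

# Toy check: descriptor embeddings that are noisy copies of their text
# embeddings should score a much lower loss than unrelated embeddings.
text = rng.standard_normal((8, 16))
aligned = text + 0.05 * rng.standard_normal((8, 16))
mismatched = rng.standard_normal((8, 16))
loss_good, loss_bad = info_nce(aligned, text), info_nce(mismatched, text)
```

Minimizing such a loss is what ties the discrete descriptors to actual textual signals, so the quantized tokens end up semantically consistent rather than arbitrary codebook entries.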