PatchRec: Multi-Grained Patching for Efficient LLM-based Sequential Recommendation

📅 2025-01-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the truncation of users' long-term behavioral sequences in LLM-based sequential recommendation, caused by context-length limits, this paper proposes a multi-grained patching framework. It compresses each item title into a compact item patch and dynamically aggregates item patches into denser session patches based on temporal proximity, enabling efficient modeling of ultra-long sequences. The method introduces a hierarchical compression paradigm that combines two-stage patch pre-training (driven by contrastive learning) with instruction fine-tuning, giving LLMs cross-granularity sequential understanding. On the Goodreads dataset, the method achieves up to a 32% improvement in HR@20 over the uncompressed baseline while consuming only 7% of the original token count, substantially reducing computational overhead and enabling real-time recommendation over extremely long user behavior sequences.

📝 Abstract
Large Language Models for sequential recommendation (LLM4SR), which reformulate user-item interactions as a language modeling task, have shown promising results. However, due to the limitations of context window size and the computational costs associated with Large Language Models (LLMs), current approaches primarily truncate user history, considering only the textual information of items from the most recent interactions in the input prompt. This truncation fails to fully capture users' long-term behavioral patterns. To address this, we propose a multi-grained patching framework -- PatchRec. It compresses the textual tokens of an item title into a compact item patch, and further compresses multiple item patches into a denser session patch, with earlier interactions being compressed to a greater degree. The framework consists of two stages: (1) Patch Pre-training, which familiarizes LLMs with item-level compression patterns, and (2) Patch Fine-tuning, which teaches LLMs to model sequences at multiple granularities. Through this simple yet effective approach, empirical results demonstrate that PatchRec outperforms existing methods, achieving significant performance gains with fewer tokens fed to the LLM. Specifically, PatchRec shows up to a 32% improvement in HR@20 on the Goodreads dataset over the uncompressed baseline, while using only 7% of the tokens. This multi-grained sequence modeling paradigm, with an adjustable compression ratio, enables LLMs to be efficiently deployed in real-world recommendation systems that handle extremely long user behavior sequences.
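The two-level compression described above can be sketched in a few lines. This is a minimal illustrative stand-in, not the paper's implementation: the `patchify` function, the mean-pooling compressor, and the parameters `recent_k` and `session_size` are all hypothetical choices; PatchRec learns its compression through patch pre-training and fine-tuning rather than simple pooling. The sketch only shows the shape of the idea: recent items keep one patch each, while earlier items are first pooled into item patches and then grouped into coarser session patches.

```python
import numpy as np

def patchify(item_token_embs, recent_k=3, session_size=4):
    """Hypothetical sketch of multi-grained patching.

    item_token_embs: list of [num_tokens, dim] arrays, oldest first.
    The most recent `recent_k` items keep one item patch each; earlier
    items are mean-pooled into item patches and then grouped,
    `session_size` at a time, into denser session patches.
    """
    # Item-level compression: each title's token embeddings -> one patch.
    item_patches = [t.mean(axis=0) for t in item_token_embs]

    recent = item_patches[-recent_k:] if recent_k else []
    earlier = item_patches[:-recent_k] if recent_k else item_patches

    # Session-level compression: pool groups of earlier item patches,
    # so older history occupies far fewer sequence positions.
    session_patches = [
        np.stack(earlier[i:i + session_size]).mean(axis=0)
        for i in range(0, len(earlier), session_size)
    ]
    # Coarse session patches first, then fine-grained recent items.
    return session_patches + recent

# 10 items with 12 title tokens each (dim 8): 120 token positions
# collapse to 2 session patches + 3 recent item patches = 5 positions.
seq = [np.random.randn(12, 8) for _ in range(10)]
patches = patchify(seq)
print(len(patches))  # -> 5
```

The adjustable compression ratio mentioned in the abstract corresponds here to tuning `recent_k` and `session_size`: smaller session groups preserve more detail, larger ones shrink the sequence further.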
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Recommendation Systems
User Browsing History
Innovation

Methods, ideas, or system contributions that make the work stand out.

PatchRec
sequential recommendation
compressed learning