PI2I: A Personalized Item-Based Collaborative Filtering Retrieval Framework

📅 2026-01-23
🤖 AI Summary
This work addresses limitations of traditional item-based collaborative filtering and two-tower models, whose rigid truncation strategies and weak interaction modeling hinder fine-grained capture of user interests. The authors propose PI2I, a two-stage retrieval framework. In the first stage, a relaxed truncation threshold expands the candidate set to improve hit rate. In the second stage, an interactive scoring model replaces inner-product computation, and negative samples are constructed from trigger-target item pairs to align offline training with online inference. PI2I thus combines flexible index construction with personalized interaction modeling. Offline experiments show that it outperforms classical collaborative filtering and rivals two-tower models. Deployed in Taobao's "Guess You Like" feed, it achieved a 1.05% increase in online transaction rate. The authors also release a public dataset containing 130 million user interactions.

📝 Abstract
Efficiently selecting relevant content from vast candidate pools is a critical challenge in modern recommender systems. Traditional methods, such as item-to-item collaborative filtering (CF) and two-tower models, often fall short in capturing complex user-item interactions because of uniform truncation strategies and late user-item crossing. To address these limitations, we propose Personalized Item-to-Item (PI2I), a novel two-stage retrieval framework that enhances the personalization capabilities of CF. In the first stage, the Indexer Building Stage (IBS), we optimize the retrieval pool by relaxing truncation thresholds to maximize Hit Rate, thereby temporarily retaining more items that users might be interested in. In the second stage, the Personalized Retrieval Stage (PRS), we introduce an interactive scoring model to overcome the limitations of inner-product calculations, allowing richer modeling of intricate user-item interactions. Additionally, we construct negative samples based on the trigger-target (item-to-item) relationship, ensuring consistency between offline training and online inference. Offline experiments on large-scale real-world datasets demonstrate that PI2I outperforms traditional CF methods and rivals two-tower models. Deployed in the "Guess You Like" section on Taobao, PI2I achieved a 1.05% increase in online transaction rates. In addition, we have released a large-scale recommendation dataset collected from Taobao, containing the 130 million real-world user interactions used in the experiments of this paper. The dataset is publicly available at https://huggingface.co/datasets/PI2I/PI2I and could serve as a valuable benchmark for the research community.
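To make the two stages concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a plain co-occurrence count as the item-to-item similarity, treats `top_k` as the truncation threshold, and uses a hypothetical `score_fn` callback as a stand-in for the interactive scoring model. The function names, the toy sessions, and the toy preference table are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import permutations


def build_i2i_index(sessions, top_k):
    """Stage 1 sketch: co-occurrence item-to-item index.

    A larger top_k plays the role of a relaxed truncation threshold.
    Co-occurrence counting is an assumption, not the paper's similarity.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for items in sessions:
        # Count every ordered pair of distinct items seen together.
        for a, b in permutations(set(items), 2):
            counts[a][b] += 1
    index = {}
    for trigger, neighbours in counts.items():
        # Descending count, then item id, so ties are deterministic.
        ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
        index[trigger] = [item for item, _ in ranked[:top_k]]
    return index


def retrieve(user_triggers, index, score_fn, n):
    """Stage 2 sketch: rescore pooled candidates with a personalized scorer.

    score_fn(user_triggers, trigger, target) -> float is hypothetical and
    stands in for a learned interactive scoring model.
    """
    best = {}
    for trigger in user_triggers:
        for target in index.get(trigger, []):
            if target in user_triggers:
                continue  # skip items the user already interacted with
            score = score_fn(user_triggers, trigger, target)
            best[target] = max(score, best.get(target, float("-inf")))
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]


# Toy data (illustrative only).
sessions = [["a", "b", "c"], ["a", "b"], ["a", "d"], ["b", "d"], ["c", "d"]]
strict_index = build_i2i_index(sessions, top_k=1)
relaxed_index = build_i2i_index(sessions, top_k=3)

# Toy per-user preference table used in place of a trained model.
prefs = {"d": 0.9, "b": 0.2}


def toy_score(user_triggers, trigger, target):
    return prefs.get(target, 0.0)


strict_result = retrieve(["a"], strict_index, toy_score, n=2)
relaxed_result = retrieve(["a"], relaxed_index, toy_score, n=2)
```

In this toy setup the strict index keeps only item `b` for trigger `a`, so the item the user prefers most (`d`) never reaches the scorer. With the relaxed index, `d` survives truncation and the personalized score can rank it first.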
Problem

Research questions and friction points this paper is trying to address.

recommender systems
item-to-item collaborative filtering
user-item interactions
retrieval
personalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Personalized Item-to-Item
Two-stage Retrieval
Interactive Scoring Model
Trigger-Target Negative Sampling
Collaborative Filtering
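
The summary does not spell out how trigger-target negatives are constructed. One possible reading, sketched below under that stated assumption, is to draw negatives for each observed (user, trigger, target) positive from the same trigger's index candidates, so that the model is trained on the kind of candidates it scores online. The function name, its parameters, and the rule of excluding items in the user's history are hypothetical.

```python
import random


def build_training_samples(positives, index, user_history, num_neg, seed=0):
    """Build (user, trigger, target, label) rows.

    positives: iterable of (user, trigger, target) observed interactions.
    index: trigger -> list of candidate targets (the index used online).
    user_history: user -> set of items the user has interacted with.
    Assumption: negatives come from the trigger's own candidate list.
    """
    rng = random.Random(seed)
    rows = []
    for user, trigger, target in positives:
        rows.append((user, trigger, target, 1))
        seen = user_history.get(user, set())
        # Candidates of this trigger that the user did not interact with.
        pool = [c for c in index.get(trigger, [])
                if c != target and c not in seen]
        for neg in rng.sample(pool, min(num_neg, len(pool))):
            rows.append((user, trigger, neg, 0))
    return rows


# Toy data (illustrative only).
toy_index = {"a": ["b", "c", "d"]}
toy_history = {"u1": {"a", "d"}}
rows = build_training_samples([("u1", "a", "d")], toy_index, toy_history,
                              num_neg=2)
```

Each negative row shares its trigger with the positive row, so the label depends only on which target the user chose among that trigger's candidates.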
Shaoqing Wang
Alibaba Group
Ying Ma
Alibaba Group
Kairui Fu
Zhejiang University
Ziyang Wang
Alibaba Group
Dunxian Huang
Alibaba Group
Yuliang Yan
Alibaba Group
Jian Wu
Unknown affiliation