A Learnable Fully Interacted Two-Tower Model for Pre-Ranking System

📅 2025-09-16

🤖 AI Summary

Two-tower models are efficient for pre-ranking, but their expressiveness is limited because the user and item towers are strictly decoupled, which rules out cross-tower feature interaction. To address this, we propose FIT (learnable Fully Interacted Two-tower Model), an architecture that relaxes this decoupling while preserving inference efficiency. FIT introduces two components: (1) a Meta Query Module (MQM), which uses a learnable item meta matrix for early cross-tower interaction, and (2) a Lightweight Similarity Scorer (LSS) for late interaction between the towers. FIT is trained end to end without giving up the computational benefits of the two-tower paradigm. Experiments on multiple public benchmarks show that FIT consistently outperforms state-of-the-art pre-ranking models, including YouTube DNN, DSSM, and TwinBERT, achieving 3.2–7.8% absolute gains in Recall@10 while increasing inference latency by less than 5%.

📝 Abstract

Pre-ranking plays a crucial role in large-scale recommender systems: it improves efficiency and scalability while still providing high-quality candidate sets in real time. The two-tower model is widely used in pre-ranking systems because its decoupled architecture, which processes user and item inputs independently before computing their interaction (e.g., a dot product or another similarity measure), strikes a good balance between efficiency and effectiveness. However, this independence also means that no information is exchanged between the two towers, which limits effectiveness. In this paper, a novel architecture named learnable Fully Interacted Two-tower Model (FIT) is proposed, which enables rich information interactions while ensuring inference efficiency. FIT mainly consists of two parts: the Meta Query Module (MQM) and the Lightweight Similarity Scorer (LSS). Specifically, MQM introduces a learnable item meta matrix to achieve expressive early interaction between user and item features. Moreover, LSS is designed to further obtain effective late interaction between the user and item towers. Finally, experimental results on several public datasets show that the proposed FIT significantly outperforms state-of-the-art baseline pre-ranking models.
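
For readers unfamiliar with the baseline, the decoupled two-tower setup that the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the tower shape, the random weights, and all sizes are assumptions chosen only to show why item embeddings can be computed offline while the user embedding is computed once per request.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_items, d_user, d_item, d_hidden, d_emb = 1000, 16, 12, 32, 8

def tower(x, w1, w2):
    """Tiny two-layer MLP tower: ReLU hidden layer, L2-normalized output."""
    h = np.maximum(x @ w1, 0.0)
    z = h @ w2
    return z / (np.linalg.norm(z, axis=-1, keepdims=True) + 1e-12)

# Random weights stand in for trained parameters.
user_w1 = rng.normal(size=(d_user, d_hidden))
user_w2 = rng.normal(size=(d_hidden, d_emb))
item_w1 = rng.normal(size=(d_item, d_hidden))
item_w2 = rng.normal(size=(d_hidden, d_emb))

# Offline: the item tower never sees user features, so all item
# embeddings can be precomputed and cached.
item_features = rng.normal(size=(n_items, d_item))
item_emb = tower(item_features, item_w1, item_w2)   # (n_items, d_emb)

# Online: one user-tower forward pass per request.
user_features = rng.normal(size=(1, d_user))
user_emb = tower(user_features, user_w1, user_w2)   # (1, d_emb)

# The only user-item interaction is this final dot product.
scores = (user_emb @ item_emb.T).ravel()            # (n_items,)
top_k = np.argsort(-scores)[:10]                    # indices of the 10 best items
```

Because the two towers meet only at the final dot product, no item feature can influence how the user representation is built, which is the limitation the abstract points to.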

Problem

Research questions and friction points this paper is trying to address.

Enabling rich information interaction between user and item towers

Improving pre-ranking model effectiveness while maintaining efficiency

Addressing lack of feature interaction in two-tower recommendation systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Learnable meta matrix for early interaction

Lightweight scorer for late interaction

Fully interacted two-tower architecture
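
One way to picture how the two ideas above could fit together is sketched below. The page does not give the actual MQM or LSS formulations, so everything here is an assumption made for illustration: the attention over the meta matrix, the multi-vector tower outputs, the weighted pairwise similarity, and every name and size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: items, feature dim, meta rows, vectors per tower.
n_items, d, n_meta, k = 500, 8, 4, 3

# Random values stand in for parameters that would be learned end to end.
meta = rng.normal(size=(n_meta, d))        # assumed "item meta matrix"
w_q = rng.normal(size=(d, d))
w_user = rng.normal(size=(2 * d, k * d))
w_item = rng.normal(size=(d, k * d))
w_score = rng.normal(size=(k, k))          # assumed lightweight scorer weights

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def meta_attention(u):
    """Early interaction (assumed form): the user queries the meta matrix."""
    q = u @ w_q
    return softmax(meta @ q / np.sqrt(d))  # (n_meta,)

def user_tower(u):
    summary = meta_attention(u) @ meta     # item-side summary, shape (d,)
    fused = np.concatenate([u, summary])   # (2d,)
    return np.tanh(fused @ w_user).reshape(k, d)

def item_tower(items):
    return np.tanh(items @ w_item).reshape(-1, k, d)

def late_score(user_vecs, item_vecs):
    """Late interaction (assumed form): weighted sum of k x k similarities."""
    sim = np.einsum("kd,njd->nkj", user_vecs, item_vecs)  # (n, k, k)
    return (sim * w_score).sum(axis=(1, 2))

item_vecs = item_tower(rng.normal(size=(n_items, d)))  # still precomputable offline
user = rng.normal(size=d)
user_vecs = user_tower(user)
scores = late_score(user_vecs, item_vecs)               # (n_items,)
```

In this sketch the user tower only ever reads the meta matrix, never per-item features, so item vectors can still be cached offline, and the per-item cost at request time is a small k x k similarity rather than a full cross network.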
