CS3: Efficient Online Capability Synergy for Two-Tower Recommendation

📅 2026-04-21
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses three limitations of two-tower recommendation models that stem from architectural isolation: restricted representational capacity, insufficient embedding-space alignment, and absent cross-tower feature interaction. Together these make it hard to achieve high effectiveness under low-latency constraints. The authors propose CS3, a framework that strengthens two-tower models online within millisecond-level latency budgets. CS3 combines cycle-adaptive feature denoising within each tower, a lightweight cross-tower synchronization mechanism based on mutual awareness, and cascade-model sharing that reuses knowledge from downstream stages. Designed as a plug-and-play module, CS3 is compatible with diverse two-tower backbones and with online learning. Experiments on three public datasets show consistent gains over strong baselines, and deployment in a large-scale advertising system yields up to 8.36% revenue improvement across three scenarios while maintaining real-time responsiveness.

📝 Abstract
To balance effectiveness and efficiency in recommender systems, multi-stage pipelines commonly use lightweight two-tower models for large-scale candidate retrieval. However, the isolated two-tower architecture restricts representation capacity, embedding-space alignment, and cross-feature interactions. Existing solutions such as late interaction and knowledge distillation can mitigate these issues, but often increase latency or are difficult to deploy in online learning settings. We propose Capability Synergy (CS3), an efficient online framework that strengthens two-tower retrievers while preserving real-time constraints. CS3 introduces three mechanisms: (1) Cycle-Adaptive Structure for self-revision via adaptive feature denoising within each tower; (2) Cross-Tower Synchronization to improve alignment through lightweight mutual awareness between towers; and (3) Cascade-Model Sharing to enhance cross-stage consistency by reusing knowledge from downstream models. CS3 is plug-and-play with diverse two-tower backbones and compatible with online learning. Experiments on three public datasets show consistent gains over strong baselines, and deployment in a large-scale advertising system yields up to 8.36% revenue improvement across three scenarios while maintaining millisecond-level latency.
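
For orientation, the plain two-tower retrieval setup that the abstract takes as its starting point can be sketched as below. This is a minimal illustration of the baseline only, not of CS3 or any of its three mechanisms. The names `make_tower` and `retrieve_top_k`, the feature dimensions, and the random linear weights are all hypothetical stand-ins chosen for the example. The point it shows is the isolation the abstract refers to: item embeddings are computed without any user features, and the two towers meet only at the final dot product.

```python
import math
import random


def make_tower(in_dim, out_dim, seed):
    """Hypothetical one-layer tower: fixed random linear map plus L2 normalization."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

    def tower(features):
        raw = [sum(w * x for w, x in zip(row, features)) for row in weights]
        norm = math.sqrt(sum(v * v for v in raw)) or 1.0  # guard against a zero vector
        return [v / norm for v in raw]

    return tower


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def retrieve_top_k(user_vec, item_index, k):
    """Score every cached item embedding against the user embedding, keep the k best."""
    scored = [(dot(user_vec, vec), item_id) for item_id, vec in item_index.items()]
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:k]]


# Two independent towers with different input features (assumed sizes).
user_tower = make_tower(in_dim=4, out_dim=8, seed=0)
item_tower = make_tower(in_dim=3, out_dim=8, seed=1)

# Item embeddings are computed ahead of time, without seeing any user features.
catalog = {f"item_{i}": [random.Random(i).random() for _ in range(3)] for i in range(100)}
item_index = {item_id: item_tower(feats) for item_id, feats in catalog.items()}

# At request time only the user tower runs; interaction happens only in the dot product.
user_vec = user_tower([0.2, 0.9, 0.1, 0.5])
top_items = retrieve_top_k(user_vec, item_index, k=5)
print(top_items)
```

Because the item side can be cached and the request path reduces to one tower pass plus a similarity search, this design is cheap to serve. The same separation is what prevents cross-feature interaction before scoring, which is the gap the abstract says CS3 targets.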
Problem

Research questions and friction points this paper is trying to address.

two-tower recommendation
candidate retrieval
representation capacity
embedding-space alignment
cross-feature interactions
Innovation

Methods, ideas, or system contributions that make the work stand out.

two-tower recommendation
online learning
cross-tower alignment
capability synergy
efficient retrieval