🤖 AI Summary
This work addresses embedding collapse in RankMixer, a scalable recommendation architecture: the learned representations have low effective rank and therefore underuse the expanded representation space. It traces the collapse to rigid token mixing and per-token feed-forward networks (P-FFNs), which together induce a damped oscillatory trajectory in effective rank across layers, and introduces RankElastor, an architecture with provable collapse mitigation. RankElastor combines parameterized full mixing with GLU-improved P-FFNs to make representation spectra more robust. Experiments on large-scale industrial datasets show that RankElastor mitigates embedding collapse, consistently improves recommendation performance, and scales robustly.
📝 Abstract
Scaling recommendation models is a central challenge in recommender systems. Recently, RankMixer has emerged as an effective solution, operating on a unified token representation and alternating between token mixing and per-token feed-forward networks (P-FFNs) to achieve scalable performance. However, RankMixer suffers from \textit{embedding collapse}: the learned representations have low effective rank, which limits expressivity and underutilizes the expanded representation space. Through empirical analysis and theoretical insights, we identify the rigid token mixing and P-FFN modules as the primary causes of this phenomenon; together they induce a \textbf{damped oscillatory trajectory} in effective-rank evolution across layers. To address this, we propose RankElastor, a novel architecture that produces spectrum-robust representations with provable collapse mitigation. RankElastor introduces two components: (i) \textbf{parameterized full mixing}, which enables expressive token mixing with improved spectral robustness; and (ii) \textbf{GLU-improved P-FFNs}, which stabilize representation spectra through GLU-style FFN modules. Extensive experiments on large-scale industrial datasets demonstrate that RankElastor consistently improves recommendation performance, mitigates embedding collapse, and exhibits robust scaling behavior. Code is available at https://github.com/vasile-paskardlgm/RankElastor
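
The abstract characterizes embedding collapse as low effective rank but does not state which definition is used. As a rough illustration, the sketch below assumes the common entropy-based definition (the exponential of the Shannon entropy of the normalized singular values). The function name `effective_rank`, the `eps` threshold, and the synthetic matrices are hypothetical and are not taken from the RankElastor code.

```python
import numpy as np


def effective_rank(embeddings: np.ndarray, eps: float = 1e-12) -> float:
    """Entropy-based effective rank of a 2-D matrix (assumed definition).

    Computes exp(H(p)), where p is the singular-value distribution
    normalized to sum to 1 and H is the Shannon entropy.
    """
    singular_values = np.linalg.svd(embeddings, compute_uv=False)
    total = singular_values.sum()
    if total <= eps:
        return 0.0  # an all-zero matrix carries no spectrum
    p = singular_values / total
    p = p[p > eps]  # drop zero entries so that 0 * log(0) is treated as 0
    entropy = -np.sum(p * np.log(p))
    return float(np.exp(entropy))


# Synthetic illustration only; this is not data from the paper.
rng = np.random.default_rng(0)

# "Healthy" representations: variance spread over all 16 dimensions.
healthy = rng.standard_normal((256, 16))

# "Collapsed" representations: nearly rank-1 plus a little noise.
collapsed = np.outer(rng.standard_normal(256), rng.standard_normal(16))
collapsed += 1e-3 * rng.standard_normal((256, 16))

print("healthy  :", effective_rank(healthy))
print("collapsed:", effective_rank(collapsed))
```

Under this definition, a matrix whose singular values are all equal attains an effective rank equal to its number of dimensions, while a matrix dominated by a single singular value approaches 1. Tracking this quantity on the output of each layer is one way to observe how effective rank evolves with depth.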