RecIS: Sparse to Dense, A Unified Training Framework for Recommendation Models

📅 2025-09-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the challenges of inefficient co-training between sparse (e.g., ID-based) and dense (e.g., LLM-based) components in industrial-scale recommendation systems, this paper proposes the first unified sparse-dense training framework built on the PyTorch ecosystem. Our method introduces a unified tensor abstraction, sparse gradient compression, and mixed-precision scheduling—enabling seamless integration of conventional sparse recommendation models with large-model enhancement modules without architectural modifications. Key innovations include efficient sharding and dynamic caching of sparse embedding tables, while maintaining full compatibility with standard dense training paradigms such as FSDP and DDP. Deployed across multiple online recommendation services at Alibaba, the framework supports trillion-parameter model training, achieving a 2.3× improvement in sparse computation throughput. Crucially, legacy sparse models retain full accuracy and exhibit significantly enhanced training stability post-migration.
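The sharding and dynamic caching of sparse embedding tables mentioned above can be illustrated with a small, framework-free sketch. This is not the RecIS implementation or its API: the class names `ShardedEmbeddingTable` and `LRUEmbeddingCache`, the modulo placement rule, and the LRU eviction policy are all assumptions chosen only to show the general idea of partitioning embedding rows by feature ID and keeping hot rows in a small cache.

```python
from collections import OrderedDict
from typing import Dict, List
import random


class ShardedEmbeddingTable:
    """Toy sparse embedding table (hypothetical, not the RecIS API).

    Rows are partitioned across shards by feature ID and created lazily,
    so memory grows only with the IDs that are actually seen.
    """

    def __init__(self, num_shards: int, dim: int, seed: int = 0) -> None:
        self.num_shards = num_shards
        self.dim = dim
        self._rng = random.Random(seed)
        # One dict per shard; in a real system each shard would sit on a different worker.
        self.shards: List[Dict[int, List[float]]] = [dict() for _ in range(num_shards)]

    def shard_of(self, feature_id: int) -> int:
        # Modulo placement is an assumption; production systems often hash first.
        return feature_id % self.num_shards

    def lookup(self, feature_id: int) -> List[float]:
        shard = self.shards[self.shard_of(feature_id)]
        if feature_id not in shard:
            # Lazily initialise a row the first time an ID appears.
            shard[feature_id] = [self._rng.uniform(-0.01, 0.01) for _ in range(self.dim)]
        return shard[feature_id]


class LRUEmbeddingCache:
    """Small least-recently-used cache placed in front of the sharded table."""

    def __init__(self, table: ShardedEmbeddingTable, capacity: int) -> None:
        self.table = table
        self.capacity = capacity
        self._cache: "OrderedDict[int, List[float]]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, feature_id: int) -> List[float]:
        if feature_id in self._cache:
            # Mark the row as most recently used.
            self._cache.move_to_end(feature_id)
            self.hits += 1
            return self._cache[feature_id]
        self.misses += 1
        row = self.table.lookup(feature_id)
        self._cache[feature_id] = row
        if len(self._cache) > self.capacity:
            # Evict the least recently used row.
            self._cache.popitem(last=False)
        return row


table = ShardedEmbeddingTable(num_shards=4, dim=8)
cache = LRUEmbeddingCache(table, capacity=2)
for fid in [10, 11, 10, 12, 11]:
    cache.get(fid)
```

With a capacity of two, the second request for ID 10 is served from the cache, while ID 11 has been evicted by the time it is requested again. A real system would also need to handle gradient updates, consistency between cache and shards, and cross-device communication, none of which this sketch attempts.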

📝 Abstract

In this paper, we propose RecIS, a unified sparse-dense training framework designed to achieve two primary goals: 1. Unified framework: to create a unified sparse-dense training framework based on the PyTorch ecosystem that meets the training needs of industrial-grade recommendation models integrated with large models. 2. System optimization: to optimize the sparse component so that it is more efficient than TensorFlow-based recommendation models, while the dense component leverages existing optimization technologies within the PyTorch ecosystem. RecIS is currently used at Alibaba for numerous large-model-enhanced recommendation training tasks, and some traditional sparse models have also begun training on it.

Problem

Research questions and friction points this paper is trying to address.

Creating a unified sparse-dense training framework for recommendation models

Optimizing sparse component efficiency compared to TensorFlow-based models

Supporting industrial-grade recommendation models integrated with large models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified sparse-dense training framework built on the PyTorch ecosystem

Optimized sparse component for superior efficiency over TensorFlow-based models

Leveraged existing PyTorch optimizations for the dense component
