eSASRec: Enhancing Transformer-based Recommendations in a Modular Fashion

📅 2025-08-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing modular improvements to Transformer-based sequential recommendation models lack systematic benchmarking, leaving their composability and Pareto optimality unclear. This paper proposes eSASRec, a lightweight, feature-agnostic enhanced model, and systematically validates the additive benefits of module combinations under a unified, production-grade evaluation framework. eSASRec combines LiGR Transformer layers, SASRec's training objective, and Sampled Softmax loss, trained end to end. The paper provides the first systematic evidence that carefully designed modular composition significantly improves recommendation accuracy without incurring additional feature engineering overhead. On public benchmarks, eSASRec achieves a 23% accuracy gain over ActionPiece; in production-like evaluations it matches the recent industrial models HSTU and FuXi, residing alongside them on the accuracy-coverage Pareto frontier.

📝 Abstract
Since their introduction, Transformer-based models such as SASRec and BERT4Rec have become common baselines for sequential recommendations, surpassing earlier neural and non-neural methods. A number of subsequent publications have shown that the effectiveness of these models can be improved by, for example, slightly updating the architecture of the Transformer layers, using better training objectives, and employing improved loss functions. However, the additivity of these modular improvements has not been systematically benchmarked; this is the gap we aim to close in this paper. Through our experiments, we identify a very strong model that uses SASRec's training objective, LiGR Transformer layers, and Sampled Softmax loss. We call this combination eSASRec (Enhanced SASRec). While we primarily focus on realistic, production-like evaluation, in our preliminary study we find that on common academic benchmarks eSASRec is 23% more effective than the most recent state-of-the-art models, such as ActionPiece. In our main production-like benchmark, eSASRec resides on the Pareto frontier in terms of the accuracy-coverage tradeoff, alongside the recent industrial models HSTU and FuXi. As the modifications compared to the original SASRec are relatively straightforward and no extra features (such as the timestamps used in HSTU) are needed, we believe that eSASRec can be easily integrated into existing recommendation pipelines and can serve as a strong yet very simple baseline for emerging, more complicated algorithms. To facilitate this, we provide open-source implementations of our models and benchmarks in the repository at https://github.com/blondered/transformer_benchmark
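
To make the loss component concrete, here is a minimal sketch of a Sampled Softmax loss for next-item prediction, in the spirit of what the abstract describes. The function name, the uniform negative sampling, and the number of negatives are illustrative assumptions, not details taken from the paper or its repository.

```python
import torch
import torch.nn.functional as F

def sampled_softmax_loss(seq_emb, item_emb, positives, n_negatives=256):
    """Cross-entropy over the positive item plus a shared set of sampled negatives.

    seq_emb:   (B, L, D) hidden states from the sequence encoder
    item_emb:  (N, D) catalogue item embeddings
    positives: (B, L) ground-truth next-item ids (one-step-shifted targets)
    """
    B, L, _ = seq_emb.shape
    n_items = item_emb.size(0)
    # Uniform sampling of negatives shared across the batch; real systems often
    # use popularity-based sampling with a logQ correction, and would also mask
    # accidental hits of the positive item and padding positions (omitted here).
    neg_ids = torch.randint(0, n_items, (n_negatives,), device=seq_emb.device)
    pos_logits = (seq_emb * item_emb[positives]).sum(-1, keepdim=True)  # (B, L, 1)
    neg_logits = seq_emb @ item_emb[neg_ids].T                          # (B, L, n_neg)
    logits = torch.cat([pos_logits, neg_logits], dim=-1)
    # The positive item occupies index 0 of every softmax.
    targets = torch.zeros(B * L, dtype=torch.long, device=seq_emb.device)
    return F.cross_entropy(logits.view(B * L, -1), targets)
```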
Problem

Research questions and friction points this paper is trying to address.

Benchmark modular improvements in Transformer-based recommendation models
Enhance SASRec with better training objectives and loss functions
Evaluate eSASRec's accuracy-coverage tradeoff in production-like settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular enhancement of Transformer-based models
Combines SASRec's training objective, LiGR Transformer layers, and Sampled Softmax loss (see the sketch after this list)
Simple integration into existing recommendation pipelines
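
A hedged sketch of how these modules compose: a stock causal Transformer encoder stands in for the paper's LiGR layers (whose internals are not reproduced here), trained with SASRec's shifted-sequence next-item objective and the sampled_softmax_loss sketch above. The class name and all hyperparameters are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class TinySeqRec(nn.Module):
    """Toy stand-in for eSASRec: item embeddings + a causal Transformer encoder."""
    def __init__(self, n_items, d_model=64, n_layers=2, n_heads=2):
        super().__init__()
        self.item_emb = nn.Embedding(n_items, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)  # LiGR layers would go here

    def forward(self, seq):  # seq: (B, L) item ids
        causal_mask = nn.Transformer.generate_square_subsequent_mask(
            seq.size(1), device=seq.device)
        return self.encoder(self.item_emb(seq), mask=causal_mask)  # (B, L, D)

# SASRec's objective: at every position t, predict the item at position t+1.
seq = torch.randint(1, 1000, (8, 20))      # toy batch of interaction sequences
inputs, targets = seq[:, :-1], seq[:, 1:]  # inputs and one-step-shifted targets
model = TinySeqRec(n_items=1000)
hidden = model(inputs)
# Candidate embeddings are tied to the input embeddings, as in SASRec.
loss = sampled_softmax_loss(hidden, model.item_emb.weight, targets)
loss.backward()
```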
👥 Authors
Daria Tikhonovich
MTS
Nikita Zelinskiy
MTS
Aleksandr V. Petrov
Research Scientist, Spotify
Recommender Systems · Information Retrieval · Natural Language Processing · Deep Learning
Mayya Spirina
MTS
Andrei Semenov
Yandex
Andrey V. Savchenko
Sber AI Lab
Sergei Kuliev
MTS