🤖 AI Summary
This work addresses the strong anisotropy of item embeddings generated by large language models (LLMs): most vectors point in similar directions, and the resulting geometric imbalance makes it hard to adapt the embeddings to collaborative signals in sequential recommendation. To mitigate this, the paper proposes Anisotropy-Controllable Embedding (ACE), which explicitly controls this anisotropy by refining the embeddings with a linear autoencoder (LAE). L2 regularization improves geometric uniformity by controlling the dispersion of embedding dimensions, while the reconstruction loss preserves semantic structure. Experiments show that the method consistently outperforms existing LLM-enhanced sequential recommendation models, with improvements of up to 12.4% in Recall@20 and 11.8% in NDCG@20.
📝 Abstract
Recent advances in the LLM-as-Extractor paradigm leverage large language models (LLMs) to transfer semantically rich item embeddings into sequential recommendation (SR) backbones. However, LLM-generated embeddings often suffer from strong anisotropy: most vectors are concentrated in similar directions, resulting in a geometric imbalance that makes it difficult to adapt to collaborative signals during fine-tuning. To address this challenge, we propose Anisotropy-Controllable Embedding (ACE), which explicitly controls the anisotropy of LLM-generated embeddings. Specifically, ACE uses a linear autoencoder (LAE) to reshape the embedding distribution while preserving its semantic structure. In this process, the L2-regularization term mitigates anisotropy by controlling the dispersion of embedding dimensions, while the reconstruction loss maintains semantic relationships among items. ACE thus balances geometric uniformity and semantic preservation for more stable learning. Extensive experiments demonstrate that ACE consistently outperforms existing LLM-enhanced SR models, yielding improvements of up to 12.4% and 11.8% in Recall@20 and NDCG@20, respectively.
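The two ingredients described above, a measure of anisotropy and a reconstruction-plus-L2 objective for a linear autoencoder, can be illustrated with a minimal NumPy sketch. This is not the ACE implementation: the abstract does not give the exact loss, so the function names (`mean_pairwise_cosine`, `lae_objective`), the use of average pairwise cosine similarity as the anisotropy measure, the placement of the L2 penalty on the encoder and decoder weights, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def mean_pairwise_cosine(E):
    """Average cosine similarity over all distinct item pairs.

    Values near 1 mean the vectors share a dominant direction (anisotropic);
    values near 0 mean directions are spread out. This metric is an assumed
    proxy, not necessarily the one used in the paper.
    """
    U = E / np.linalg.norm(E, axis=1, keepdims=True)  # unit-normalize each row
    S = U @ U.T                                       # all pairwise cosines
    n = E.shape[0]
    return (S.sum() - np.trace(S)) / (n * (n - 1))    # exclude self-similarity

def lae_objective(E, W_enc, W_dec, lam):
    """Reconstruction loss plus L2 penalty for a linear autoencoder.

    Hypothetical form: ||E - E W_enc W_dec||_F^2 + lam * (||W_enc||_F^2 + ||W_dec||_F^2).
    """
    Z = E @ W_enc                                     # refined (encoded) embeddings
    recon = np.sum((E - Z @ W_dec) ** 2)              # keeps semantic structure
    penalty = lam * (np.sum(W_enc ** 2) + np.sum(W_dec ** 2))
    return recon + penalty

rng = np.random.default_rng(0)
# Synthetic stand-in for LLM embeddings: a shared offset pushes every
# vector toward the same direction, mimicking strong anisotropy.
E_aniso = rng.normal(size=(200, 16)) + 5.0
# Isotropic reference: directions are spread uniformly.
E_iso = rng.normal(size=(200, 16))

aniso_score = mean_pairwise_cosine(E_aniso)
iso_score = mean_pairwise_cosine(E_iso)

# With identity weights the reconstruction is exact, so only the penalty remains.
I = np.eye(16)
loss_identity = lae_objective(E_aniso, I, I, lam=0.1)
```

In this toy setting the offset embeddings score close to 1 on the cosine measure while the isotropic reference scores close to 0, which is the kind of gap the abstract refers to as geometric imbalance. The weight `lam` plays the role of the knob that trades reconstruction fidelity against the L2 term; how ACE sets or schedules it is not stated in the abstract.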