From Knots to Knobs: Towards Steerable Collaborative Filtering Using Sparse Autoencoders

📅 2026-01-16
🤖 AI Summary
This work addresses the challenge of extracting interpretable, controllable features from user interaction data alone, so that recommendations can be steered in a targeted way. The authors insert a sparse autoencoder (SAE) between the encoder and decoder of a collaborative filtering autoencoder (CFAE), which they present as the first application of SAEs to collaborative filtering. The SAE learns largely monosemantic latent features, and mapping functions link individual neurons to human-interpretable semantic concepts. Selectively activating specific neurons then steers the generated recommendations in a desired direction. Experiments indicate that the approach extracts interpretable features and supports targeted manipulation of recommendation outcomes.

📝 Abstract
Sparse autoencoders (SAEs) have recently emerged as pivotal tools for introspection into large language models. SAEs can uncover high-quality, interpretable features at different levels of granularity and enable targeted steering of the generation process by selectively activating specific neurons in their latent activations. Our paper is the first to apply this approach to collaborative filtering, aiming to extract similarly interpretable features from representations learned purely from interaction signals. In particular, we focus on a widely adopted class of collaborative filtering autoencoders (CFAEs) and augment them by inserting an SAE between their encoder and decoder networks. We demonstrate that the resulting representation is largely monosemantic and propose suitable mapping functions between semantic concepts and individual neurons. We also evaluate a simple yet effective method that utilizes this representation to steer the recommendations in a desired direction.
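The architecture described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the weights are random stand-ins for trained models, the layer sizes are hypothetical, the CFAE encoder and decoder are reduced to single linear maps, and the sparsity rule (ReLU followed by keeping the top-k activations) is an assumption, since the abstract does not specify the SAE variant. Steering is shown as overwriting one SAE neuron's activation before decoding.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: items, CFAE latent width, SAE width, active neurons.
n_items, d_latent, d_sae, k = 50, 8, 32, 4

# Random stand-ins for a trained CFAE encoder and decoder (assumed linear).
W_enc = rng.normal(scale=0.1, size=(n_items, d_latent))
W_dec = rng.normal(scale=0.1, size=(d_latent, n_items))

# Random stand-ins for trained SAE weights, placed between encoder and decoder.
S_enc = rng.normal(scale=0.1, size=(d_latent, d_sae))
S_dec = rng.normal(scale=0.1, size=(d_sae, d_latent))


def sae_encode(z, k):
    """ReLU, then zero everything except the k largest activations (assumed rule)."""
    a = np.maximum(z @ S_enc, 0.0)
    if k < a.size:
        not_top_k = np.argpartition(a, -k)[:-k]  # indices outside the top k
        a[not_top_k] = 0.0
    return a


def recommend(x, steer=None, top_n=5):
    """Score items for interaction vector x; optionally force one SAE neuron."""
    z = x @ W_enc                 # CFAE encoder
    a = sae_encode(z, k)          # sparse latent code
    if steer is not None:
        neuron, value = steer     # hypothetical steering interface
        a = a.copy()
        a[neuron] = value         # overwrite the chosen neuron's activation
    z_hat = a @ S_dec             # SAE reconstruction of the CFAE latent
    scores = z_hat @ W_dec        # CFAE decoder
    scores = np.where(x > 0, -np.inf, scores)  # never re-recommend seen items
    top = np.argsort(-scores)[:top_n]
    return top, scores, a


# A toy user who interacted with five items.
x = np.zeros(n_items)
x[[1, 7, 12, 30, 41]] = 1.0

base_top, base_scores, base_act = recommend(x)
steer_top, steer_scores, steer_act = recommend(x, steer=(3, 5.0))
print("baseline:", base_top.tolist())
print("steered: ", steer_top.tolist())
```

In a trained model, the neuron index passed to `steer` would come from a mapping between semantic concepts and neurons, which the abstract mentions but does not detail; here the index `3` and the value `5.0` are arbitrary.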
Problem

Research questions and friction points this paper is trying to address.

collaborative filtering
sparse autoencoders
interpretable features
representation steering
monosemanticity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparse Autoencoders
Collaborative Filtering
Interpretable Features
Steerable Recommendations
Monosemantic Representations
Martin Spišák
Applied Scientist, Recombee
recommender systems, sparse autoencoders
Ladislav Peška
Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
Petr Škoda
Univerzita Karlova v Praze
Linked Data, Data Interoperability, Knowledge Graphs
Vojtěch Vančura
Recombee
Recommender Systems, Sparse Representations
Rodrigo Alves
Faculty of Information Technology, Czech Technical University, Prague, Czech Republic