🤖 AI Summary
In offline recommendation experiments, dataset selection often lacks principled justification and relies heavily on a few dense benchmarks, which leads to unreliable conclusions in sparse or cold-start scenarios. Although the Algorithm Performance Space (APS) framework theoretically enables systematic dataset–algorithm alignment, its practical adoption is hindered by the absence of intuitive, interactive visualization tools. To bridge this gap, we propose APS Explorer, the first interactive web-based tool for visualizing the APS in recommender systems. It integrates three core components: (i) a PCA-reduced similarity plot of datasets, (ii) a dynamic meta-feature table, and (iii) pairwise algorithm performance heatmaps. Evaluated across mainstream recommendation datasets, APS Explorer helps identify datasets aligned with specific experimental requirements (e.g., sparsity, cold-start robustness), thereby improving the reproducibility and reliability of offline evaluation.
📝 Abstract
Dataset selection is crucial for offline recommender system experiments, because data that does not match the scenario under study can lead to unreliable results (e.g., an experiment on sparse interactions requires datasets with low user-item density). Yet 86% of ACM RecSys 2024 papers provide no justification for their dataset choices, and most rely on just four datasets: Amazon (38%), MovieLens (34%), Yelp (15%), and Gowalla (12%). While Algorithm Performance Spaces (APS) were proposed to guide dataset selection, their adoption has been limited by the absence of an intuitive, interactive tool for APS exploration. We therefore introduce the APS Explorer, a web-based visualization tool for interactive APS exploration that enables data-driven dataset selection. The APS Explorer provides three interactive features: (1) a PCA plot showing dataset similarity via performance patterns, (2) a dynamic meta-feature table for dataset comparisons, and (3) a specialized visualization for pairwise algorithm performance.
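The idea behind the PCA plot can be illustrated with a minimal sketch. It is not the APS Explorer's actual implementation: it assumes each dataset is described by a vector of algorithm scores (one column per algorithm) and projects those vectors onto their first two principal components using NumPy. The function name `aps_pca`, the dataset labels, and all score values are hypothetical and chosen only for illustration.

```python
import numpy as np


def aps_pca(performance, n_components=2):
    """Project datasets (rows) described by algorithm scores (columns)
    onto their leading principal components.

    Hypothetical helper for illustration; not part of the APS Explorer.
    """
    X = np.asarray(performance, dtype=float)
    # Center each algorithm's scores so PCA captures variation across datasets.
    X_centered = X - X.mean(axis=0)
    # SVD of the centered matrix yields the principal axes (rows of Vt).
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    # Coordinates of each dataset in the principal-component space.
    coords = U[:, :n_components] * S[:n_components]
    # Share of total variance captured by each retained component.
    explained = (S ** 2) / np.sum(S ** 2)
    return coords, explained[:n_components]


# Made-up scores: rows are datasets, columns are three hypothetical algorithms.
datasets = ["dataset_a", "dataset_b", "dataset_c", "dataset_d"]
performance = np.array([
    [0.30, 0.28, 0.10],
    [0.29, 0.27, 0.11],
    [0.05, 0.12, 0.20],
    [0.06, 0.10, 0.22],
])

coords, explained = aps_pca(performance)
for name, (x, y) in zip(datasets, coords):
    print(f"{name}: PC1={x:+.3f}, PC2={y:+.3f}")
```

In such a projection, datasets on which the algorithms perform similarly land close together, so a researcher could pick datasets from different regions of the plot to cover diverse performance patterns.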