DS4RS: Community-Driven and Explainable Dataset Search Engine for Recommender System Research

📅 2025-08-13
🤖 AI Summary
Recommender systems research suffers from low dataset discoverability and poor reproducibility because datasets are fragmented across sources and described with heterogeneous metadata. To address this, we propose DS4RS, a community-driven, interpretable dataset search engine. Methodologically, it combines structured metadata modeling with semantic retrieval powered by pretrained language models, enabling joint queries over multiple attributes: dataset names, descriptions, and recommendation domains. It also introduces an interpretability mechanism that attributes the relevance of each search result to individual attributes, and it establishes an open, versioned, community-contributed process for curating metadata. Experimental evaluation demonstrates significant improvements in retrieval accuracy and transparency. The platform is fully open-sourced and publicly deployed, which supports reproducibility and sustainable collaboration in recommender systems research.

📝 Abstract
Accessing suitable datasets is critical for research and development in recommender systems. However, finding datasets that match specific recommendation tasks or domains remains a challenge due to scattered sources and inconsistent metadata. To address this gap, we propose a community-driven and explainable dataset search engine tailored for recommender system research. Our system supports semantic search across multiple dataset attributes, such as dataset names, descriptions, and recommendation domains, and provides explanations of search relevance to enhance transparency. The system encourages community participation by allowing users to contribute standardized dataset metadata in a public repository. By improving dataset discoverability and search interpretability, the system facilitates more efficient research reproduction. The platform is publicly available at: https://ds4rs.com.
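
To make the idea of community-contributed, standardized metadata concrete, the following is a minimal sketch of how a contributed record might be checked before it is accepted. The field names (`name`, `description`, `domains`, `url`) and the function `validate_metadata` are assumptions made for illustration; the abstract does not specify the schema DS4RS actually uses.

```python
# Hypothetical schema: field name -> expected Python type.
# These fields are assumptions for illustration, not the DS4RS schema.
REQUIRED_FIELDS = {
    "name": str,
    "description": str,
    "domains": list,
    "url": str,
}


def validate_metadata(record):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"{field} must be {expected.__name__}")
        elif not record[field]:
            errors.append(f"{field} must not be empty")
    return errors
```

A contribution workflow could run a check like this on every submitted record and reject the submission when the returned list is non-empty.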

Problem

Research questions and friction points this paper is trying to address.

Finding suitable datasets for recommender system research
Scattered sources and inconsistent metadata challenge dataset discovery
Lack of transparent and explainable dataset search tools

Innovation

Methods, ideas, or system contributions that make the work stand out.

Community-driven dataset search engine
Semantic search with multiple attributes
Explainable search relevance for transparency
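
The combination of multi-attribute search and explainable relevance can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: token-overlap cosine similarity stands in for a real semantic model, the sample records were written for this sketch, and the attribute weights in `WEIGHTS` are arbitrary. The point is only to show how a per-attribute score breakdown can serve as an explanation of why a result was ranked.

```python
import math
import re
from collections import Counter

# Sample records written for this sketch (not taken from the DS4RS index).
DATASETS = [
    {"name": "MovieLens 1M",
     "description": "One million movie ratings from users",
     "domain": "movies"},
    {"name": "Last.fm 360K",
     "description": "Music listening histories of users",
     "domain": "music"},
    {"name": "Book-Crossing",
     "description": "Book ratings collected from a reading community",
     "domain": "books"},
]

# Assumed attribute weights; a real system would tune or learn these.
WEIGHTS = {"name": 0.3, "description": 0.4, "domain": 0.3}


def tokenize(text):
    """Lowercase bag-of-words; a stand-in for a semantic embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def search(query, datasets=DATASETS, weights=WEIGHTS):
    """Rank datasets and report each attribute's contribution to the score."""
    q = tokenize(query)
    results = []
    for record in datasets:
        contributions = {
            attr: w * cosine(q, tokenize(record[attr]))
            for attr, w in weights.items()
        }
        results.append({
            "name": record["name"],
            "score": sum(contributions.values()),
            "explanation": contributions,  # per-attribute breakdown
        })
    return sorted(results, key=lambda r: r["score"], reverse=True)
```

Calling `search("music listening")` ranks the Last.fm record first, and its `explanation` dictionary shows that the description and domain attributes drive the match while the name contributes nothing.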