🤖 AI Summary
This work addresses the mismatch in input granularity, semantic focus, and training signals between the document-to-document (doc-doc) and question-to-document (q-doc) paradigms in scientific document retrieval. To bridge this gap, the authors propose UniFAR, a unified facet-aware retrieval framework that supports both retrieval tasks within a single architecture. The framework introduces learnable facet anchors to align the structural elements of scientific documents with question intent, uses adaptive multi-granularity aggregation to reconcile long documents with short questions, and trains on doc-doc and q-doc supervision jointly. Compatible with various base models, the approach consistently outperforms prior methods across multiple scientific retrieval tasks, indicating that it is both effective and general.
📝 Abstract
Existing scientific document retrieval (SDR) methods primarily rely on document-centric representations learned from inter-document relationships for document-document (doc-doc) retrieval. However, the rise of large language models (LLMs) and retrieval-augmented generation (RAG) has shifted SDR toward question-driven retrieval, where documents are retrieved in response to natural-language questions (q-doc). This change has led to systematic mismatches between document-centric models and question-driven retrieval, including (1) input granularity (long documents vs. short questions), (2) semantic focus (scientific discourse structure vs. specific question intent), and (3) training signals (citation-based similarity vs. question-oriented relevance). To address these mismatches, we propose UniFAR, a Unified Facet-Aware Retrieval framework that jointly supports doc-doc and q-doc SDR within a single architecture. UniFAR reconciles granularity differences through adaptive multi-granularity aggregation, aligns document structure with question intent via learnable facet anchors, and unifies doc-doc and q-doc supervision through joint training. Experimental results show that UniFAR consistently outperforms prior methods across multiple retrieval tasks and base models, confirming its effectiveness and generality.
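To make the three components more concrete, here is a minimal sketch of how facet anchors, aggregation over units of varying granularity, and a joint objective could fit together. It is not the authors' implementation. The function names (`facet_pool`, `relevance`, `joint_loss`), the attention-style pooling, the same-facet cosine scoring rule, and the `alpha` weighting are all assumptions made for illustration; the abstract does not specify any of them. In a real system the anchors would be learned parameters and the unit vectors would come from an encoder.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def facet_pool(units, anchors):
    """Pool unit vectors (tokens, sentences, or sections) into one vector per anchor.

    Assumption: each anchor attends over the units with softmax weights, so a
    long document and a short question both end up as len(anchors) vectors.
    """
    dim = len(units[0])
    facets = []
    for anchor in anchors:
        weights = softmax([dot(anchor, u) for u in units])
        facets.append([sum(w * u[i] for w, u in zip(weights, units))
                       for i in range(dim)])
    return facets

def cosine(u, v):
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0

def relevance(query_facets, doc_facets):
    # Hypothetical scoring rule: the best similarity among same-facet pairs.
    return max(cosine(q, d) for q, d in zip(query_facets, doc_facets))

def joint_loss(doc_doc_loss, q_doc_loss, alpha=0.5):
    # Hypothetical joint objective: a weighted sum of the two task losses.
    return alpha * doc_doc_loss + (1 - alpha) * q_doc_loss

# Toy data: two anchors (for example a "method" facet and a "result" facet).
anchors = [[1.0, 0.0], [0.0, 1.0]]
doc_units = [[2.0, 0.0], [0.0, 2.0]]   # a document with two units
query_units = [[1.0, 0.0]]             # a short question with one unit

doc_facets = facet_pool(doc_units, anchors)
query_facets = facet_pool(query_units, anchors)
score = relevance(query_facets, doc_facets)
```

Because both inputs are pooled against the same anchors, the document and the question are compared in a shared facet space regardless of their length, which is the intuition behind handling doc-doc and q-doc retrieval in one architecture.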