🤖 AI Summary
This overview describes the TREC Neural Cross-Language Information Retrieval (NeuCLIR) track, whose principal goal is to study the effect of neural approaches on cross-language information access. The track created test collections of Chinese, Persian, and Russian news stories and Chinese academic abstracts, and defined four task types: Cross-Language Information Retrieval (CLIR) from news, Multilingual Information Retrieval (MLIR) from news, Report Generation from news, and CLIR from technical documents. Five participating teams, together with baselines contributed by the track coordinators, submitted a total of 274 runs across eight tasks spanning these four task types. The overview presents the task descriptions and the available results, providing a shared evaluation framework for neural cross-language information retrieval.
📝 Abstract
The principal goal of the TREC Neural Cross-Language Information Retrieval (NeuCLIR) track is to study the effect of neural approaches on cross-language information access. The track has created test collections containing Chinese, Persian, and Russian news stories and Chinese academic abstracts. NeuCLIR includes four task types: Cross-Language Information Retrieval (CLIR) from news, Multilingual Information Retrieval (MLIR) from news, Report Generation from news, and CLIR from technical documents. A total of 274 runs were submitted by five participating teams (and as baselines by the track coordinators) for eight tasks across these four task types. Task descriptions and the available results are presented.