CorefInst: Leveraging LLMs for Multilingual Coreference Resolution

📅 2025-09-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Coreference resolution (CR) has long been constrained by task-specific architectures and encoder-centric paradigms, limiting generalization and multilingual adaptability. This work proposes the first decoder-only large language model (LLM)-based approach for multilingual coreference resolution, eliminating reliance on dedicated encoders. By leveraging instruction tuning, it unifies modeling of both explicit mentions and zero pronouns within a single framework and enables controllable inference. We design five instruction templates and fine-tune Llama 3.1, Gemma 2, and Mistral 0.3 on the multilingual CorefUD v1.2 benchmark. Our method achieves state-of-the-art performance, with the best model outperforming Corpipe-24 (single-stage) by an average +2.0 F1 points. This work establishes a novel decoder-only LLM paradigm for coreference resolution, substantially improving model generality, inference controllability, and cross-lingual transfer capability.

📝 Abstract
Coreference Resolution (CR) is a crucial yet challenging task in natural language understanding, often constrained by task-specific architectures and encoder-based language models that demand extensive training and lack adaptability. This study introduces the first multilingual CR methodology that leverages decoder-only LLMs to handle both overt and zero mentions. The article explores how to model the CR task for LLMs via five different instruction sets using a controlled inference method. The approach is evaluated across three LLMs: Llama 3.1, Gemma 2, and Mistral 0.3. The results indicate that LLMs, when instruction-tuned with a suitable instruction set, can surpass state-of-the-art task-specific architectures. Specifically, our best model, a fully fine-tuned Llama 3.1 for multilingual CR, outperforms the leading multilingual CR model (i.e., the Corpipe 24 single-stage variant) by 2 pp on average across all languages in the CorefUD v1.2 dataset collection.
Problem

Research questions and friction points this paper is trying to address.

Addressing multilingual coreference resolution limitations with LLMs
Overcoming task-specific architecture constraints in coreference resolution
Enhancing adaptability and performance across diverse language datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leveraging decoder-only LLMs for multilingual coreference
Using five instruction sets for task modeling
Employing controlled inference method for resolution
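The paper's five instruction sets are not reproduced here, but the general idea of casting coreference resolution as an instruction for a decoder-only LLM can be sketched as follows. This is a hypothetical template of my own construction, not the authors' actual prompt format: the wording, mention numbering, and output convention are all illustrative assumptions.

```python
# Hypothetical sketch: framing coreference resolution as a single instruction
# prompt for a decoder-only LLM. The template text and output format are
# illustrative assumptions, not the paper's actual instruction sets.

def build_coref_instruction(text: str, mentions: list[str]) -> str:
    """Wrap a document and its candidate mentions in one instruction prompt.

    The model would be expected to continue after "Clusters:" with groups of
    mention numbers that co-refer, one cluster per line.
    """
    mention_list = "\n".join(f"{i}. {m}" for i, m in enumerate(mentions, 1))
    return (
        "Resolve coreference in the document below. "
        "Group the numbered mentions into clusters of co-referring entities, "
        "one cluster per line.\n\n"
        f"Document:\n{text}\n\n"
        f"Mentions:\n{mention_list}\n\n"
        "Clusters:"
    )

prompt = build_coref_instruction(
    "Ada wrote the report. She submitted it on Friday.",
    ["Ada", "the report", "She", "it"],
)
print(prompt)
```

A fixed textual output format like this is what makes inference controllable: the model's generation can be constrained or parsed against the expected cluster syntax rather than free-form text.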
Tuğba Pamay Arslan
ITÜ NLP Research Group, Department of Artificial Intelligence and Data Engineering, Istanbul Technical University
Emircan Erol
ITÜ NLP Research Group, Department of Artificial Intelligence and Data Engineering, Istanbul Technical University
Gülşen Eryiğit
Professor at Artificial Intelligence & Data Engineering, Istanbul Technical University
Natural Language Processing · CALL · Artificial Intelligence · Machine Learning · Deep Learning