🤖 AI Summary
Coreference resolution (CR) has long been constrained by task-specific architectures and encoder-centric paradigms, limiting generalization and multilingual adaptability. This work proposes the first decoder-only large language model (LLM)-based approach to multilingual coreference resolution, eliminating the reliance on dedicated encoders. By leveraging instruction tuning, it unifies the modeling of both explicit mentions and zero pronouns within a single framework and enables controllable inference. The authors design five instruction templates and fine-tune Llama 3.1, Gemma 2, and Mistral 0.3 on the multilingual CorefUD v1.2 benchmark. The method achieves state-of-the-art performance, with the best model outperforming CorPipe 24 (single-stage) by an average of +2.0 F1 points. This work establishes a novel decoder-only LLM paradigm for coreference resolution, substantially improving model generality, inference controllability, and cross-lingual transfer.
📝 Abstract
Coreference Resolution (CR) is a crucial yet challenging task in natural language understanding, often constrained by task-specific architectures and encoder-based language models that demand extensive training and lack adaptability. This study introduces the first multilingual CR methodology that leverages decoder-only LLMs to handle both overt and zero mentions. The article explores how to model the CR task for LLMs via five different instruction sets combined with a controlled inference method. The approach is evaluated across three LLMs: Llama 3.1, Gemma 2, and Mistral 0.3. The results indicate that LLMs, when instruction-tuned with a suitable instruction set, can surpass state-of-the-art task-specific architectures. Specifically, our best model, a fully fine-tuned Llama 3.1 for multilingual CR, outperforms the leading multilingual CR model (i.e., the CorPipe 24 single-stage variant) by 2 pp on average across all languages in the CorefUD v1.2 dataset collection.
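To make the instruction-tuning setup concrete, the sketch below shows one plausible shape such a template could take: the model is asked to re-emit the input text with coreferent mentions wrapped in numbered cluster tags, and the tagged output is parsed back into clusters. This is a minimal illustration only; the tag format, prompt wording, and function names (`build_prompt`, `parse_clusters`) are assumptions for exposition and are not the paper's actual five templates.

```python
import re

def build_prompt(text):
    """Hypothetical CR instruction template (illustrative, not the
    paper's exact wording): ask the model to wrap each coreferent
    mention in a numbered tag <cN>...</cN>."""
    return (
        "Annotate all coreferent mentions in the text below. Wrap each "
        "mention in a tag <cN>...</cN>, where N identifies its cluster; "
        "mentions of the same entity share the same N.\n\n"
        f"Text: {text}\nAnnotated:"
    )

def parse_clusters(tagged):
    """Recover coreference clusters from a tagged model response.

    The backreference \\1 ensures opening and closing tags carry the
    same cluster id. Returns {cluster_id: [mention, ...]}.
    """
    clusters = {}
    for cid, mention in re.findall(r"<c(\d+)>(.*?)</c\1>", tagged):
        clusters.setdefault(int(cid), []).append(mention)
    return clusters

# Example: parsing a response in the assumed tag format.
response = ("<c1>Marie Curie</c1> won the prize because "
            "<c1>she</c1> discovered <c2>radium</c2>.")
print(parse_clusters(response))
# → {1: ['Marie Curie', 'she'], 2: ['radium']}
```

A tagging scheme like this keeps the task in a pure text-to-text form, which is what lets a decoder-only model handle CR without any task-specific span-scoring head; zero pronouns would additionally require the template to license inserting tags at empty positions.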