🤖 AI Summary
Coreference resolution faces a key challenge: effectively integrating supervised small-model approaches with the reasoning capabilities of large language models (LLMs). This paper proposes a lightweight fusion framework: first, a supervised detect-then-cluster pipeline built on small language models, enhanced with a bridging module for long-text encoding and a biaffine scoring mechanism that models positional relations between mentions; second, an LLM deployed, without fine-tuning, as a multi-role “Checker-Splitter” agent that validates candidate mentions and detects and splits erroneous clusters via prompt engineering. The framework enables efficient, LLM-augmented post-processing while preserving the strengths of supervised modeling. It outperforms state-of-the-art methods on multiple benchmarks, including OntoNotes and LitBank, demonstrating both the feasibility and the benefit of jointly optimizing supervised models and LLM-based reasoning.
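The summary describes the LLM acting as a prompt-driven “Checker” that filters invalid candidate mentions. The paper does not give its prompt, so the template below is a purely hypothetical sketch of what such a mention-validation prompt might look like; the function name, wording, and answer format are all assumptions for illustration.

```python
def build_checker_prompt(document: str, mentions: list[str]) -> str:
    """Assemble a hypothetical mention-validation prompt for an LLM checker.

    Neither this template nor the answer format comes from the paper; it
    merely illustrates prompt-based mention checking without fine-tuning.
    """
    # Number the candidate mentions so the LLM can refer back to them.
    listing = "\n".join(f"{k}. {m}" for k, m in enumerate(mentions, 1))
    return (
        "You are a coreference checker. For each candidate mention below, "
        "decide whether it is a valid mention in the document.\n\n"
        f"Document:\n{document}\n\n"
        f"Candidate mentions:\n{listing}\n\n"
        "Reply with the numbers of the invalid mentions, or 'none'."
    )

prompt = build_checker_prompt(
    "Alice met Bob. She smiled.", ["Alice", "Bob", "She", "smiled"]
)
print(prompt)
```

A real pipeline would send this string to an LLM API and parse the numbered reply to drop rejected mentions before clustering; the “Splitter” role would use an analogous prompt over predicted clusters.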
📝 Abstract
Coreference Resolution (CR) is a critical task in Natural Language Processing (NLP). Current research faces a key dilemma: whether to further explore the potential of supervised neural methods based on small language models, whose detect-then-cluster pipeline still delivers top performance, or to embrace the powerful capabilities of Large Language Models (LLMs). However, effectively combining their strengths remains underexplored. To this end, we propose ImCoref-CeS, a novel framework that integrates an enhanced supervised model with LLM-based reasoning. First, we present an improved CR method (ImCoref) that pushes the performance boundary of supervised neural methods by introducing a lightweight bridging module to enhance long-text encoding, devising a biaffine scorer to comprehensively capture positional information, and applying a hybrid mention regularization to improve training efficiency. Importantly, we employ an LLM as a multi-role Checker-Splitter agent that validates candidate mentions (filtering out invalid ones) and coreference results (splitting erroneous clusters) predicted by ImCoref. Extensive experiments demonstrate the effectiveness of ImCoref-CeS, which achieves superior performance compared to existing state-of-the-art (SOTA) methods.
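The biaffine scorer mentioned in the abstract typically combines a bilinear term over two mention representations with linear terms for each side. The paper's exact parameterization (and how positional information enters the inputs) is not given here, so the following is a minimal NumPy sketch of a generic biaffine pairwise scorer; the dimensions, parameter names, and random inputs are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hidden size of each mention representation (illustrative)

# Biaffine parameters: a bilinear interaction matrix W plus
# per-side linear weights u, v and a scalar bias b.
W = rng.normal(scale=0.1, size=(d, d))
u = rng.normal(scale=0.1, size=d)
v = rng.normal(scale=0.1, size=d)
b = 0.0

def biaffine_score(h_i: np.ndarray, h_j: np.ndarray) -> float:
    """Score a candidate pair (mention i, mention j): h_i^T W h_j + u.h_i + v.h_j + b."""
    return float(h_i @ W @ h_j + u @ h_i + v @ h_j + b)

# Pairwise scores for n candidate mentions; in ImCoref the inputs would
# also encode positional information, which this toy example omits.
n = 4
H = rng.normal(size=(n, d))
scores = np.array(
    [[biaffine_score(H[i], H[j]) for j in range(n)] for i in range(n)]
)
print(scores.shape)  # (4, 4)
```

Note that the bilinear term makes the score asymmetric in (i, j), which is useful when one mention is scored as a potential antecedent of the other rather than as an unordered pair.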