π€ AI Summary
Large language models (LLMs) often fail in long-context question answering due to ambiguous coreference, leading to semantic misalignment. To address this, we propose Long Question Coreference Adaptation (LQCA), the first method to deeply integrate context-optimized coreference resolution into the LLM QA pipeline. LQCA comprises four sequential stages: intra-subdocument coreference resolution, mention-distance modeling, representative mention selection, and substitution-based QAβforming an end-to-end, plug-and-play semantic bridging framework. Technically, it synergizes rule-augmented neural coreference resolution, cross-paragraph mention graph modeling, dynamic mention aggregation, and context-aware prompt engineering, effectively mitigating context dilution and coreference discontinuity. Evaluated across multiple LLMs (e.g., o1-mini, GPT-4o) and mainstream long-text QA benchmarks, LQCA achieves an average accuracy gain of 12.7%. The implementation is publicly available.
π Abstract
Large language models (LLMs) have shown remarkable capabilities in natural language processing; however, they still face difficulties when tasked with understanding lengthy contexts and executing effective question answering. These challenges often arise due to the complexity and ambiguity present in longer texts. To enhance the performance of LLMs in such scenarios, we introduce the Long Question Coreference Adaptation (LQCA) method. This innovative framework focuses on coreference resolution tailored to long contexts, allowing the model to identify and manage references effectively. The LQCA method encompasses four key steps: resolving coreferences within sub-documents, computing the distances between mentions, defining a representative mention for coreference, and answering questions through mention replacement. By processing information systematically, the framework provides easier-to-handle partitions for LLMs, promoting better understanding. Experimental evaluations on a range of LLMs and datasets have yielded positive results, with a notable improvements on OpenAI-o1-mini and GPT-4o models, highlighting the effectiveness of leveraging coreference resolution to bridge context gaps in question answering. Our code is public at https://github.com/OceannTwT/LQCA.