🤖 AI Summary
This study investigates how large language models (LLMs) represent and interpret explicit versus ambiguous plural anaphora, including split antecedents and part–whole references. Methodologically, it employs next-token prediction, pronoun generation, and explanation-generation tasks, combined with multiple prompting strategies and cross-context experimental designs, to quantify whether LLMs can, without explicit prompting, detect ambiguity, identify candidate antecedents, and align with human anaphoric preferences. The results show that while LLMs partially capture the referential scope of ambiguous plural pronouns, their interpretive choices often diverge from human preferences; they struggle to detect ambiguity without direct instruction, and their performance varies inconsistently across tasks. The work identifies a fundamental limitation in current LLMs' comprehension of plural anaphora and provides an empirical foundation for improving coreference resolution modeling and evaluation.
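To make the next-token-prediction setup concrete, here is a minimal, hypothetical probe: it compares a causal LM's probability of producing a plural versus a singular pronoun after a context containing two singular antecedents. The model (`gpt2`), the context sentence, and the candidate pronouns are illustrative stand-ins, not the paper's actual models or stimuli.

```python
# Minimal sketch of a next-token prediction probe for pronoun production.
# Assumptions: gpt2 as a stand-in model; an invented context sentence.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# Two singular antecedents; a plural "they" would pick out the
# split antecedent "John + Mary".
context = "John met Mary at the station. Afterwards,"
candidates = [" they", " he", " she"]  # leading space matters for GPT-2's BPE

inputs = tokenizer(context, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # next-token distribution
probs = torch.softmax(logits, dim=-1)

for cand in candidates:
    ids = tokenizer.encode(cand)
    # Score only single-token candidates so probabilities are comparable.
    if len(ids) == 1:
        print(f"P({cand!r} | context) = {probs[ids[0]].item():.4f}")
```

In the spirit of the paper's production experiments, a human-like model would assign relatively high probability to the plural " they" in this split-antecedent context; such model probabilities can then be compared against human pronoun choices.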
📝 Abstract
Our goal is to study how LLMs represent and interpret plural reference in ambiguous and unambiguous contexts. We ask the following research questions: (1) Do LLMs exhibit human-like preferences in representing plural reference? (2) Can LLMs detect ambiguity in plural anaphoric expressions and identify possible referents? To address these questions, we design a set of experiments examining pronoun production with next-token prediction tasks, as well as pronoun interpretation and ambiguity detection with different prompting strategies. We then assess how closely LLMs match humans in formulating and interpreting plural reference. We find that LLMs are sometimes aware of possible referents of ambiguous pronouns. However, they do not always follow human preferences when choosing between interpretations, especially when a possible interpretation is not explicitly mentioned. In addition, they struggle to identify ambiguity without direct instruction. Our findings also reveal inconsistencies across the different types of experiments.
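For the interpretation and ambiguity-detection tasks, one possible prompting setup is sketched below. The prompt wording, the context sentence, the model, and the yes/no detection framing are all assumptions made for illustration; the abstract does not specify them.

```python
# Hypothetical prompts for pronoun interpretation vs. ambiguity detection;
# the paper's actual prompt wording and models are not given in the abstract.
from transformers import pipeline

# "They" here is genuinely ambiguous: the students, the teachers, or both.
CONTEXT = "The students met the teachers after class. They were excited."

INTERPRETATION_PROMPT = (
    f"Context: {CONTEXT}\n"
    "Question: Who does 'they' refer to?\nAnswer:"
)
DETECTION_PROMPT = (
    f"Context: {CONTEXT}\n"
    "Question: Is the pronoun 'they' ambiguous here? "
    "Answer yes or no, then explain.\nAnswer:"
)

# "gpt2" is only a placeholder; an instruction-tuned model would be needed
# for these prompts to be followed reliably.
generator = pipeline("text-generation", model="gpt2")

for name, prompt in [("interpretation", INTERPRETATION_PROMPT),
                     ("ambiguity detection", DETECTION_PROMPT)]:
    output = generator(prompt, max_new_tokens=40, do_sample=False)
    print(f"--- {name} ---")
    print(output[0]["generated_text"][len(prompt):].strip())
```

Note the design contrast the two prompts embody: the interpretation prompt forces a referent choice, while the detection prompt tests whether the model flags ambiguity when directly asked, which is distinct from detecting it unprompted.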