Language Models at the Syntax-Semantics Interface: A Case Study of the Long-Distance Binding of Chinese Reflexive Ziji

📅 2025-04-02
🏛️ International Conference on Computational Linguistics
🤖 AI Summary
This study evaluates how well language models handle long-distance binding of the Mandarin Chinese reflexive *zìjǐ* (“self”), an anaphora-resolution problem governed jointly by syntactic constraints (e.g., subject prominence, governing category) and semantic constraints (e.g., verb selectivity, noun referentiality). We construct a 560-sentence dataset: 240 synthetic sentences built from templates and examples in the syntactic literature, plus 320 naturally occurring sentences from the BCC corpus, with native Mandarin speaker judgments as the reference. We evaluate 21 language models against it. No model consistently replicates human judgment patterns: the models rely heavily on sequential cues (though they do not always favor the closest string) and often overlook subtle syntactic and semantic constraints. They are also more sensitive to noun-related than to verb-related semantics, which suggests limits in how current models integrate syntactic and semantic information.

📝 Abstract
This paper explores whether language models can effectively resolve the complex binding patterns of the Mandarin Chinese reflexive ziji, which are constrained by both syntactic and semantic factors. We construct a dataset of 240 synthetic sentences using templates and examples from the syntactic literature, along with 320 natural sentences from the BCC corpus. Evaluating 21 language models against this dataset and comparing their performance to judgments from native Mandarin speakers, we find that none of the models consistently replicates human-like judgments. The results indicate that existing language models tend to rely heavily on sequential cues, though they do not always favor the closest strings, and often overlook subtle semantic and syntactic constraints. They also tend to be more sensitive to noun-related than to verb-related semantics.
Problem

Research questions and friction points this paper is trying to address.

Resolving binding patterns of Chinese reflexive ziji
Assessing model performance vs human judgments
Identifying reliance on sequential cues over semantics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Constructed dataset with synthetic and natural sentences
Evaluated 21 models against human judgments
Analyzed reliance on sequential and semantic cues
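
The comparison between model preferences and native-speaker judgments could be organized roughly as follows. This is only a minimal sketch: the summary does not describe the paper's actual scoring protocol, so the `Item` fields, the `cue_type` labels, the per-reading scores, and every data value below are hypothetical placeholders, not material or results from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str        # hypothetical identifier for a test sentence
    cue_type: str       # hypothetical label: "noun" or "verb" manipulation
    human_choice: str   # antecedent reading preferred by native speakers
    model_scores: dict  # hypothetical model score per candidate reading


def model_choice(item):
    """Return the candidate reading with the highest model score."""
    return max(item.model_scores, key=item.model_scores.get)


def agreement_by_cue(items):
    """Fraction of items per cue type where the model matches the human choice."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for item in items:
        totals[item.cue_type] += 1
        if model_choice(item) == item.human_choice:
            hits[item.cue_type] += 1
    return {cue: hits[cue] / totals[cue] for cue in totals}


# Placeholder data for illustration only; these are NOT results from the paper.
items = [
    Item("ex1", "noun", "local", {"local": -2.1, "long_distance": -3.4}),
    Item("ex2", "noun", "long_distance", {"local": -4.0, "long_distance": -2.5}),
    Item("ex3", "verb", "local", {"local": -3.0, "long_distance": -2.2}),
    Item("ex4", "verb", "long_distance", {"local": -2.8, "long_distance": -2.0}),
]

rates = agreement_by_cue(items)
print(rates)
```

Splitting agreement by cue type is one simple way to check whether a model tracks noun-related manipulations more closely than verb-related ones; a real evaluation would also need per-model scoring and a way to handle graded human judgments.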