Learning Constituent Headedness

📅 2026-03-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a novel approach to modeling headedness in constituent syntactic trees by formulating it as an independent supervised learning task. Unlike conventional methods that rely on post-hoc rule-based heuristics—such as the Collins rules—to infer heads, this study leverages automatically induced head annotations derived from aligned dependency and constituency treebanks. By directly predicting the head of each constituent through supervised learning, the method overcomes the limitations of rule-driven paradigms and enables effective cross-lingual and cross-resource transfer. Experiments on English and Chinese benchmarks show that head prediction accuracy approaches the theoretical upper bound and substantially outperforms traditional rule-based baselines. Moreover, the predicted heads exhibit high fidelity in both constituency parsing and constituent-to-dependency conversion, significantly enhancing parsing consistency and cross-lingual transferability.
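The Collins rules mentioned above are the standard rule-based baseline: for each nonterminal, a hand-written table gives a scan direction and a priority list of child labels, and the head child is found by percolation. A minimal sketch of this idea, with a toy head table (the real tables in Collins's thesis cover all Penn Treebank labels; the entries below are illustrative only):

```python
# Simplified Collins-style head percolation.
# HEAD_TABLE maps a parent label to (scan direction, priority list);
# these three entries are illustrative, not the full Collins tables.
HEAD_TABLE = {
    "NP": ("right", ["NN", "NNS", "NNP", "NP"]),
    "VP": ("left",  ["VBD", "VBZ", "VB", "VP"]),
    "S":  ("left",  ["VP", "S"]),
}

def percolate_head(parent, children):
    """Return the index of the head child of `parent`.

    For each label in the priority list, scan `children` in the
    table's direction and take the first match; if nothing matches,
    fall back to the edge child on the scan side.
    """
    direction, priorities = HEAD_TABLE.get(parent, ("left", []))
    order = range(len(children)) if direction == "left" \
        else range(len(children) - 1, -1, -1)
    for label in priorities:
        for i in order:
            if children[i] == label:
                return i
    return 0 if direction == "left" else len(children) - 1
```

For example, `percolate_head("NP", ["DT", "NN"])` selects the noun, and `percolate_head("S", ["NP", "VP"])` selects the verb phrase. The paper's point is that such fixed tables are brittle across treebanks and languages, whereas a learned predictor is not.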

📝 Abstract
Headedness is widely used as an organizing device in syntactic analysis, yet constituency treebanks rarely encode it explicitly and most processing pipelines recover it procedurally via percolation rules. We treat this notion of constituent headedness as an explicit representational layer and learn it as a supervised prediction task over aligned constituency and dependency annotations, inducing supervision by defining each constituent head as the dependency span head. On aligned English and Chinese data, the resulting models achieve near-ceiling intrinsic accuracy and substantially outperform Collins-style rule-based percolation. Predicted heads yield comparable parsing accuracy under head-driven binarization, consistent with the induced binary training targets being largely equivalent across head choices, while increasing the fidelity of deterministic constituency-to-dependency conversion and transferring across resources and languages under simple label-mapping interfaces.
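The supervision-induction step in the abstract has a simple operational reading: in a dependency tree, the head of a constituent span is the one token inside the span whose own dependency head lies outside it (or is the root). A minimal sketch of this extraction, assuming a head map from token index to head index with 0 as the root:

```python
def span_head(span_tokens, dep_heads):
    """Return the head token of a constituent span.

    `span_tokens` is a list of 1-based token indices covered by the
    constituent; `dep_heads` maps each token index to its dependency
    head (0 for the root). The span head is the unique token whose
    head falls outside the span; returns None if the span is not a
    well-formed subtree (zero or multiple such tokens).
    """
    span = set(span_tokens)
    external = [t for t in span_tokens if dep_heads[t] not in span]
    return external[0] if len(external) == 1 else None
```

For "the cat sleeps" with `dep_heads = {1: 2, 2: 3, 3: 0}`, the NP span `[1, 2]` yields token 2 ("cat") as its head, since its dependency head (3) lies outside the span. Applied over aligned constituency and dependency treebanks, this yields the head labels the paper trains on.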
Problem

Research questions and friction points this paper is trying to address.

headedness
constituency parsing
dependency parsing
syntactic analysis
treebank annotation
Innovation

Methods, ideas, or system contributions that make the work stand out.

constituent headedness
supervised prediction
dependency-constituency alignment
head-driven binarization
cross-lingual transfer
Authors

Zeyao Qi
The Chinese University of Hong Kong
Yige Chen
College of Computer Science and Artificial Intelligence, Wenzhou University
KyungTae Lim
École normale supérieure
Haihua Pan
The Chinese University of Hong Kong
Jungyeul Park
Korea Advanced Institute of Science & Technology