CGELBank Annotation Manual v1.1

Date: 2023-05-27
Citations: 1
Influential citations: 0
AI Summary
Most existing English treebanks adopt phrase-structure grammars, limiting their utility for theory-driven syntactic modeling. To address this, we introduce CGELBank, the first fine-grained, fully formalized syntactic treebank grounded in *The Cambridge Grammar of the English Language* (CGEL). Our approach features a theoretically consistent annotation schema with explicit functional hierarchies and constructional compatibility, transcending conventional category-based labeling; a dedicated annotation toolchain integrated with automated consistency verification; and a publicly released v1.1 annotation manual ensuring reproducibility and interpretability. CGELBank constitutes the first high-fidelity, computationally tractable CGEL-aligned resource, enabling rigorous integration of linguistic theory into NLP. It significantly enhances the theoretical interpretability and structural generalization capacity of syntactic models.
Abstract
CGELBank is a treebank and associated tools based on a syntactic formalism for English derived from the Cambridge Grammar of the English Language. This document lays out the particularities of the CGELBank annotation scheme.
Problem

Research questions and friction points this paper is trying to address.

Developing a syntactic treebank based on the CGEL formalism
Creating annotation tools for English grammar analysis
Documenting the particularities of the CGELBank annotation scheme
Innovation

Methods, ideas, or system contributions that make the work stand out.

A treebank based on the CGEL syntactic formalism
Open-source tools hosted on GitHub
Detailed annotation scheme documentation provided