🤖 AI Summary
This study systematically investigates how dependency annotation schemes affect the performance of transition-based parsers.
Method: We address language-specific non-canonical structures in Universal Dependencies (UD) treebanks by designing standardization transformation rules, and we comparatively evaluate parser performance—measured by LAS and UAS—under both the original and the standardized annotations within a unified, multilingual evaluation framework.
Contribution/Results: We empirically demonstrate, for the first time, that annotation standardization does not universally improve parsing accuracy. Crucially, we reveal that linguistic typological features significantly moderate the effectiveness of annotation schemes: for certain languages, the original non-standard annotations yield higher accuracy than standardized ones. This finding challenges the implicit assumption that standardization is inherently optimal and underscores the necessity of considering language-specific syntactic properties when selecting or designing syntactic representations.
📝 Abstract
We compare the performance of a transition-based parser with respect to different annotation schemes. We propose to convert some specific syntactic constructions observed in the Universal Dependencies treebanks into a so-called more standard representation and to evaluate parsing performance over all the languages of the project. We show that the "standard" constructions do not systematically lead to better parsing performance and that the scores vary considerably across languages.
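The LAS and UAS scores referred to above can be sketched as follows. This is a minimal illustration, not code from the study; the `(head, label)` token representation and the function name are assumptions made for the example.

```python
# Minimal sketch of the UAS/LAS attachment metrics. Each token is
# represented as a (head_index, dependency_label) pair; names are
# illustrative, not taken from the paper.

def attachment_scores(gold, predicted):
    """Return (UAS, LAS) as fractions over all tokens.

    UAS counts tokens whose predicted head matches the gold head;
    LAS additionally requires the dependency label to match.
    """
    assert len(gold) == len(predicted) and gold, "aligned, non-empty inputs"
    uas_hits = las_hits = 0
    for (g_head, g_label), (p_head, p_label) in zip(gold, predicted):
        if g_head == p_head:
            uas_hits += 1                  # head attachment is correct
            if g_label == p_label:
                las_hits += 1              # head and label both correct
    total = len(gold)
    return uas_hits / total, las_hits / total

# Toy 3-token sentence: token 1 has a label error, token 3 a head error.
gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "obj"),   (0, "root"), (1, "obj")]
uas, las = attachment_scores(gold, pred)   # UAS = 2/3, LAS = 1/3
```

Comparing these two scores under the original versus standardized annotations, per language, is what reveals the language-dependent effect the study reports.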